The aquatic animals' transcriptome resource for comparative functional analysis.
Chou, Chih-Hung; Huang, Hsi-Yuan; Huang, Wei-Chih; Hsu, Sheng-Da; Hsiao, Chung-Der; Liu, Chia-Yu; Chen, Yu-Hung; Liu, Yu-Chen; Huang, Wei-Yun; Lee, Meng-Lin; Chen, Yi-Chang; Huang, Hsien-Da
2018-05-09
Aquatic animals have great economic and ecological importance. Among them, non-model organisms have been studied regarding eco-toxicity, stress biology, and environmental adaptation. Due to recent advances in next-generation sequencing techniques, large amounts of RNA-seq data for aquatic animals are publicly available. However, no comprehensive resource currently exists for the analysis, unification, and integration of these datasets. This study utilizes computational approaches to build a new resource of transcriptomic maps for aquatic animals. This aquatic animal transcriptome map database, dbATM, provides de novo transcriptome assembly, gene annotation, and comparative analysis for more than twenty aquatic organisms without a draft genome. To improve assembly quality, three computational tools (Trinity, Oases and SOAPdenovo-Trans) were employed for individual transcriptome assembly, and CAP3 and CD-HIT-EST were then used to merge the three assembled transcriptomes. In addition, functional annotation analysis provides valuable clues to gene characteristics, including full-length transcript coding regions, conserved domains, gene ontology and KEGG pathways. Furthermore, all aquatic animal genes are essential for comparative genomics tasks such as constructing homologous gene groups and BLAST databases and performing phylogenetic analysis. In conclusion, we establish a resource for non-model aquatic animals, which are of great economic and ecological importance, and provide transcriptomic information including functional annotation and comparative transcriptome analysis. The database is now publicly accessible through the URL http://dbATM.mbc.nctu.edu.tw/ .
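The idea of merging several assemblies and collapsing redundant contigs can be illustrated with a small Python sketch. This is not the dbATM pipeline and does not call CAP3 or CD-HIT-EST; it is a toy stand-in that assumes redundancy means exact containment of one contig in another (on either strand). The function names and the example sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_assemblies(assemblies):
    """Pool contigs from several assemblers and drop contained duplicates.

    assemblies: dict mapping an assembler label to a list of contig strings.
    A contig is dropped if it, or its reverse complement, is an exact
    substring of a longer contig that was already kept. Real tools use
    overlap and identity thresholds instead (toy simplification).
    """
    pooled = sorted(
        {c.upper() for contigs in assemblies.values() for c in contigs},
        key=lambda s: (-len(s), s),  # longest first, ties broken alphabetically
    )
    kept = []
    for contig in pooled:
        rc = reverse_complement(contig)
        if any(contig in longer or rc in longer for longer in kept):
            continue
        kept.append(contig)
    return kept


# Hypothetical contigs standing in for the output of three assemblers.
assemblies = {
    "trinity": ["ATGGCGTACGTTAGC", "GGGTTTAAACCC"],
    "oases": ["GCGTACGTTA", "TTTTCCCCGGGGAAAA"],
    "soapdenovo_trans": ["GCTAACGTACGCCAT", "CCCCGGGG"],
}
merged = merge_assemblies(assemblies)
print(merged)
```

In this toy input, three of the six contigs are either substrings or reverse complements of longer ones, so only three survive the merge.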
Computational analysis of conserved RNA secondary structure in transcriptomes and genomes.
Eddy, Sean R
2014-01-01
Transcriptomics experiments and computational predictions both enable systematic discovery of new functional RNAs. However, many putative noncoding transcripts arise instead from artifacts and biological noise, and current computational prediction methods have high false positive rates. I discuss prospects for improving computational methods for analyzing and identifying functional RNAs, with a focus on detecting signatures of conserved RNA secondary structure. An interesting new front is the application of chemical and enzymatic experiments that probe RNA structure on a transcriptome-wide scale. I review several proposed approaches for incorporating structure probing data into the computational prediction of RNA secondary structure. Using probabilistic inference formalisms, I show how all these approaches can be unified in a well-principled framework, which in turn allows RNA probing data to be easily integrated into a wide range of analyses that depend on RNA secondary structure inference. Such analyses include homology search and genome-wide detection of new structural RNAs.
Kairov, Ulykbek; Cantini, Laura; Greco, Alessandro; Molkenov, Askhat; Czerwinska, Urszula; Barillot, Emmanuel; Zinovyev, Andrei
2017-09-11
Independent Component Analysis (ICA) is a method that models gene expression data as an action of a set of statistically independent hidden factors. The output of ICA depends on a fundamental parameter: the number of components (factors) to compute. The optimal choice of this parameter, related to determining the effective data dimension, remains an open question in the application of blind source separation techniques to transcriptomic data. Here we address the question of optimizing the number of statistically independent components in the analysis of transcriptomic data for reproducibility of the components in multiple runs of ICA (within the same or within varying effective dimensions) and in multiple independent datasets. To this end, we introduce ranking of independent components based on their stability in multiple ICA computation runs and define a distinguished number of components (Most Stable Transcriptome Dimension, MSTD) corresponding to the point of the qualitative change of the stability profile. Based on a large body of data, we demonstrate that a sufficient number of dimensions is required for biological interpretability of the ICA decomposition and that the most stable components with ranks below MSTD have more chances to be reproduced in independent studies compared to the less stable ones. At the same time, we show that a transcriptomics dataset can be reduced to a relatively high number of dimensions without losing the interpretability of ICA, even though higher dimensions give rise to components driven by small gene sets. We suggest a protocol of ICA application to transcriptomics data with a possibility of prioritizing components with respect to their reproducibility that strengthens the biological interpretation. Computing too few components (much less than MSTD) is not optimal for interpretability of the results. The components ranked within MSTD range have more chances to be reproduced in independent studies.
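Ranking components by their reproducibility across repeated ICA runs can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes the components from each run are already available as plain lists of gene weights, takes the first run as the reference, and scores each reference component by its best absolute Pearson correlation in every other run. The function names and toy vectors are hypothetical, and the MSTD itself is not computed here.

```python
import math
from statistics import mean


def abs_corr(x, y):
    """Absolute Pearson correlation (sign is ignored because ICA components
    are only defined up to sign)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return abs(num / den) if den else 0.0


def rank_by_stability(runs):
    """Rank components of runs[0] by mean best-match correlation in other runs.

    runs: list of runs; each run is a list of components (lists of weights).
    Returns (component_index, stability) pairs, most stable first.
    """
    reference, others = runs[0], runs[1:]
    scores = []
    for idx, comp in enumerate(reference):
        best_matches = [max(abs_corr(comp, other) for other in run) for run in others]
        scores.append((idx, mean(best_matches)))
    return sorted(scores, key=lambda item: -item[1])


# Two toy runs over four genes (hypothetical values).
runs = [
    [[1, 2, 3, 4], [1, -1, 1, -1]],
    [[-2, -4, -6, -8], [1, -1, -1, 1]],
]
ranking = rank_by_stability(runs)
print(ranking)
```

Component 0 reappears (with flipped sign) in the second run and ranks first; component 1 has no close counterpart and receives a lower stability score.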
TCW: Transcriptome Computational Workbench
Soderlund, Carol; Nelson, William; Willer, Mark; Gang, David R.
2013-01-01
Background The analysis of transcriptome data involves many steps and various programs, along with organization of large amounts of data and results. Without a methodical approach for storage, analysis and query, the resulting ad hoc analysis can lead to human error, loss of data and results, inefficient use of time, and lack of verifiability, repeatability, and extensibility. Methodology The Transcriptome Computational Workbench (TCW) provides Java graphical interfaces for methodical analysis for both single and comparative transcriptome data without the use of a reference genome (e.g. for non-model organisms). The singleTCW interface steps the user through importing transcript sequences (e.g. Illumina) or assembling long sequences (e.g. Sanger, 454, transcripts), annotating the sequences, and performing differential expression analysis using published statistical programs in R. The data, metadata, and results are stored in a MySQL database. The multiTCW interface builds a comparison database by importing sequence and annotation from one or more single TCW databases, executes the ESTscan program to translate the sequences into proteins, and then incorporates one or more clusterings, where the clustering options are to execute the orthoMCL program, compute transitive closure, or import clusters. Both singleTCW and multiTCW allow extensive query and display of the results, where singleTCW displays the alignment of annotation hits to transcript sequences, and multiTCW displays multiple transcript alignments with MUSCLE or pairwise alignments. The query programs can be executed on the desktop for fastest analysis, or from the web for sharing the results. Conclusion It is now affordable to buy a multi-processor machine, and easy to install Java and MySQL. By simply downloading the TCW, the user can interactively analyze, query and view their data. The TCW allows in-depth data mining of the results, which can lead to a better understanding of the transcriptome. 
TCW is freely available from www.agcol.arizona.edu/software/tcw. PMID:23874959
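The transitive-closure clustering option mentioned in the abstract can be illustrated with a short union-find sketch in Python. This is not TCW code (TCW is a Java application); it only shows the general idea, assuming the input is a list of sequence pairs that were judged similar. The function name and sequence identifiers are hypothetical.

```python
def transitive_closure_clusters(pairs):
    """Group sequences so that any two linked by a chain of pairs share a cluster.

    pairs: iterable of (seq_a, seq_b) tuples, e.g. reciprocal similarity hits.
    Returns a list of clusters, each a sorted list of sequence identifiers.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for seq in list(parent):
        clusters.setdefault(find(seq), set()).add(seq)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


# Hypothetical hit pairs between transcripts of three species.
hits = [("sp1_t1", "sp2_t7"), ("sp2_t7", "sp3_t2"), ("sp1_t9", "sp3_t4")]
clusters = transitive_closure_clusters(hits)
print(clusters)
```

Here sp1_t1 and sp3_t2 end up in the same cluster even though they were never paired directly, because both are linked to sp2_t7.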
This week, we are excited to announce the launch of the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) Proteogenomics Computational DREAM Challenge. The aim of this Challenge is to encourage the generation of computational methods for extracting information from the cancer proteome and for linking those data to genomic and transcriptomic information. The specific goals are to predict proteomic and phosphoproteomic data from other multiple data types including transcriptomics and genetics.
Joyce, Blake L.; Haug-Baltzell, Asher K.; Hulvey, Jonathan P.; McCarthy, Fiona; Devisetty, Upendra Kumar; Lyons, Eric
2017-01-01
This workflow allows novice researchers to leverage advanced computational resources such as cloud computing to carry out pairwise comparative transcriptomics. It also serves as a primer for biologists to develop data scientist computational skills, e.g. executing bash commands, visualization and management of large data sets. All command line code and further explanations of each command or step can be found on the wiki (https://wiki.cyverse.org/wiki/x/dgGtAQ). The Discovery Environment and Atmosphere platforms are connected together through the CyVerse Data Store. As such, once the initial raw sequencing data has been uploaded there is no more need to transfer large data files over an Internet connection, minimizing the amount of time needed to conduct analyses. This protocol is designed to analyze only two experimental treatments or conditions. Differential gene expression analysis is conducted through pairwise comparisons, and will not be suitable to test multiple factors. This workflow is also designed to be manual rather than automated. Each step must be executed and investigated by the user, yielding a better understanding of data and analytical outputs, and therefore better results for the user. Once complete, this protocol will yield de novo assembled transcriptome(s) for underserved (non-model) organisms without the need to map to previously assembled reference genomes (which are usually not available in underserved organism). These de novo transcriptomes are further used in pairwise differential gene expression analysis to investigate genes differing between two experimental conditions. Differentially expressed genes are then functionally annotated to understand the genetic response organisms have to experimental conditions. In total, the data derived from this protocol is used to test hypotheses about biological responses of underserved organisms. PMID:28518075
2010-01-01
Background Recent developments in high-throughput methods of analyzing transcriptomic profiles are promising for many areas of biology, including ecophysiology. However, although commercial microarrays are available for most common laboratory models, transcriptome analysis in non-traditional model species still remains a challenge. Indeed, the signal resulting from heterologous hybridization is low and difficult to interpret because of the weak complementarity between probe and target sequences, especially when no microarray dedicated to a genetically close species is available. Results We show here that transcriptome analysis in a species genetically distant from laboratory models is made possible by using MAXRS, a new method of analyzing heterologous hybridization on microarrays. This method takes advantage of the design of several commercial microarrays, with different probes targeting the same transcript. To illustrate and test this method, we analyzed the transcriptome of king penguin pectoralis muscle hybridized to Affymetrix chicken microarrays, two organisms separated by an evolutionary distance of approximately 100 million years. The differential gene expression observed between different physiological situations computed by MAXRS was confirmed by real-time PCR on 10 genes out of 11 tested. Conclusions MAXRS appears to be an appropriate method for gene expression analysis under heterologous hybridization conditions. PMID:20509979
ERIC Educational Resources Information Center
Grenville-Briggs, Laura J.; Stansfield, Ian
2011-01-01
This report describes a linked series of Masters-level computer practical workshops. They comprise an advanced functional genomics investigation, based upon analysis of a microarray dataset probing yeast DNA damage responses. The workshops require the students to analyse highly complex transcriptomics datasets, and were designed to stimulate…
Gonzalez, Sergio; Clavijo, Bernardo; Rivarola, Máximo; Moreno, Patricio; Fernandez, Paula; Dopazo, Joaquín; Paniego, Norma
2017-02-22
In recent years, applications based on massively parallelized RNA sequencing (RNA-seq) have become valuable approaches for studying non-model species, such as those without a fully sequenced genome. RNA-seq is a useful tool for detecting novel transcripts and genetic variations and for evaluating differential gene expression by digital measurements. The large and complex datasets resulting from functional genomic experiments represent a challenge in data processing, management, and analysis. This problem is especially significant for small research groups working with non-model species. We developed a web-based application, called ATGC transcriptomics, with a flexible and adaptable interface that allows users to work with next-generation sequencing (NGS) transcriptomic analysis results using an ontology-driven database. This new application simplifies data exploration, visualization, and integration for a better comprehension of the results. ATGC transcriptomics gives non-expert computer users and small research groups access to a scalable storage option and simple data integration, including database administration and management. The software is freely available under the terms of the GNU public license at http://atgcinta.sourceforge.net .
Lloréns-Rico, Verónica; Serrano, Luis; Lluch-Senar, Maria
2014-07-29
RNA sequencing methods have already altered our view of the extent and complexity of bacterial and eukaryotic transcriptomes, revealing rare transcript isoforms (circular RNAs, RNA chimeras) that could play an important role in their biology. We performed an analysis of chimera formation by four different computational approaches, including a custom designed pipeline, to study the transcriptomes of M. pneumoniae and P. aeruginosa, as well as mixtures of both. We found that rare transcript isoforms detected by conventional pipelines of analysis could be artifacts of the experimental procedure used in the library preparation, and that they are protocol-dependent. By using a customized pipeline we show that optimal library preparation protocol and the pipeline to analyze the results are crucial to identify real chimeric RNAs.
Microarray-Based Gene Expression Analysis for Veterinary Pathologists: A Review.
Raddatz, Barbara B; Spitzbarth, Ingo; Matheis, Katja A; Kalkuhl, Arno; Deschl, Ulrich; Baumgärtner, Wolfgang; Ulrich, Reiner
2017-09-01
High-throughput, genome-wide transcriptome analysis is now commonly used in all fields of life science research and is on the cusp of medical and veterinary diagnostic application. Transcriptomic methods such as microarrays and next-generation sequencing generate enormous amounts of data. The pathogenetic expertise acquired from understanding of general pathology provides veterinary pathologists with a profound background, which is essential in translating transcriptomic data into meaningful biological knowledge, thereby leading to a better understanding of underlying disease mechanisms. The scientific literature concerning high-throughput data-mining techniques usually addresses mathematicians or computer scientists as the target audience. In contrast, the present review provides the reader with a clear and systematic basis from a veterinary pathologist's perspective. Therefore, the aims are (1) to introduce the reader to the necessary methodological background; (2) to introduce the sequential steps commonly performed in a microarray analysis including quality control, annotation, normalization, selection of differentially expressed genes, clustering, gene ontology and pathway analysis, analysis of manually selected genes, and biomarker discovery; and (3) to provide references to publicly available and user-friendly software suites. In summary, the data analysis methods presented within this review will enable veterinary pathologists to analyze high-throughput transcriptome data obtained from their own experiments, supplemental data that accompany scientific publications, or public repositories in order to obtain a more in-depth insight into underlying disease mechanisms.
Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.
Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin
2013-09-22
High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.
The cancer transcriptome is shaped by genetic changes, variation in gene transcription, mRNA processing, editing and stability, and the cancer microbiome. Deciphering this variation and understanding its implications on tumorigenesis requires sophisticated computational analyses. Most RNA-Seq analyses rely on methods that first map short reads to a reference genome, and then compare them to annotated transcripts or assemble them. However, this strategy can be limited when the cancer genome is substantially different than the reference or for detecting sequences from the cancer microbiome.
Sreedharan, Vipin T; Schultheiss, Sebastian J; Jean, Géraldine; Kahles, André; Bohnert, Regina; Drewe, Philipp; Mudrakarta, Pramod; Görnitz, Nico; Zeller, Georg; Rätsch, Gunnar
2014-05-01
We present Oqtans, an open-source workbench for quantitative transcriptome analysis, that is integrated in Galaxy. Its distinguishing features include customizable computational workflows and a modular pipeline architecture that facilitates comparative assessment of tool and data quality. Oqtans integrates an assortment of machine learning-powered tools into Galaxy, which show superior or equal performance to state-of-the-art tools. Implemented tools comprise a complete transcriptome analysis workflow: short-read alignment, transcript identification/quantification and differential expression analysis. Oqtans and Galaxy facilitate persistent storage, data exchange and documentation of intermediate results and analysis workflows. We illustrate how Oqtans aids the interpretation of data from different experiments in easy to understand use cases. Users can easily create their own workflows and extend Oqtans by integrating specific tools. Oqtans is available as (i) a cloud machine image with a demo instance at cloud.oqtans.org, (ii) a public Galaxy instance at galaxy.cbio.mskcc.org, (iii) a git repository containing all installed software (oqtans.org/git); most of which is also available from (iv) the Galaxy Toolshed and (v) a share string to use along with Galaxy CloudMan.
NASA Astrophysics Data System (ADS)
Eom, Hyun-Jeong; Liu, Yuedan; Kwak, Gyu-Suk; Heo, Muyoung; Song, Kyung Seuk; Chung, Yun Doo; Chon, Tae-Soo; Choi, Jinhee
2017-06-01
We conducted an inhalation toxicity test on the alternative animal model, Drosophila melanogaster, to investigate potential hazards of indoor air pollution. The inhalation toxicity of toluene and formaldehyde was investigated using comprehensive transcriptomics and computational behavior analyses. The ingenuity pathway analysis (IPA) based on microarray data suggests the involvement of pathways related to immune response, stress response, and metabolism in formaldehyde and toluene exposure based on hub molecules. We conducted a toxicity test using mutants of the representative genes in these pathways to explore the toxicological consequences of alterations of these pathways. Furthermore, extensive computational behavior analysis showed that exposure to either toluene or formaldehyde reduced most of the behavioral parameters of both wild-type and mutants. Interestingly, behavioral alteration caused by toluene or formaldehyde exposure was most severe in the p38b mutant, suggesting that the defects in the p38 pathway underlie behavioral alteration. Overall, the results indicate that exposure to toluene and formaldehyde via inhalation causes severe toxicity in Drosophila, by inducing significant alterations in gene expression and behavior, suggesting that Drosophila can be used as a potential alternative model in inhalation toxicity screening.
Eom, Hyun-Jeong; Liu, Yuedan; Kwak, Gyu-Suk; Heo, Muyoung; Song, Kyung Seuk; Chung, Yun Doo; Chon, Tae-Soo; Choi, Jinhee
2017-01-01
We conducted an inhalation toxicity test on the alternative animal model, Drosophila melanogaster, to investigate potential hazards of indoor air pollution. The inhalation toxicity of toluene and formaldehyde was investigated using comprehensive transcriptomics and computational behavior analyses. The ingenuity pathway analysis (IPA) based on microarray data suggests the involvement of pathways related to immune response, stress response, and metabolism in formaldehyde and toluene exposure based on hub molecules. We conducted a toxicity test using mutants of the representative genes in these pathways to explore the toxicological consequences of alterations of these pathways. Furthermore, extensive computational behavior analysis showed that exposure to either toluene or formaldehyde reduced most of the behavioral parameters of both wild-type and mutants. Interestingly, behavioral alteration caused by toluene or formaldehyde exposure was most severe in the p38b mutant, suggesting that the defects in the p38 pathway underlie behavioral alteration. Overall, the results indicate that exposure to toluene and formaldehyde via inhalation causes severe toxicity in Drosophila, by inducing significant alterations in gene expression and behavior, suggesting that Drosophila can be used as a potential alternative model in inhalation toxicity screening. PMID:28621308
Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments
Maza, Elie; Frasse, Pierre; Senin, Pavel; Bouzayen, Mondher; Zouine, Mohamed
2013-01-01
In recent years, RNA-Seq technologies have become a powerful tool for transcriptome studies. However, computational methods dedicated to the analysis of high-throughput sequencing data are yet to be standardized. In particular, it is known that the choice of a normalization procedure leads to a great variability in results of differential gene expression analysis. The present study compares the most widespread normalization procedures and proposes a novel one aiming at removing an inherent bias of studied transcriptomes related to their relative size. Comparisons of the normalization procedures are performed on real and simulated data sets. Analyses of real RNA-Seq data sets, performed with all the different normalization methods, show that only 50% of significantly differentially expressed genes are common. This result highlights the influence of the normalization step on the differential expression analysis. Analyses of real and simulated data sets give similar results, showing 3 different groups of procedures having the same behavior. The group including the novel method named “Median Ratio Normalization” (MRN) gives the lowest number of false discoveries. Within this group the MRN method is less sensitive to the modification of parameters related to the relative size of transcriptomes such as the number of down- and upregulated genes and the gene expression levels. The newly proposed MRN method efficiently deals with intrinsic bias resulting from relative size of studied transcriptomes. Validation with real and simulated data sets confirmed that MRN is more consistent and robust than existing methods. PMID:26442135
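To make the notion of a normalization procedure concrete, the sketch below computes per-sample scaling factors with the widely used median-of-ratios approach (the scheme popularized by DESeq). This is explicitly not the MRN method proposed in the abstract, whose details are not given here; it is a related, commonly used procedure shown only for illustration. The function name and the toy counts are hypothetical.

```python
import math
from statistics import mean, median


def median_of_ratios_size_factors(counts):
    """Estimate one scaling factor per sample from raw read counts.

    counts: dict mapping sample name to a list of raw counts, with genes in
    the same order in every sample. Genes with a zero count in any sample
    are skipped because their log is undefined.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    usable = [g for g in range(n_genes) if all(counts[s][g] > 0 for s in samples)]
    # Log of the per-gene geometric mean across samples (the pseudo-reference).
    log_ref = {g: mean(math.log(counts[s][g]) for s in samples) for g in usable}
    return {
        s: math.exp(median(math.log(counts[s][g]) - log_ref[g] for g in usable))
        for s in samples
    }


# Toy data: sample B was sequenced twice as deeply as sample A.
counts = {"A": [10, 20, 30, 0], "B": [20, 40, 60, 5]}
factors = median_of_ratios_size_factors(counts)
normalized = {s: [c / factors[s] for c in counts[s]] for s in counts}
print(factors)
```

Dividing each sample's counts by its factor puts the first three genes on the same scale in both samples, which is the behavior a depth-correcting normalization is expected to show.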
Fang, Xiang; Li, Ning-qiu; Fu, Xiao-zhe; Li, Kai-bin; Lin, Qiang; Liu, Li-hui; Shi, Cun-bin; Wu, Shu-qin
2015-07-01
As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the requirement of high-performance computers rather than common personal computers for constructing a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules, including genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches, GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes of system temperature, total energy, root mean square deviation and conformation of the loops during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects.
Melicher, Dacotah; Torson, Alex S; Dworkin, Ian; Bowsher, Julia H
2014-03-12
The Sepsidae family of flies is a model for investigating how sexual selection shapes courtship and sexual dimorphism in a comparative framework. However, like many non-model systems, there are few molecular resources available. Large-scale sequencing and assembly have not been performed in any sepsid, and the lack of a closely related genome makes investigation of gene expression challenging. Our goal was to develop an automated pipeline for de novo transcriptome assembly, and to use that pipeline to assemble and analyze the transcriptome of the sepsid Themira biloba. Our bioinformatics pipeline uses cloud computing services to assemble and analyze the transcriptome with off-site data management, processing, and backup. It uses a multiple k-mer length approach combined with a second meta-assembly to extend transcripts and recover more bases of transcript sequences than standard single k-mer assembly. We used 454 sequencing to generate 1.48 million reads from cDNA generated from embryo, larva, and pupae of T. biloba and assembled a transcriptome consisting of 24,495 contigs. Annotation identified 16,705 transcripts, including those involved in embryogenesis and limb patterning. We assembled transcriptomes from an additional three non-model organisms to demonstrate that our pipeline assembled a higher-quality transcriptome than single k-mer approaches across multiple species. The pipeline we have developed for assembly and analysis increases contig length, recovers unique transcripts, and assembles more base pairs than other methods through the use of a meta-assembly. The T. biloba transcriptome is a critical resource for performing large-scale RNA-Seq investigations of gene expression patterns, and is the first transcriptome sequenced in this Dipteran family.
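Claims such as "increases contig length" are often checked with a contiguity summary like N50. The abstract does not say which metrics the authors used, so the sketch below is only an assumption about how two assemblies might be compared; the contig lengths are invented for illustration.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


# Hypothetical contig lengths for a single k-mer assembly and a meta-assembly.
single_kmer = [100, 200, 300, 400]
meta_assembly = [250, 350, 500, 900]

summary = {
    "single_kmer": {"bases": sum(single_kmer), "n50": n50(single_kmer)},
    "meta_assembly": {"bases": sum(meta_assembly), "n50": n50(meta_assembly)},
}
print(summary)
```

N50 alone can reward over-merged contigs, so it is usually read alongside total assembled bases and annotation-based measures.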
FIT: statistical modeling tool for transcriptome dynamics under fluctuating field conditions
Iwayama, Koji; Aisaka, Yuri; Kutsuna, Natsumaro
2017-01-01
Abstract Motivation: Considerable attention has been given to the quantification of environmental effects on organisms. In natural conditions, environmental factors are continuously changing in a complex manner. To reveal the effects of such environmental variations on organisms, transcriptome data in field environments have been collected and analyzed. Nagano et al. proposed a model that describes the relationship between transcriptomic variation and environmental conditions and demonstrated the capability to predict transcriptome variation in rice plants. However, the computational cost of parameter optimization has prevented its wide application. Results: We propose a new statistical model and efficient parameter optimization based on the previous study. We developed and released FIT, an R package that offers functions for parameter optimization and transcriptome prediction. The proposed method achieves comparable or better prediction performance within a shorter computational time than the previous method. The package will facilitate the study of the environmental effects on transcriptomic variation in field conditions. Availability and Implementation: Freely available from CRAN (https://cran.r-project.org/web/packages/FIT/). Contact: anagano@agr.ryukoku.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online PMID:28158396
Karakülah, Gökhan
2017-06-28
Novel transcript discovery through RNA sequencing has substantially improved our understanding of the transcriptome dynamics of biological systems. Endogenous target mimicry (eTM) transcripts, a novel class of regulatory molecules, bind to their target microRNAs (miRNAs) by base pairing and block their biological activity. The objective of this study was to provide a computational analysis framework for the prediction of putative eTM sequences in plants, and as an example, to discover previously un-annotated eTMs in the Prunus persica (peach) transcriptome. To this end, two public peach transcriptome libraries downloaded from the Sequence Read Archive (SRA) and a previously published set of long non-coding RNAs (lncRNAs) were investigated with a multi-step analysis pipeline, and 44 putative eTMs were found. Additionally, an eTM-miRNA-mRNA regulatory network module associated with peach fruit organ development was built via integration of the miRNA target information and predicted eTM-miRNA interactions. My findings suggest that one of the most widely expressed miRNA families among diverse plant species, miR156, might be sponged by seven putative eTMs. In addition, the study indicates that eTMs potentially play roles in the regulation of developmental processes in peach fruit by targeting specific miRNAs. In conclusion, by following the step-by-step instructions provided in this study, novel eTMs can be identified and annotated effectively in public plant transcriptome libraries.
Brain transcriptome atlases: a computational perspective.
Mahfouz, Ahmed; Huisman, Sjoerd M H; Lelieveldt, Boudewijn P F; Reinders, Marcel J T
2017-05-01
The immense complexity of the mammalian brain is largely reflected in the underlying molecular signatures of its billions of cells. Brain transcriptome atlases provide valuable insights into gene expression patterns across different brain areas throughout the course of development. Such atlases allow researchers to probe the molecular mechanisms which define neuronal identities, neuroanatomy, and patterns of connectivity. Despite the immense effort put into generating such atlases, to answer fundamental questions in neuroscience, an even greater effort is needed to develop methods to probe the resulting high-dimensional multivariate data. We provide a comprehensive overview of the various computational methods used to analyze brain transcriptome atlases.
Walker, Joseph F; Yang, Ya; Feng, Tao; Timoneda, Alfonso; Mikenas, Jessica; Hutchison, Vera; Edwards, Caroline; Wang, Ning; Ahluwalia, Sonia; Olivieri, Julia; Walker-Hale, Nathanael; Majure, Lucas C; Puente, Raúl; Kadereit, Gudrun; Lauterbach, Maximilian; Eggli, Urs; Flores-Olvera, Hilda; Ochoterena, Helga; Brockington, Samuel F; Moore, Michael J; Smith, Stephen A
2018-03-01
The Caryophyllales contain ~12,500 species and are known for their cosmopolitan distribution, convergence of trait evolution, and extreme adaptations. Some relationships within the Caryophyllales, like those of many large plant clades, remain unclear, and phylogenetic studies often recover alternative hypotheses. We explore the utility of broad and dense transcriptome sampling across the order for resolving evolutionary relationships in Caryophyllales. We generated 84 transcriptomes and combined these with 224 publicly available transcriptomes to perform a phylogenomic analysis of Caryophyllales. To overcome the computational challenge of ortholog detection in such a large data set, we developed an approach for clustering gene families that allowed us to analyze >300 transcriptomes and genomes. We then inferred the species relationships using multiple methods and performed gene-tree conflict analyses. Our phylogenetic analyses resolved many clades with strong support, but also showed significant gene-tree discordance. This discordance is not only a common feature of phylogenomic studies, but also represents an opportunity to understand processes that have structured phylogenies. We also found taxon sampling influences species-tree inference, highlighting the importance of more focused studies with additional taxon sampling. Transcriptomes are useful both for species-tree inference and for uncovering evolutionary complexity within lineages. Through analyses of gene-tree conflict and multiple methods of species-tree inference, we demonstrate that phylogenomic data can provide unparalleled insight into the evolutionary history of Caryophyllales. We also discuss a method for overcoming computational challenges associated with homolog clustering in large data sets. © 2018 The Authors. American Journal of Botany is published by Wiley Periodicals, Inc. on behalf of the Botanical Society of America.
Buettner, Florian; Natarajan, Kedar N; Casale, F Paolo; Proserpio, Valentina; Scialdone, Antonio; Theis, Fabian J; Teichmann, Sarah A; Marioni, John C; Stegle, Oliver
2015-02-01
Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new subpopulations of cells can be found. However, the effects of potential confounding factors, such as the cell cycle, on the heterogeneity of gene expression and therefore on the ability to robustly identify subpopulations remain unclear. We present and validate a computational approach that uses latent variable models to account for such hidden factors. We show that our single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells. Our approach can be used not only to identify cellular subpopulations but also to tease apart different sources of gene expression heterogeneity in single-cell transcriptomes.
Profiling the venom gland transcriptomes of Costa Rican snakes by 454 pyrosequencing
2011-01-01
Background A long term research goal of venomics, of applied importance for improving current antivenom therapy, but also for drug discovery, is to understand the pharmacological potential of venoms. Individually or combined, proteomic and transcriptomic studies have demonstrated their feasibility to explore in depth the molecular diversity of venoms. In the absence of genome sequence, transcriptomes also represent valuable searchable databases for proteomic projects. Results The venom gland transcriptomes of 8 Costa Rican taxa from 5 genera (Crotalus, Bothrops, Atropoides, Cerrophidion, and Bothriechis) of pitvipers were investigated using high-throughput 454 pyrosequencing. 100,394 out of 330,010 masked reads produced significant hits in the available databases. 5,165,220 nucleotides (8.27%) were masked by RepeatMasker, the vast majority of which corresponded to class I (retroelements) and class II (DNA transposons) mobile elements. BLAST hits included 79,991 matches to entries of the taxonomic suborder Serpentes, of which 62,433 displayed similarity to documented venom proteins. Strong discrepancies between the transcriptome-computed and the proteome-gathered toxin compositions were obvious at first sight. Although the reasons underlying this discrepancy are elusive, since no clear trend within or between species is apparent, the data indicate that individual mRNA species may be translationally controlled in a species-dependent manner. The minimum number of genes from each toxin family transcribed into the venom gland transcriptome of each species was calculated from multiple alignments of reads matched to a full-length reference sequence of each toxin family. Reads encoding ORF regions of Kazal-type inhibitor-like proteins were uniquely found in Bothriechis schlegelii and B. lateralis transcriptomes, suggesting a genus-specific recruitment event during the early-Middle Miocene. A transcriptome-based cladogram supports the large divergence between A. mexicanus and A. picadoi, and a closer kinship between A. mexicanus and C. godmani. Conclusions Our comparative next-generation sequencing (NGS) analysis reveals taxon-specific trends governing the formulation of the venom arsenal. Knowledge of the venom proteome provides hints on the translation efficiency of toxin-coding transcripts, contributing thereby to a more accurate interpretation of the transcriptome. The application of NGS to the analysis of snake venom transcriptomes may represent the tool for opening the door to systems venomics. PMID:21605378
Trapnell, Cole; Roberts, Adam; Goff, Loyal; Pertea, Geo; Kim, Daehwan; Kelley, David R; Pimentel, Harold; Salzberg, Steven L; Rinn, John L; Pachter, Lior
2012-01-01
Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions. This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol's execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ~1 h of hands-on time. PMID:22383036
PIVOT: platform for interactive analysis and visualization of transcriptomics data.
Zhu, Qin; Fisher, Stephen A; Dueck, Hannah; Middleton, Sarah; Khaladkar, Mugdha; Kim, Junhyong
2018-01-05
Many R packages have been developed for transcriptome analysis but their use often requires familiarity with R and integrating results of different packages requires scripts to wrangle the datatypes. Furthermore, exploratory data analyses often generate multiple derived datasets such as data subsets or data transformations, which can be difficult to track. Here we present PIVOT, an R-based platform that wraps open source transcriptome analysis packages with a uniform user interface and graphical data management that allows non-programmers to interactively explore transcriptomics data. PIVOT supports more than 40 popular open source packages for transcriptome analysis and provides an extensive set of tools for statistical data manipulations. A graph-based visual interface is used to represent the links between derived datasets, allowing easy tracking of data versions. PIVOT further supports automatic report generation, publication-quality plots, and program/data state saving, such that all analysis can be saved, shared and reproduced. PIVOT will allow researchers with broad background to easily access sophisticated transcriptome analysis tools and interactively explore transcriptome datasets.
Trinity | Informatics Technology for Cancer Research (ITCR)
Trinity Cancer Transcriptome Analysis Toolkit (CTAT) including de novo transcriptome assembly with downstream support for expression analysis and focused analyses on cancer transcriptomes, incorporating mutation and fusion transcript discovery, and single cell analysis.
Guedes, Rafael Lucas Muniz; Rodrigues, Carla Monadeli Filgueira; Coatnoan, Nicolas; Cosson, Alain; Cadioli, Fabiano Antonio; Garcia, Herakles Antonio; Gerber, Alexandra Lehmkuhl; Machado, Rosangela Zacarias; Minoprio, Paola Marcella Camargo; Teixeira, Marta Maria Geraldes; de Vasconcelos, Ana Tereza Ribeiro
2018-02-27
Trypanosoma vivax is a parasite widespread across Africa and South America. Immunological methods using recombinant antigens have been developed aiming at specific and sensitive detection of infections caused by T. vivax. Here, we sequenced for the first time the transcriptome of a virulent T. vivax strain (Lins), isolated from an outbreak of severe disease in South America (Brazil) and performed a computational integrated analysis of genome, transcriptome and in silico predictions to identify and characterize putative linear B-cell epitopes from African and South American T. vivax. A total of 2278, 3936 and 4062 linear B-cell epitopes were respectively characterized for the transcriptomes of T. vivax LIEM-176 (Venezuela), T. vivax IL1392 (Nigeria) and T. vivax Lins (Brazil) and 4684 for the genome of T. vivax Y486 (Nigeria). The results presented are a valuable theoretical source that may pave the way for highly sensitive and specific diagnostic tools. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Jia, Zhilong; Liu, Ying; Guan, Naiyang; Bo, Xiaochen; Luo, Zhigang; Barnes, Michael R
2016-05-27
Drug repositioning, finding new indications for existing drugs, has gained much recent attention as a potentially efficient and economical strategy for accelerating new therapies into the clinic. Although improvement in the sensitivity of computational drug repositioning methods has identified numerous credible repositioning opportunities, few have been progressed. Arguably the "black box" nature of drug action in a new indication is one of the main blocks to progression, highlighting the need for methods that inform on the broader target mechanism in the disease context. We demonstrate that the analysis of co-expressed genes may be a critical first step towards illumination of both disease pathology and mode of drug action. We achieve this using a novel framework, co-expressed gene-set enrichment analysis (cogena) for co-expression analysis of gene expression signatures and gene set enrichment analysis of co-expressed genes. The cogena framework enables simultaneous, pathway driven, disease and drug repositioning analysis. Cogena can be used to illuminate coordinated changes within disease transcriptomes and identify drugs acting mechanistically within this framework. We illustrate this using a psoriatic skin transcriptome, as an exemplar, and recover two widely used Psoriasis drugs (Methotrexate and Ciclosporin) with distinct modes of action. Cogena out-performs the results of Connectivity Map and NFFinder webservers in similar disease transcriptome analyses. Furthermore, we investigated the literature support for the other top-ranked compounds to treat psoriasis and showed how the outputs of cogena analysis can contribute new insight to support the progression of drugs into the clinic. We have made cogena freely available within Bioconductor or https://github.com/zhilongjia/cogena . 
In conclusion, by targeting co-expressed genes within disease transcriptomes, cogena offers novel biological insight, which can be effectively harnessed for drug discovery and repositioning, allowing the grouping and prioritisation of drug repositioning candidates on the basis of putative mode of action.
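The enrichment step applied to a cluster of co-expressed genes can be illustrated with a one-sided hypergeometric test. This is a sketch under stated assumptions: the hypergeometric test is a common choice for gene set enrichment and is used here for illustration, and the function name `enrichment_p_value` and the toy gene identifiers are hypothetical rather than part of the cogena API.

```python
from math import comb


def enrichment_p_value(cluster, gene_set, universe):
    """One-sided hypergeometric p-value for over-representation.

    Probability of seeing at least the observed overlap between `cluster`
    and `gene_set` when len(cluster) genes are drawn at random from `universe`.
    """
    universe = set(universe)
    cluster = set(cluster) & universe
    gene_set = set(gene_set) & universe
    n_universe = len(universe)
    n_set = len(gene_set)
    n_cluster = len(cluster)
    overlap = len(cluster & gene_set)
    total = comb(n_universe, n_cluster)
    upper = min(n_set, n_cluster)
    tail = sum(
        comb(n_set, i) * comb(n_universe - n_set, n_cluster - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 20 genes, a 5-gene pathway, a 4-gene co-expression cluster
# that shares 3 genes with the pathway.
universe = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
cluster = ["g0", "g1", "g2", "g10"]
p = enrichment_p_value(cluster, pathway, universe)  # equals 155 / 4845
```

In practice the test would be repeated for every cluster and gene set pair, followed by a multiple-testing correction.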
NASA Astrophysics Data System (ADS)
Blasi, Thomas; Buettner, Florian; Strasser, Michael K.; Marr, Carsten; Theis, Fabian J.
2017-06-01
Accessing gene expression at a single-cell level has unraveled often large heterogeneity among seemingly homogeneous cells, which remains obscured when using traditional population-based approaches. The computational analysis of single-cell transcriptomics data, however, still imposes unresolved challenges with respect to normalization, visualization and modeling the data. One such issue is differences in cell size, which introduce additional variability into the data and for which appropriate normalization techniques are needed. Otherwise, these differences in cell size may obscure genuine heterogeneities among cell populations and lead to overdispersed steady-state distributions of mRNA transcript numbers. We present cgCorrect, a statistical framework to correct for differences in cell size that are due to cell growth in single-cell transcriptomics data. We derive the probability for the cell-growth-corrected mRNA transcript number given the measured, cell size-dependent mRNA transcript number, based on the assumption that the average number of transcripts in a cell increases proportionally to the cell’s volume during the cell cycle. cgCorrect can be used for both data normalization and to analyze the steady-state distributions used to infer the gene expression mechanism. We demonstrate its applicability on both simulated data and single-cell quantitative real-time polymerase chain reaction (PCR) data from mouse blood stem and progenitor cells (and to quantitative single-cell RNA-sequencing data obtained from mouse embryonic stem cells). We show that correcting for differences in cell size affects the interpretation of the data obtained by typically performed computational analysis.
Lott, Steffen C; Wolfien, Markus; Riege, Konstantin; Bagnacani, Andrea; Wolkenhauer, Olaf; Hoffmann, Steve; Hess, Wolfgang R
2017-11-10
RNA-Sequencing (RNA-Seq) has become a widely used approach to study quantitative and qualitative aspects of transcriptome data. The variety of RNA-Seq protocols, experimental study designs and the characteristic properties of the organisms under investigation greatly affect downstream and comparative analyses. In this review, we aim to explain the impact of structured pre-selection, classification and integration of best-performing tools within modularized data analysis workflows and ready-to-use computing infrastructures towards experimental data analyses. We highlight examples for workflows and use cases that are presented for pro-, eukaryotic and mixed dual RNA-Seq (meta-transcriptomics) experiments. In addition, we are summarizing the expertise of the laboratories participating in the project consortium "Structured Analysis and Integration of RNA-Seq experiments" (de.STAIR) and its integration with the Galaxy-workbench of the RNA Bioinformatics Center (RBC). Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Koseki, Jun; Matsui, Hidetoshi; Konno, Masamitsu; Nishida, Naohiro; Kawamoto, Koichi; Kano, Yoshihiro; Mori, Masaki; Doki, Yuichiro; Ishii, Hideshi
2016-02-01
Bioinformatics and computational modelling are expected to offer innovative approaches in human medical science. In the present study, we performed computational analyses and made predictions using transcriptome and metabolome datasets obtained from fluorescence-based visualisations of chemotherapy-resistant cancer stem cells (CSCs) in the human oesophagus. This approach revealed an uncharacterized role for the ornithine metabolic pathway in the survival of chemotherapy-resistant CSCs. The present study strengthens the rationale for further characterisation that may lead to the discovery of innovative drugs against robust CSCs.
Hanriot, Lucie; Keime, Céline; Gay, Nadine; Faure, Claudine; Dossat, Carole; Wincker, Patrick; Scoté-Blachon, Céline; Peyron, Christelle; Gandrillon, Olivier
2008-01-01
Background "Open" transcriptome analysis methods allow gene expression to be studied without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results In order to make a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome that is as reliable as a LongSAGE library sequenced by the Sanger method. PMID:18796152
Hur, Manhoi; Campbell, Alexis Ann; Almeida-de-Macedo, Marcia; Li, Ling; Ransom, Nick; Jose, Adarsh; Crispin, Matt; Nikolau, Basil J; Wurtele, Eve Syrkin
2013-04-01
Discovering molecular components and their functionality is key to the development of hypotheses concerning the organization and regulation of metabolic networks. The iterative experimental testing of such hypotheses is the trajectory that can ultimately enable accurate computational modelling and prediction of metabolic outcomes. This information can be particularly important for understanding the biology of natural products, whose metabolism itself is often only poorly defined. Here, we describe factors that must be in place to optimize the use of metabolomics in predictive biology. A key to achieving this vision is a collection of accurate time-resolved and spatially defined metabolite abundance data and associated metadata. One formidable challenge associated with metabolite profiling is the complexity and analytical limits associated with comprehensively determining the metabolome of an organism. Further, for metabolomics data to be efficiently used by the research community, it must be curated in publicly available metabolomics databases. Such databases require clear, consistent formats, easy access to data and metadata, data download, and accessible computational tools to integrate genome system-scale datasets. Although transcriptomics and proteomics integrate the linear predictive power of the genome, the metabolome represents the nonlinear, final biochemical products of the genome, which results from the intricate system(s) that regulate genome expression. For example, the relationship of metabolomics data to the metabolic network is confounded by redundant connections between metabolites and gene-products. However, connections among metabolites are predictable through the rules of chemistry. Therefore, enhancing the ability to integrate the metabolome with anchor-points in the transcriptome and proteome will enhance the predictive power of genomics data. 
We detail a public database repository for metabolomics, tools and approaches for statistical analysis of metabolomics data, and methods for integrating these datasets with transcriptomic data to create hypotheses concerning specialized metabolisms that generate the diversity in natural product chemistry. We discuss the importance of close collaborations among biologists, chemists, computer scientists and statisticians throughout the development of such integrated metabolism-centric databases and software. PMID:23447050
2012-01-01
Background We present a comprehensive transcriptome analysis of the fungus Ascosphaera apis, an economically important pathogen of the Western honey bee (Apis mellifera) that causes chalkbrood disease. Our goals were to further annotate the A. apis reference genome and to identify genes that are candidates for being differentially expressed during host infection versus axenic culture. Results We compared A. apis transcriptome sequence from mycelia grown on liquid or solid media with that dissected from host-infected tissue. 454 pyrosequencing provided 252 Mb of filtered sequence reads from both culture types that were assembled into 10,087 contigs. Transcript contigs, protein sequences from multiple fungal species, and ab initio gene predictions were included as evidence sources in the Maker gene prediction pipeline, resulting in 6,992 consensus gene models. A phylogeny based on 12 of these protein-coding loci further supported the taxonomic placement of Ascosphaera as sister to the core Onygenales. Several common protein domains were less abundant in A. apis compared with related ascomycete genomes, particularly cytochrome p450 and protein kinase domains. A novel gene family was identified that has expanded in some ascomycete lineages, but not others. We manually annotated genes with homologs in other fungal genomes that have known relevance to fungal virulence and life history. Functional categories of interest included genes involved in mating-type specification, intracellular signal transduction, and stress response. Computational and manual annotations have been made publicly available on the Bee Pests and Pathogens website. Conclusions This comprehensive transcriptome analysis substantially enhances our understanding of the A. apis genome and its expression during infection of honey bee larvae. It also provides resources for future molecular studies of chalkbrood disease and ultimately improved disease management. PMID:22747707
Kim, Taemook; Seo, Hogyu David; Hennighausen, Lothar; Lee, Daeyoup
2018-01-01
Octopus-toolkit is a stand-alone application for retrieving and processing large sets of next-generation sequencing (NGS) data in a single step. Octopus-toolkit is an automated set-up-and-analysis pipeline utilizing the Aspera, SRA Toolkit, FastQC, Trimmomatic, HISAT2, STAR, Samtools, and HOMER applications. All the applications are installed on the user's computer when the program starts. After installation, it can automatically retrieve original files of various epigenomic and transcriptomic data sets, including ChIP-seq, ATAC-seq, DNase-seq, MeDIP-seq, MNase-seq and RNA-seq, from the Gene Expression Omnibus data repository. The downloaded files can then be sequentially processed to generate BAM and BigWig files, which are used for advanced analyses and visualization. Currently, it can process NGS data from popular model genomes such as human (Homo sapiens), mouse (Mus musculus), dog (Canis lupus familiaris), plant (Arabidopsis thaliana), zebrafish (Danio rerio), fruit fly (Drosophila melanogaster), worm (Caenorhabditis elegans), and budding yeast (Saccharomyces cerevisiae). With the processed files from Octopus-toolkit, users can easily conduct meta-analyses of various data sets, motif searches for DNA-binding proteins, and the identification of differentially expressed genes and/or protein-binding sites with a few commands. Overall, Octopus-toolkit facilitates the systematic and integrative analysis of available epigenomic and transcriptomic NGS big data. PMID:29420797
Li, Jing-Woei; Lee, Heung-Man; Wang, Ying; Tong, Amy Hin-Yan; Yip, Kevin Y.; Tsui, Stephen Kwok-Wing; Lok, Si; Ozaki, Risa; Luk, Andrea O; Kong, Alice P. S.; So, Wing-Yee; Ma, Ronald C. W.; Chan, Juliana C. N.; Chan, Ting-Fung
2016-01-01
Protein interactions play significant roles in complex diseases. We analyzed the peripheral blood mononuclear cell (PBMC) transcriptome using a multi-method strategy. We constructed a tissue-specific interactome (T2Di) and identified 420 molecular signatures associated with T2D-related comorbidity and symptoms, mainly implicated in inflammation, adipogenesis, protein phosphorylation and hormonal secretion. Apart from explaining the residual associations within the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) study, the T2Di signatures were enriched in pathogenic cell type-specific regulatory elements related to fetal development, immunity and expression quantitative trait loci (eQTL). The T2Di revealed a novel locus near a well-established GWAS locus, AChE, in which SRRT interacts with JAZF1, a T2D-GWAS gene implicated in pancreatic function. The T2Di also included known anti-diabetic drug targets (e.g. PPARD, MAOB) and identified possible druggable targets (e.g. NCOR2, PDGFR). These T2Di signatures were validated by an independent computational method and by expression data from pancreatic islet, muscle and liver, with some of the signatures (CEBPB, SREBF1, MLST8, SRF, SRRT and SLC12A9) confirmed in PBMC from an independent cohort of 66 T2D and 66 control subjects. By combining prior knowledge and transcriptome analysis, we have constructed an interactome to explain the multi-layered regulatory pathways in T2D. PMID:27752041
Sma3s: A universal tool for easy functional annotation of proteomes and transcriptomes.
Casimiro-Soriguer, Carlos S; Muñoz-Mérida, Antonio; Pérez-Pulido, Antonio J
2017-06-01
The falling cost of next-generation sequencing has led to an enormous growth in the number of sequenced genomes and transcriptomes, allowing wet labs to obtain the sequences of their organisms of study. To make the most of these data, one of the first things that should be done is the functional annotation of the protein-coding genes. However, this has typically been a slow and tedious step that can involve the characterization of thousands of sequences. Sma3s is an accurate computational tool for annotating proteins in an unattended way. We have now developed a completely new version, which includes functionalities that will be of use for fundamental and applied science. Currently, the results provide functional categories such as biological processes, which are useful both for characterizing particular sequence datasets and for comparing results from different projects. One of the most important innovations is that it now has low computational requirements, and the complete annotation of a simple proteome or transcriptome usually takes around 24 hours on a personal computer. Sma3s has been tested with a large number of complete proteomes and transcriptomes, and it has demonstrated its potential in health science and other specific projects. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Zhang, Zijun; Xing, Yi
2017-09-19
Crosslinking or RNA immunoprecipitation followed by sequencing (CLIP-seq or RIP-seq) allows transcriptome-wide discovery of RNA regulatory sites. As CLIP-seq/RIP-seq reads are short, existing computational tools focus on uniquely mapped reads, while reads mapped to multiple loci are discarded. We present CLAM (CLIP-seq Analysis of Multi-mapped reads). CLAM uses an expectation-maximization algorithm to assign multi-mapped reads and calls peaks combining uniquely and multi-mapped reads. To demonstrate the utility of CLAM, we applied it to a wide range of public CLIP-seq/RIP-seq datasets involving numerous splicing factors, microRNAs and m6A RNA methylation. CLAM recovered a large number of novel RNA regulatory sites inaccessible by uniquely mapped reads. The functional significance of these sites was demonstrated by consensus motif patterns and association with alternative splicing (splicing factors), transcript abundance (AGO2) and mRNA half-life (m6A). CLAM provides a useful tool to discover novel protein-RNA interactions and RNA modification sites from CLIP-seq and RIP-seq data, and reveals the significant contribution of repetitive elements to the RNA regulatory landscape of the human transcriptome. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
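The expectation-maximization step described in the abstract above can be illustrated with a small sketch. This is not CLAM's implementation: the function name `em_assign`, the locus labels, and the simplified model (each multi-mapped read is split across its candidate loci in proportion to the current abundance estimate of each locus) are assumptions made for illustration only.

```python
def em_assign(unique_counts, multi_reads, n_iter=50):
    """Toy EM that fractionally assigns multi-mapped reads to loci.

    unique_counts: dict mapping locus -> number of uniquely mapped reads
    multi_reads:   list of non-empty tuples; each tuple holds the candidate
                   loci of one multi-mapped read
    Returns (abundance, weights); weights[i] maps locus -> fraction of read i.
    """
    loci = set(unique_counts)
    for candidates in multi_reads:
        loci.update(candidates)

    # Start from the unique counts plus a small pseudocount (hypothetical
    # choice) so that no locus begins with zero abundance.
    abundance = {locus: unique_counts.get(locus, 0) + 1e-3 for locus in loci}
    weights = []

    for _ in range(n_iter):
        # E-step: split each read in proportion to current locus abundance.
        weights = []
        for candidates in multi_reads:
            total = sum(abundance[locus] for locus in candidates)
            weights.append({locus: abundance[locus] / total for locus in candidates})

        # M-step: abundance = unique reads + fractional multi-mapped reads.
        abundance = {locus: float(unique_counts.get(locus, 0)) for locus in loci}
        for read_weights in weights:
            for locus, fraction in read_weights.items():
                abundance[locus] += fraction

    return abundance, weights


if __name__ == "__main__":
    # Ten reads map equally well to loci A and B; unique evidence favours A.
    abundance, weights = em_assign({"A": 9, "B": 1}, [("A", "B")] * 10)
    print(abundance, weights[0])
```

In this toy case the unique reads pull most of the ambiguous reads toward locus A. A real peak caller would then call peaks on the combined unique and fractional coverage, which this sketch does not attempt.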
Molinaro, Alyssa M; Pearson, Bret J
2016-04-27
The planarian Schmidtea mediterranea is a master regenerator with a large adult stem cell compartment. The lack of transgenic labeling techniques in this animal has hindered the study of lineage progression and has made understanding the mechanisms of tissue regeneration a challenge. However, recent advances in single-cell transcriptomics and analysis methods allow for the discovery of novel cell lineages as differentiation progresses from stem cell to terminally differentiated cell. Here we apply pseudotime analysis and single-cell transcriptomics to identify adult stem cells belonging to specific cellular lineages and identify novel candidate genes for future in vivo lineage studies. We purify 168 single stem and progeny cells from the planarian head, which were subjected to single-cell RNA sequencing (scRNAseq). Pseudotime analysis with Waterfall and gene set enrichment analysis predicts a molecularly distinct neoblast sub-population with neural character (νNeoblasts) as well as a novel alternative lineage. Using the predicted νNeoblast markers, we demonstrate that a novel proliferative stem cell population exists adjacent to the brain. scRNAseq coupled with in silico lineage analysis offers a new approach for studying lineage progression in planarians. The lineages identified here are extracted from a highly heterogeneous dataset with minimal prior knowledge of planarian lineages, demonstrating that lineage purification by transgenic labeling is not a prerequisite for this approach. The identification of the νNeoblast lineage demonstrates the usefulness of the planarian system for computationally predicting cellular lineages in an adult context coupled with in vivo verification.
Tripathi, Kumar Parijat; Evangelista, Daniela; Zuccaro, Antonio; Guarracino, Mario Rosario
2015-01-01
RNA-seq is a new tool to measure RNA transcript counts, using high-throughput sequencing at an extraordinary accuracy. It provides quantitative means to explore the transcriptome of an organism of interest. However, interpreting this extremely large data into biological knowledge is a problem, and biologist-friendly tools are lacking. In our lab, we developed Transcriptator, a web application based on a computational Python pipeline with a user-friendly Java interface. This pipeline uses the web services available for BLAST (Basic Local Alignment Search Tool), QuickGO and DAVID (Database for Annotation, Visualization and Integrated Discovery) tools. It offers a report on statistical analysis of functional and Gene Ontology (GO) annotation enrichment. It helps users to identify enriched biological themes, particularly GO terms, pathways, domains, gene/protein features and protein-protein interaction related information. It clusters the transcripts based on functional annotations and generates a tabular report for functional and gene ontology annotations for each transcript submitted to the web server. The implementation of QuickGO web services in our pipeline enables users to carry out GO-Slim analysis, whereas the integration of PORTRAIT (Prediction of transcriptomic non coding RNA (ncRNA) by ab initio methods) helps to identify the non-coding RNAs and their regulatory role in the transcriptome. In summary, Transcriptator is a useful software tool for both NGS and array data. It helps users to characterize the de novo assembled reads obtained from NGS experiments for non-referenced organisms, while it also performs functional enrichment analysis of differentially expressed transcripts/genes for both RNA-seq and microarray experiments. It generates easy-to-read tables and interactive charts for better understanding of the data. The pipeline is modular in nature, and provides an opportunity to add new plugins in the future.
Web application is freely available at: http://www-labgtp.na.icar.cnr.it/Transcriptator.
Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H
2014-11-19
Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository ( http://bitbucket.org/osiris_phylogenetics/pia/ ) and we demonstrate PIA on a publicly-accessible web server ( http://galaxy-dev.cnsi.ucsb.edu/pia/ ). 
Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.
Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud
Griffith, Malachi; Walker, Jason R.; Spies, Nicholas C.; Ainscough, Benjamin J.; Griffith, Obi L.
2015-01-01
Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki. PMID:26248053
Bizama, Carolina; Benavente, Felipe; Salvatierra, Edgardo; Gutiérrez-Moraga, Ana; Espinoza, Jaime A; Fernández, Elmer A; Roa, Iván; Mazzolini, Guillermo; Sagredo, Eduardo A; Gidekel, Manuel; Podhajcer, Osvaldo L
2014-02-15
Studies on the low-abundance transcriptome are of paramount importance for identifying the intimate mechanisms of tumor progression that can lead to novel therapies. The aim of the present study was to identify novel markers and targetable genes and pathways in advanced human gastric cancer through analyses of the low-abundance transcriptome. The procedure involved an initial subtractive hybridization step, followed by global gene expression analysis using microarrays. We observed profound differences, both at the single gene and gene ontology levels, between the low-abundance transcriptome and the whole transcriptome. Analysis of the low-abundance transcriptome led to the identification and validation by tissue microarrays of novel biomarkers, such as LAMA3 and TTN; moreover, we identified cancer type-specific intracellular pathways and targetable genes, such as IRS2, IL17, IFNγ, VEGF-C, WISP1, FZD5 and CTBP1 that were not detectable by whole transcriptome analyses. We also demonstrated that knocking down the expression of CTBP1 sensitized gastric cancer cells to mainstay chemotherapeutic drugs. We conclude that the analysis of the low-abundance transcriptome provides useful insights into the molecular basis and treatment of cancer. © 2013 UICC.
Dhanyalakshmi, K H; Naika, Mahantesha B N; Sajeevan, R S; Mathew, Oommen K; Shafi, K Mohamed; Sowdhamini, Ramanathan; N Nataraja, Karaba
2016-01-01
Modern sequencing technologies are generating large volumes of information at the transcriptome and genome level. Translation of this information into biological meaning lags far behind, so a significant portion of the proteins discovered remain proteins of unknown function (PUFs). Attempts to uncover the functional significance of PUFs are limited by the lack of easy, high-throughput functional annotation tools. Here, we report an approach to assign putative functions to PUFs identified in the transcriptome of mulberry, a perennial tree commonly cultivated as a host of the silkworm. We utilized the mulberry PUFs generated from leaf tissues exposed to drought stress at the whole-plant level. A sequence- and structure-based computational analysis predicted the probable function of the PUFs. For rapid and easy annotation of PUFs, we developed an automated pipeline by integrating diverse bioinformatics tools, designated as PUFs Annotation Server (PUFAS), which also provides a web service API (Application Programming Interface) for large-scale analysis up to a genome. The expression analysis of three selected PUFs annotated by the pipeline revealed abiotic stress responsiveness of the genes, and hence their potential role in stress acclimation pathways. The automated pipeline developed here could be extended to assign functions to PUFs from any organism in general. PUFAS web server is available at http://caps.ncbs.res.in/pufas/ and the web service is accessible at http://capservices.ncbs.res.in/help/pufas.
Toker, Lilah; Rocco, Brad; Sibille, Etienne
2017-01-01
Establishing the molecular diversity of cell types is crucial for the study of the nervous system. We compiled a cross-laboratory database of mouse brain cell type-specific transcriptomes from 36 major cell types from across the mammalian brain using rigorously curated published data from pooled cell type microarray and single-cell RNA-sequencing (RNA-seq) studies. We used these data to identify cell type-specific marker genes, discovering a substantial number of novel markers, many of which we validated using computational and experimental approaches. We further demonstrate that summarized expression of marker gene sets (MGSs) in bulk tissue data can be used to estimate the relative cell type abundance across samples. To facilitate use of this expanding resource, we provide a user-friendly web interface at www.neuroexpresso.org. PMID:29204516
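The idea of summarizing marker gene set (MGS) expression into a per-sample score can be sketched as below. The summary used here, the mean of per-gene z-scores, is an assumption chosen for simplicity and is not claimed to be the authors' method; the function name `mgs_profile` and the gene labels are hypothetical.

```python
from statistics import mean, pstdev


def mgs_profile(expression, markers):
    """Summarize marker gene expression into one relative score per sample.

    expression: dict mapping gene -> list of expression values across samples
    markers:    list of marker genes for one cell type
    Returns a list with one score per sample (mean z-score over markers).
    """
    z_scores = []
    for gene in markers:
        values = expression[gene]
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a marker that never varies carries no information
        z_scores.append([(v - mu) / sd for v in values])

    if not z_scores:
        raise ValueError("no variable marker genes available")

    n_samples = len(z_scores[0])
    return [mean(z[i] for z in z_scores) for i in range(n_samples)]


if __name__ == "__main__":
    bulk = {"m1": [1, 2, 3], "m2": [2, 4, 6], "flat": [5, 5, 5]}
    print(mgs_profile(bulk, ["m1", "m2", "flat"]))
```

The scores are only meaningful relative to each other across samples; they do not give absolute cell counts or proportions.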
Transcriptome assembly and digital gene expression atlas of the rainbow trout
USDA-ARS's Scientific Manuscript database
Background: Transcriptome analysis is a preferred method for gene discovery, marker development and gene expression profiling in non-model organisms. Previously, we sequenced a transcriptome reference using Sanger-based and 454-pyrosequencing, however, a transcriptome assembly is still incomplete an...
OperomeDB: A Database of Condition-Specific Transcription Units in Prokaryotic Genomes.
Chetal, Kashish; Janga, Sarath Chandra
2015-01-01
Background. In prokaryotic organisms, a substantial fraction of adjacent genes are organized into operons, that is, codirectionally organized genes that share a common promoter and terminator. Although several available operon databases provide information with varying levels of reliability, very few resources provide experimentally supported results. Therefore, we believe that the biological community could benefit from having a new operon prediction database with operons predicted using next-generation RNA-seq datasets. Description. We present operomeDB, a database which provides an ensemble of all the predicted operons for bacterial genomes using available RNA-sequencing datasets across a wide range of experimental conditions. Although several studies have recently confirmed that prokaryotic operon structure is dynamic with significant alterations across environmental and experimental conditions, there are no comprehensive databases for studying such variations across prokaryotic transcriptomes. Currently our database contains nine bacterial organisms and 168 transcriptomes for which we predicted operons. The user interface is simple and easy to use in terms of visualization, downloading, and querying of data. In addition, because of its ability to load custom datasets, users can also compare their datasets with publicly available transcriptomic data of an organism. Conclusion. OperomeDB as a database should not only aid experimental groups working on transcriptome analysis of specific organisms but also enable studies related to computational and comparative operomics.
Preliminary profiling of blood transcriptome in a rat model of hemorrhagic shock.
Braga, D; Barcella, M; D'Avila, F; Lupoli, S; Tagliaferri, F; Santamaria, M H; DeLano, F A; Baselli, G; Schmid-Schönbein, G W; Kistler, E B; Aletti, F; Barlassina, C
2017-08-01
Hemorrhagic shock is a leading cause of morbidity and mortality worldwide. Significant blood loss may lead to decreased blood pressure and inadequate tissue perfusion with resultant organ failure and death, even after replacement of lost blood volume. One reason for this high acuity is that the fundamental mechanisms of shock are poorly understood. Proteomic and metabolomic approaches have been used to investigate the molecular events occurring in hemorrhagic shock but, to our knowledge, a systematic analysis of the transcriptomic profile is missing. Therefore, a pilot analysis using paired-end RNA sequencing was used to identify changes that occur in the blood transcriptome of rats subjected to hemorrhagic shock after blood reinfusion. Hemorrhagic shock was induced using a Wigger's shock model. The transcriptome of whole blood from shocked animals shows modulation of genes related to inflammation and immune response (Tlr13, Il1b, Ccl6, Lgals3), antioxidant functions (Mt2A, Mt1), tissue injury and repair pathways (Gpnmb, Trim72) and lipid mediators (Alox5ap, Ltb4r, Ptger2) compared with control animals. These findings are congruent with results obtained in hemorrhagic shock analysis by other authors using metabolomics and proteomics. The analysis of blood transcriptome may be a valuable tool to understand the biological changes occurring in hemorrhagic shock and a promising approach for the identification of novel biomarkers and therapeutic targets. Impact statement This study provides the first pilot analysis of the changes occurring in transcriptome expression of whole blood in hemorrhagic shock (HS) rats. We showed that the analysis of blood transcriptome is a useful approach to investigate pathways and functional alterations in this disease condition. This pilot study encourages the possible application of transcriptome analysis in the clinical setting, for the molecular profiling of whole blood in HS patients.
Wenger, Yvan; Galliot, Brigitte
2013-03-25
Background Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. Results To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48’909 unique sequences including splice variants, representing approximately 24’450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10’597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11’270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. Conclusions We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871
A practical data processing workflow for multi-OMICS projects.
Kohl, Michael; Megger, Dominik A; Trippler, Martin; Meckel, Hagen; Ahrens, Maike; Bracht, Thilo; Weber, Frank; Hoffmann, Andreas-Claudius; Baba, Hideo A; Sitek, Barbara; Schlaak, Jörg F; Meyer, Helmut E; Stephan, Christian; Eisenacher, Martin
2014-01-01
Multi-OMICS approaches aim at the integration of quantitative data obtained for different biological molecules in order to understand their interrelation and the functioning of larger systems. This paper deals with several data integration and data processing issues that frequently occur within this context. To this end, the data processing workflow within the PROFILE project is presented, a multi-OMICS project that aims at the identification of novel biomarkers and the development of new therapeutic targets for seven important liver diseases. Furthermore, a software tool called CrossPlatformCommander is sketched, which facilitates several steps of the proposed workflow in a semi-automatic manner. Application of the software is presented for the detection of novel biomarkers, their ranking and annotation with existing knowledge, using the example of corresponding Transcriptomics and Proteomics data sets obtained from patients suffering from hepatocellular carcinoma. Additionally, a linear regression analysis of Transcriptomics vs. Proteomics data is presented and its performance assessed. It was shown that, for capturing profound relations between Transcriptomics and Proteomics data, a simple linear regression analysis is not sufficient, and that implementation and evaluation of alternative statistical approaches are needed. Additionally, the integration of multivariate variable selection and classification approaches is intended for further development of the software. Although this paper focuses only on the combination of data obtained from quantitative Proteomics and Transcriptomics experiments, several approaches and data integration steps are also applicable for other OMICS technologies. Keeping specific restrictions in mind, the suggested workflow (or at least parts of it) may be used as a template for similar projects that make use of different high throughput techniques.
This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan. Copyright © 2013 Elsevier B.V. All rights reserved.
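The simple linear regression of Transcriptomics vs. Proteomics data mentioned in the abstract above, and the way its adequacy can be judged, may be sketched as follows. This is a generic ordinary least squares fit with a coefficient of determination, written under the assumption that paired per-gene values are available; the function name `fit_line` and the example values are hypothetical and are not taken from CrossPlatformCommander.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x.

    x, y: equally long sequences of paired measurements, for example
          transcript-level and protein-level values for the same genes
    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 compares residual variation with total variation in y.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


if __name__ == "__main__":
    print(fit_line([1, 2, 3, 4], [2, 4, 6, 8]))        # perfectly linear pair
    print(fit_line([0, 1, 2, 3, 4], [0, 1, 0, 1, 0]))  # no linear trend
```

A low R² on real paired data is one simple signal that a straight-line model does not capture the relation between the two data types, which matches the abstract's conclusion that alternative statistical approaches are needed.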
SPARTA: Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis.
Johnson, Benjamin K; Scholz, Matthew B; Teal, Tracy K; Abramovitch, Robert B
2016-02-04
Many tools exist in the analysis of bacterial RNA sequencing (RNA-seq) transcriptional profiling experiments to identify differentially expressed genes between experimental conditions. Generally, the workflow includes quality control of reads, mapping to a reference, counting transcript abundance, and statistical tests for differentially expressed genes. In spite of the numerous tools developed for each component of an RNA-seq analysis workflow, easy-to-use bacterially oriented workflow applications to combine multiple tools and automate the process are lacking. With many tools to choose from for each step, the task of identifying a specific tool, adapting the input/output options to the specific use-case, and integrating the tools into a coherent analysis pipeline is not a trivial endeavor, particularly for microbiologists with limited bioinformatics experience. To make bacterial RNA-seq data analysis more accessible, we developed a Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis (SPARTA). SPARTA is a reference-based bacterial RNA-seq analysis workflow application for single-end Illumina reads. SPARTA is turnkey software that simplifies the process of analyzing RNA-seq data sets, making bacterial RNA-seq analysis a routine process that can be undertaken on a personal computer or in the classroom. The easy-to-install, complete workflow processes whole transcriptome shotgun sequencing data files by trimming reads and removing adapters, mapping reads to a reference, counting gene features, calculating differential gene expression, and, importantly, checking for potential batch effects within the data set. SPARTA outputs quality analysis reports, gene feature counts and differential gene expression tables and scatterplots. SPARTA provides an easy-to-use bacterial RNA-seq transcriptional profiling workflow to identify differentially expressed genes between experimental conditions. 
This software will enable microbiologists with limited bioinformatics experience to analyze their data and integrate next generation sequencing (NGS) technologies into the classroom. The SPARTA software and tutorial are available at sparta.readthedocs.org.
Interpreter of maladies: redescription mining applied to biomedical data analysis.
Waltman, Peter; Pearlman, Alex; Mishra, Bud
2006-04-01
Comprehensive, systematic and integrated data-centric statistical approaches to disease modeling can provide powerful frameworks for understanding disease etiology. Here, one such computational framework based on redescription mining in both its incarnations, static and dynamic, is discussed. The static framework provides bioinformatic tools applicable to multifaceted datasets, containing genetic, transcriptomic, proteomic, and clinical data for diseased patients and normal subjects. The dynamic redescription framework provides systems biology tools to model complex sets of regulatory, metabolic and signaling pathways in the initiation and progression of a disease. As an example, the case of chronic fatigue syndrome (CFS) is considered, which has so far remained intractable and unpredictable in its etiology and nosology. The redescription mining approaches can be applied to the Centers for Disease Control and Prevention's Wichita (KS, USA) dataset, integrating transcriptomic, epidemiological and clinical data, and can also be used to study how pathways in the hypothalamic-pituitary-adrenal axis affect CFS patients.
Multivariate inference of pathway activity in host immunity and response to therapeutics
Goel, Gautam; Conway, Kara L.; Jaeger, Martin; Netea, Mihai G.; Xavier, Ramnik J.
2014-01-01
Developing a quantitative view of how biological pathways are regulated in response to environmental factors is central for understanding of disease phenotypes. We present a computational framework, named Multivariate Inference of Pathway Activity (MIPA), which quantifies degree of activity induced in a biological pathway by computing five distinct measures from transcriptomic profiles of its member genes. Statistical significance of inferred activity is examined using multiple independent self-contained tests followed by a competitive analysis. The method incorporates a new algorithm to identify a subset of genes that may regulate the extent of activity induced in a pathway. We present an in-depth evaluation of specificity, robustness, and reproducibility of our method. We benchmarked MIPA's false positive rate at less than 1%. Using transcriptomic profiles representing distinct physiological and disease states, we illustrate applicability of our method in (i) identifying gene–gene interactions in autophagy-dependent response to Salmonella infection, (ii) uncovering gene–environment interactions in host response to bacterial and viral pathogens and (iii) identifying driver genes and processes that contribute to wound healing and response to anti-TNFα therapy. We provide relevant experimental validation that corroborates the accuracy and advantage of our method. PMID:25147207
Preliminary profiling of blood transcriptome in a rat model of hemorrhagic shock
Braga, D; Barcella, M; D’Avila, F; Lupoli, S; Tagliaferri, F; Santamaria, MH; DeLano, FA; Baselli, G; Schmid-Schönbein, GW; Kistler, EB; Aletti, F
2017-01-01
Hemorrhagic shock is a leading cause of morbidity and mortality worldwide. Significant blood loss may lead to decreased blood pressure and inadequate tissue perfusion with resultant organ failure and death, even after replacement of lost blood volume. One reason for this high acuity is that the fundamental mechanisms of shock are poorly understood. Proteomic and metabolomic approaches have been used to investigate the molecular events occurring in hemorrhagic shock but, to our knowledge, a systematic analysis of the transcriptomic profile is missing. Therefore, a pilot analysis using paired-end RNA sequencing was used to identify changes that occur in the blood transcriptome of rats subjected to hemorrhagic shock after blood reinfusion. Hemorrhagic shock was induced using a Wigger’s shock model. The transcriptome of whole blood from shocked animals shows modulation of genes related to inflammation and immune response (Tlr13, Il1b, Ccl6, Lgals3), antioxidant functions (Mt2A, Mt1), tissue injury and repair pathways (Gpnmb, Trim72) and lipid mediators (Alox5ap, Ltb4r, Ptger2) compared with control animals. These findings are congruent with results obtained in hemorrhagic shock analysis by other authors using metabolomics and proteomics. The analysis of blood transcriptome may be a valuable tool to understand the biological changes occurring in hemorrhagic shock and a promising approach for the identification of novel biomarkers and therapeutic targets. Impact statement This study provides the first pilot analysis of the changes occurring in transcriptome expression of whole blood in hemorrhagic shock (HS) rats. We showed that the analysis of blood transcriptome is a useful approach to investigate pathways and functional alterations in this disease condition. This pilot study encourages the possible application of transcriptome analysis in the clinical setting, for the molecular profiling of whole blood in HS patients. PMID:28661205
Li, Qike; Schissler, A Grant; Gardeux, Vincent; Achour, Ikbel; Kenost, Colleen; Berghout, Joanne; Li, Haiquan; Zhang, Hao Helen; Lussier, Yves A
2017-05-24
Transcriptome analytic tools are commonly used across patient cohorts to develop drugs and predict clinical outcomes. However, as precision medicine pursues more accurate and individualized treatment decisions, these methods are not designed to address single-patient transcriptome analyses. We previously developed and validated the N-of-1-pathways framework using two methods, Wilcoxon and Mahalanobis Distance (MD), for personal transcriptome analysis derived from a pair of samples of a single patient. Although both methods uncover concordantly dysregulated pathways, they are not designed to detect dysregulated pathways with up- and down-regulated genes (bidirectional dysregulation) that are ubiquitous in biological systems. We developed N-of-1-pathways MixEnrich, a mixture model followed by a gene set enrichment test, to uncover bidirectional and concordantly dysregulated pathways one patient at a time. We assess its accuracy in a comprehensive simulation study and in an RNA-Seq data analysis of head and neck squamous cell carcinomas (HNSCCs). In the presence of bidirectionally dysregulated genes in the pathway or of high background noise, MixEnrich substantially outperforms previous single-subject transcriptome analysis methods, both in the simulation study and the HNSCCs data analysis (ROC Curves; higher true positive rates; lower false positive rates). Bidirectional and concordant dysregulated pathways uncovered by MixEnrich in each patient largely overlapped with the quasi-gold standard compared to other single-subject and cohort-based transcriptome analyses. The greater performance of MixEnrich presents an advantage over previous methods to meet the promise of providing accurate personal transcriptome analysis to support precision medicine at point of care.
RNA-Skim: a rapid method for RNA-Seq quantification at transcript level
Zhang, Zhaojun; Wang, Wei
2014-01-01
Motivation: RNA-Seq technique has been demonstrated as a revolutionary means for exploring transcriptome because it provides deep coverage and base pair-level resolution. RNA-Seq quantification is proven to be an efficient alternative to Microarray technique in gene expression study, and it is a critical component in RNA-Seq differential expression analysis. Most existing RNA-Seq quantification tools require the alignments of fragments to either a genome or a transcriptome, entailing a time-consuming and intricate alignment step. To improve the performance of RNA-Seq quantification, an alignment-free method, Sailfish, has been recently proposed to quantify transcript abundances using all k-mers in the transcriptome, demonstrating the feasibility of designing an efficient alignment-free method for transcriptome quantification. Even though Sailfish is substantially faster than alternative alignment-dependent methods such as Cufflinks, using all k-mers in the transcriptome quantification impedes the scalability of the method. Results: We propose a novel RNA-Seq quantification method, RNA-Skim, which partitions the transcriptome into disjoint transcript clusters based on sequence similarity, and introduces the notion of sig-mers, which are a special type of k-mers uniquely associated with each cluster. We demonstrate that the sig-mer counts within a cluster are sufficient for estimating transcript abundances with accuracy comparable with any state-of-the-art method. This enables RNA-Skim to perform transcript quantification on each cluster independently, reducing a complex optimization problem into smaller optimization tasks that can be run in parallel. As a result, RNA-Skim uses <4% of the k-mers and <10% of the CPU time required by Sailfish. 
It is able to finish transcriptome quantification in <10 min per sample by using just a single thread on a commodity computer, which represents a >100-fold speedup over the state-of-the-art alignment-based methods, while delivering comparable or higher accuracy. Availability and implementation: The software is available at http://www.csbio.unc.edu/rs. Contact: weiwang@cs.ucla.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24931995
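The notion of sig-mers in the RNA-Skim abstract can be made concrete with a minimal sketch. It only illustrates the definition (k-mers that occur in exactly one transcript cluster) and a naive counting pass; it is not the RNA-Skim software, which additionally selects a subset of sig-mers and estimates abundances per cluster. The input layout and function names are assumptions, and reverse complements are ignored for brevity.

```python
from collections import defaultdict

def kmers(seq, k):
    """Yield every k-mer of seq (forward strand only)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def find_sig_mers(clusters, k):
    """Return {cluster_id: set of k-mers present in that cluster only}.

    clusters: {cluster_id: [transcript sequences]} (hypothetical layout).
    """
    owners = defaultdict(set)  # k-mer -> ids of clusters that contain it
    for cid, transcripts in clusters.items():
        for seq in transcripts:
            for km in kmers(seq, k):
                owners[km].add(cid)
    sig = {cid: set() for cid in clusters}
    for km, cids in owners.items():
        if len(cids) == 1:  # unique to one cluster, so it is a sig-mer
            sig[next(iter(cids))].add(km)
    return sig

def count_sig_mers(reads, sig, k):
    """Count, per cluster, the read k-mers that are sig-mers of that cluster."""
    lookup = {km: cid for cid, kms in sig.items() for km in kms}
    counts = {cid: 0 for cid in sig}
    for read in reads:
        for km in kmers(read, k):
            cid = lookup.get(km)
            if cid is not None:
                counts[cid] += 1
    return counts

# Toy transcriptome: the 4-mer ACGT occurs in both clusters, so it is not a sig-mer.
clusters = {"c1": ["ACGTAC"], "c2": ["TTACGT"]}
sig = find_sig_mers(clusters, k=4)
counts = count_sig_mers(["ACGTAC", "TTACGT"], sig, k=4)
```

Since every sig-mer maps to exactly one cluster, the counts for different clusters are independent of each other, which is what allows the per-cluster quantification problems to run in parallel.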
Philipp, E E R; Kraemer, L; Mountfort, D; Schilhabel, M; Schreiber, S; Rosenstiel, P
2012-03-15
Next generation sequencing (NGS) technologies allow a rapid and cost-effective compilation of large RNA sequence datasets in model and non-model organisms. However, the storage and analysis of transcriptome information from different NGS platforms is still a significant bottleneck, leading to a delay in data dissemination and subsequent biological understanding. Especially database interfaces with transcriptome analysis modules going beyond mere read counts are missing. Here, we present the Transcriptome Analysis and Comparison Explorer (T-ACE), a tool designed for the organization and analysis of large sequence datasets, and especially suited for transcriptome projects of non-model organisms with little or no a priori sequence information. T-ACE offers a TCL-based interface, which accesses a PostgreSQL database via a php-script. Within T-ACE, information belonging to single sequences or contigs, such as annotation or read coverage, is linked to the respective sequence and immediately accessible. Sequences and assigned information can be searched via keyword- or BLAST-search. Additionally, T-ACE provides within and between transcriptome analysis modules on the level of expression, GO terms, KEGG pathways and protein domains. Results are visualized and can be easily exported for external analysis. We developed T-ACE for laboratory environments, which have only a limited amount of bioinformatics support, and for collaborative projects in which different partners work on the same dataset from different locations or platforms (Windows/Linux/MacOS). For laboratories with some experience in bioinformatics and programming, the low complexity of the database structure and open-source code provides a framework that can be customized according to the different needs of the user and transcriptome project.
Gazara, Rajesh K; Cardoso, Christiane; Bellieny-Rabelo, Daniel; Ferreira, Clélia; Terra, Walter R; Venancio, Thiago M
2017-09-05
Despite the great morphological diversity of insects, there is a regularity in their digestive functions, which is apparently related to their physiology. In the present work we report the de novo midgut transcriptomes of four non-model insects from four distinct orders: Spodoptera frugiperda (Lepidoptera), Musca domestica (Diptera), Tenebrio molitor (Coleoptera) and Dysdercus peruvianus (Hemiptera). We employed a computational strategy to merge assemblies obtained with two different algorithms, which substantially increased the quality of the final transcriptomes. Unigenes were annotated and analyzed using the eggNOG database, which allowed us to assign some level of functional and evolutionary information to 79.7% to 93.1% of the transcriptomes. We found interesting transcriptional patterns, such as: i) the intense use of lysozymes in digestive functions of M. domestica larvae, which are streamlined and adapted to feed on bacteria; ii) the up-regulation of orthologous UDP-glycosyl transferase and cytochrome P450 genes in the whole midguts of different species, supporting the existence of an ancient defense frontline to counter xenobiotics; iii) evidence supporting roles for juvenile hormone binding proteins in midgut physiology, probably as a way to activate genes that help fight anti-nutritional substances (e.g. protease inhibitors). The results presented here shed light on the digestive and structural properties of the digestive systems of these distantly related species. Furthermore, the produced datasets will also be useful for scientists studying these insects. Copyright © 2017. Published by Elsevier B.V.
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Fei; Maslov, Sergei; Yoo, Shinjae; ...
2016-05-25
Here, transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metadata or differences in annotation styles by different labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited to NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied global expression landscape of the integrated dataset and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues as the former samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified accuracy of our predictions with samples’ metadata provided by authors.
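The abstract above does not say which "simple computational classification method" was used. As one plausible illustration only, the sketch below predicts tissue type with a nearest-centroid rule based on Pearson correlation; the choice of method, the function names, and the toy expression values are all assumptions rather than details from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def centroids(samples):
    """Average the expression vectors of each tissue.

    samples: list of (tissue_label, expression_vector) pairs.
    """
    groups = {}
    for label, vec in samples:
        groups.setdefault(label, []).append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }

def predict_tissue(vec, cents):
    """Return the tissue whose centroid correlates best with vec."""
    return max(cents, key=lambda label: pearson(vec, cents[label]))

# Toy curated training set: four genes, two tissues (values are made up).
training = [
    ("root", [9.0, 1.0, 2.0, 8.0]),
    ("root", [8.0, 2.0, 1.0, 9.0]),
    ("leaf", [1.0, 9.0, 8.0, 2.0]),
    ("leaf", [2.0, 8.0, 9.0, 1.0]),
]
cents = centroids(training)
predicted = predict_tissue([7.5, 2.5, 1.5, 8.5], cents)
```

A rule of this kind works only to the extent that samples of the same tissue cluster together regardless of growth condition, which is the observation the abstract reports.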
Integrated Analysis of Transcriptomic and Proteomic Data
Haider, Saad; Pal, Ranadip
2013-01-01
Until recently, understanding the regulatory behavior of cells has been pursued through independent analysis of the transcriptome or the proteome. Based on the central dogma, it was generally assumed that there exists a direct correspondence between mRNA transcripts and generated protein expressions. However, recent studies have shown that the correlation between mRNA and protein expressions can be low due to various factors such as different half-lives and post-transcriptional machinery. Thus, a joint analysis of the transcriptomic and proteomic data can provide useful insights that may not be deciphered from individual analysis of mRNA or protein expressions. This article reviews the existing major approaches for joint analysis of transcriptomic and proteomic data. We categorize the different approaches into eight main categories based on the initial algorithm and final analysis goal. We further present analogies with other domains and discuss the existing research problems in this area. PMID:24082820
Salehi, Abdolreza; Rivera, Rocío Melissa
2018-01-01
RNA editing increases the diversity of the transcriptome and proteome. Adenosine-to-inosine (A-to-I) editing is the predominant type of RNA editing in mammals and it is catalyzed by the adenosine deaminases acting on RNA (ADARs) family. Here, we used a large-scale computational analysis of transcriptomic data from brain, heart, colon, lung, spleen, kidney, testes, skeletal muscle and liver from three adult animals in order to identify RNA editing sites in bovine. We developed a computational pipeline and used a rigorous strategy to identify novel editing sites from RNA-Seq data in the absence of corresponding DNA sequence information. Our methods take into account sequencing errors, mapping bias, as well as biological replication to reduce the probability of obtaining a false-positive result. We conducted a detailed characterization of sequence and structural features related to novel candidate sites and found 1,600 novel canonical A-to-I editing sites in the nine bovine tissues analyzed. Results show that these sites 1) occur frequently in clusters and short interspersed nuclear elements (SINE) repeats, 2) have a preference for guanine depletion/enrichment at the flanking 5′/3′ nucleotides, 3) occur less often in coding sequences than other regions of the genome, and 4) have low evolutionary conservation. Further, we found that a positive correlation exists between expression of ADAR family members and tissue-specific RNA editing. Most of the genes with predicted A-to-I editing in each tissue were significantly enriched in biological terms relevant to the function of the corresponding tissue. Lastly, the results highlight the importance of the RNA editome in nervous system regulation. The present study extends the list of RNA editing sites in bovine and provides pipelines that may be used to investigate the editome in other organisms. PMID:29470549
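To make the filtering idea in the abstract above more tangible, here is a rough sketch of how candidate A-to-I sites might be screened from RNA-Seq pileups without matched DNA. It is not the authors' pipeline: the thresholds, the pileup layout, and the function name are hypothetical, only forward-strand A-to-G mismatches are considered (inosine is read as G), and mapping-bias filters are omitted.

```python
def candidate_sites(pileups, min_depth=10, min_fraction=0.1, min_animals=2):
    """Return sorted sites with A-to-G support replicated across animals.

    pileups: {animal: {(chrom, pos): {"ref": base, "counts": {base: n}}}}
    (hypothetical layout). A site is kept for one animal when the reference
    base is A, read depth is at least min_depth, and the G fraction is at
    least min_fraction; it is reported when at least min_animals agree.
    """
    support = {}
    for animal, sites in pileups.items():
        for site, info in sites.items():
            if info["ref"] != "A":
                continue  # only canonical A-to-G mismatches are considered
            depth = sum(info["counts"].values())
            g_reads = info["counts"].get("G", 0)
            # The depth and fraction cut-offs guard against sequencing errors.
            if depth >= min_depth and g_reads / depth >= min_fraction:
                support.setdefault(site, set()).add(animal)
    # Requiring several animals uses biological replication as a filter.
    return sorted(s for s, animals in support.items() if len(animals) >= min_animals)

pileups = {
    "animal1": {
        ("chr1", 100): {"ref": "A", "counts": {"A": 15, "G": 5}},
        ("chr1", 200): {"ref": "A", "counts": {"A": 3, "G": 2}},
        ("chr2", 50): {"ref": "C", "counts": {"C": 10, "T": 10}},
    },
    "animal2": {
        ("chr1", 100): {"ref": "A", "counts": {"A": 8, "G": 4}},
        ("chr1", 200): {"ref": "A", "counts": {"A": 20, "G": 10}},
    },
}
sites = candidate_sites(pileups)
```

In this toy input only chr1:100 passes, because chr1:200 has enough depth in just one animal and chr2:50 is not an A reference position.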
Arkas: Rapid reproducible RNAseq analysis
Colombo, Anthony R.; J. Triche Jr, Timothy; Ramsingh, Giridharan
2017-01-01
The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. We offer the cloud-scale RNAseq pipelines Arkas-Quantification and Arkas-Analysis, available within Illumina’s BaseSpace cloud application platform, which expedite Kallisto preparatory routines, reliably calculate differential expression, and perform gene-set enrichment of REACTOME pathways. Due to inherent inefficiencies of scale, Illumina's BaseSpace computing platform offers a massively parallel distributive environment improving data management services and data importing. Arkas-Quantification deploys Kallisto for parallel cloud computations and is conveniently integrated downstream from the BaseSpace Sequence Read Archive (SRA) import/conversion application titled SRA Import. Arkas-Analysis annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, and performs differential expression and gene-set enrichment analysis on both coding genes and transcripts. The Arkas cloud pipeline supports ENSEMBL transcriptomes and can be used downstream from the SRA Import, facilitating raw sequencing importing, SRA FASTQ conversion, RNA quantification and analysis steps. PMID:28868134
Illuminator, a desktop program for mutation detection using short-read clonal sequencing.
Carr, Ian M; Morgan, Joanne E; Diggle, Christine P; Sheridan, Eamonn; Markham, Alexander F; Logan, Clare V; Inglehearn, Chris F; Taylor, Graham R; Bonthron, David T
2011-10-01
Current methods for sequencing clonal populations of DNA molecules yield several gigabases of data per day, typically comprising reads of < 100 nt. Such datasets permit widespread genome resequencing and transcriptome analysis or other quantitative tasks. However, this huge capacity can also be harnessed for the resequencing of smaller (gene-sized) target regions, through the simultaneous parallel analysis of multiple subjects, using sample "tagging" or "indexing". These methods promise to have a huge impact on diagnostic mutation analysis and candidate gene testing. Here we describe a software package developed for such studies, offering the ability to resolve pooled samples carrying barcode tags and to align reads to a reference sequence using a mutation-tolerant process. The program, Illuminator, can identify rare sequence variants, including insertions and deletions, and permits interactive data analysis on standard desktop computers. It facilitates the effective analysis of targeted clonal sequencer data without dedicated computational infrastructure or specialized training. Copyright © 2011 Elsevier Inc. All rights reserved.
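Resolving pooled samples by their barcode tags, as mentioned in the abstract above, can be sketched in a few lines. This is a generic illustration and not Illuminator's algorithm: it assumes every tag has the same length and sits at the start of the read, and the one-mismatch tolerance, sample names, and function names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def demultiplex(reads, barcodes, max_mismatches=1):
    """Assign each read to a sample by its leading barcode tag.

    barcodes: {sample_name: tag}, all tags of equal length (assumption).
    Returns {sample_name: [reads with the tag removed]} plus an
    "unassigned" bin for reads matching no tag or more than one tag.
    """
    tag_len = len(next(iter(barcodes.values())))
    bins = {sample: [] for sample in barcodes}
    bins["unassigned"] = []
    for read in reads:
        prefix = read[:tag_len]
        hits = [
            sample for sample, tag in barcodes.items()
            if len(prefix) == tag_len and hamming(prefix, tag) <= max_mismatches
        ]
        if len(hits) == 1:
            bins[hits[0]].append(read[tag_len:])  # strip the tag before alignment
        else:
            bins["unassigned"].append(read)
    return bins

barcodes = {"patient_A": "ACGT", "patient_B": "TGCA"}
reads = ["ACGTGGGCCC", "TGCATTTAAA", "ACGAGGGCCC", "GGGGCCCCAA"]
bins = demultiplex(reads, barcodes)
```

Tolerating a single mismatch rescues reads with a sequencing error in the tag, provided the tags themselves differ at enough positions that no read can match two samples.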
Analysis of the Citrullus colocynthis Transcriptome during Water Deficit Stress
Wang, Zhuoyu; Hu, Hongtao; Goertzen, Leslie R.; McElroy, J. Scott; Dane, Fenny
2014-01-01
Citrullus colocynthis is a highly drought-tolerant species, closely related to watermelon (C. lanatus var. lanatus), an economically important cucurbit crop. Drought is a threat to plant growth and development, and the discovery of drought-inducible genes with various functions is of great importance. We used high-throughput mRNA Illumina sequencing technology and bioinformatic strategies to analyze the C. colocynthis leaf transcriptome under drought treatment. Leaf samples at four different time points (0, 24, 36, or 48 hours of withholding water) were used for RNA extraction and Illumina sequencing. qRT-PCR of several drought-responsive genes was performed to confirm the accuracy of RNA sequencing. Leaf transcriptome analysis provided the first glimpse of the drought-responsive transcriptome of this unique cucurbit species. A total of 5038 full-length cDNAs were detected, with 2545 genes showing significant changes during drought stress. Principal component analysis indicated that drought was the major contributing factor regulating transcriptome changes. Upregulation of many transcription factors, stress signaling factors, detoxification genes, and genes involved in phytohormone signaling and citrulline metabolism occurred under the water deficit conditions. The C. colocynthis transcriptome data highlight the activation of a large set of drought-related genes in this species, thus providing a valuable resource for future functional analysis of candidate genes in defense against drought stress. PMID:25118696
Nam, Seungyoon
2017-04-01
Cancer transcriptome analysis is one of the leading areas of Big Data science, biomarker, and pharmaceutical discovery, not to forget personalized medicine. Yet, cancer transcriptomics and postgenomic medicine require innovation in bioinformatics as well as comparison of the performance of available algorithms. In this data analytics context, the value of network generation and algorithms has been widely underscored for addressing the salient questions in cancer pathogenesis. Analysis of the cancer transcriptome often results in complicated networks where identification of network modularity remains critical, for example, in delineating the "druggable" molecular targets. Network clustering is useful, but depends on the network topology in and of itself. Notably, the performance of different network-generating tools for network cluster (NC) identification has been little investigated to date. Hence, using gastric cancer (GC) transcriptomic datasets, we compared two algorithms for generating pathway versus gene regulatory network-based NCs, showing that the pathway-based approach better agrees with a reference set of cancer-functional contexts. Finally, by applying pathway-based NC identification to GC transcriptome datasets, we describe cancer NCs that associate with candidate therapeutic targets and biomarkers in GC. These observations collectively inform future research on cancer transcriptomics, drug discovery, and rational development of new analysis tools for optimal harnessing of omics data.
Bi, Yanqi; Pei, Guangsheng; Sun, Tao; Chen, Zixi; Chen, Lei; Zhang, Weiwen
2018-01-01
Microbial small RNAs (sRNAs) play essential roles against many stress conditions in cyanobacteria. However, little is known about their regulatory mechanisms in biofuel tolerance. In our previous sRNA analysis, a trans-encoded sRNA Nc117 was found to be involved in the tolerance to ethanol and 1-butanol in Synechocystis sp. PCC 6803. However, its functional mechanism is yet to be determined. In this study, functional characterization of sRNA Nc117 was performed. Briefly, the exact length of the trans-encoded sRNA Nc117 was determined to be 102 nucleotides using 3' RACE, and the positive regulation of Nc117 on short chain alcohols tolerance was further confirmed. Then, computational target prediction and transcriptomic analysis were integrated to explore the potential targets of Nc117. A total of 119 up-regulated and 116 down-regulated genes were identified in the nc117 overexpression strain compared with the wild type by comparative transcriptomic analysis, among which the upstream regions of five genes overlapped with those predicted by the computational target approach. Based on the phenotype analysis of gene deletion and overexpression strains under short chain alcohols stress, one gene, slr0007, encoding D-glycero-alpha-D-manno-heptose 1-phosphate guanylyltransferase, was determined as a potential target of Nc117, suggesting that the synthesis of LPS or S-layer glycoprotein may be responsible for the tolerance enhancement. As the first reported trans-encoded sRNA positively regulating biofuels tolerance in cyanobacteria, this study not only provided evidence for a new regulatory mechanism of trans-encoded sRNA in cyanobacteria, but also offers valuable information for rational construction of high-tolerant cyanobacterial chassis.
An OMIC biomarker detection algorithm TriVote and its application in methylomic biomarker detection.
Xu, Cheng; Liu, Jiamei; Yang, Weifeng; Shu, Yayun; Wei, Zhipeng; Zheng, Weiwei; Feng, Xin; Zhou, Fengfeng
2018-04-01
Transcriptomic and methylomic patterns represent two major OMIC data sources impacted by both inheritable genetic information and environmental factors, and have been widely used as disease diagnosis and prognosis biomarkers. Modern transcriptomic and methylomic profiling technologies detect the status of tens of thousands or even millions of probing residues in the human genome, and introduce a major computational challenge for the existing feature selection algorithms. This study proposes a three-step feature selection algorithm, TriVote, to detect a subset of transcriptomic or methylomic residues with highly accurate binary classification performance. TriVote outperforms both filter and wrapper feature selection algorithms with both higher classification accuracy and smaller feature number on 17 transcriptomes and two methylomes. Biological functions of the methylome biomarkers detected by TriVote were discussed for their disease associations. An easy-to-use Python package is also released to facilitate the further applications.
Quantitative RNA-seq analysis of the Campylobacter jejuni transcriptome
Chaudhuri, Roy R.; Yu, Lu; Kanji, Alpa; Perkins, Timothy T.; Gardner, Paul P.; Choudhary, Jyoti; Maskell, Duncan J.
2011-01-01
Campylobacter jejuni is the most common bacterial cause of foodborne disease in the developed world. Its general physiology and biochemistry, as well as the mechanisms enabling it to colonize and cause disease in various hosts, are not well understood, and new approaches are required to understand its basic biology. High-throughput sequencing technologies provide unprecedented opportunities for functional genomic research. Recent studies have shown that direct Illumina sequencing of cDNA (RNA-seq) is a useful technique for the quantitative and qualitative examination of transcriptomes. In this study we report RNA-seq analyses of the transcriptomes of C. jejuni (NCTC11168) and its rpoN mutant. This has allowed the identification of hitherto unknown transcriptional units, and further defines the regulon that is dependent on rpoN for expression. The analysis of the NCTC11168 transcriptome was supplemented by additional proteomic analysis using liquid chromatography-MS. The transcriptomic and proteomic datasets represent an important resource for the Campylobacter research community. PMID:21816880
Vashisht, Ira; Mishra, Prashant; Pal, Tarun; Chanumolu, Sreekrishna; Singh, Tiratha Raj; Chauhan, Rajinder Singh
2015-05-01
This study is the first effort to mine miRNAs and analyze their involvement in the development and secondary metabolism of an endangered medicinal herb, Picrorhiza kurroa (P. kurroa). miRNAs are ubiquitous non-coding RNA species that target complementary sequences of mRNA and result in either translational repression or target degradation in eukaryotes. The role of miRNAs has not been investigated in P. kurroa, which is a medicinal herb of industrial value due to the presence of secondary metabolites, picroside-I and picroside-II. Computational identification of miRNAs was done in 6 transcriptomes of P. kurroa generated from root, shoot, and stolon organs varying for growth, development, and culture conditions. All available plant miRNA entries were retrieved from miRBase and used as backend datasets to computationally identify conserved miRNAs in transcriptome data sets. A total of 18 conserved miRNAs were detected in P. kurroa, followed by target prediction and functional annotation, which suggested their possible role in controlling various biological processes. Validation of miRNA and expression analysis by qRT-PCR and 5' RACE revealed that miRNA-4995 has a regulatory role in terpenoid biosynthesis, ultimately affecting the production of picroside-I. miR-5532 and miR-5368 had negligible expression in field-grown samples as compared to in vitro-cultured samples, suggesting their role in regulating P. kurroa growth in culture conditions. The study has thus identified novel functions for existing miRNAs which can be further validated for their potential regulatory role.
SEASTAR: systematic evaluation of alternative transcription start sites in RNA.
Qin, Zhiyi; Stoilov, Peter; Zhang, Xuegong; Xing, Yi
2018-05-04
Alternative first exons diversify the transcriptomes of eukaryotes by producing variants of the 5' Untranslated Regions (5'UTRs) and N-terminal coding sequences. Accurate transcriptome-wide detection of alternative first exons typically requires specialized experimental approaches that are designed to identify the 5' ends of transcripts. We developed a computational pipeline SEASTAR that identifies first exons from RNA-seq data alone then quantifies and compares alternative first exon usage across multiple biological conditions. The exons inferred by SEASTAR coincide with transcription start sites identified directly by CAGE experiments and bear epigenetic hallmarks of active promoters. To determine if differential usage of alternative first exons can yield insights into the mechanism controlling gene expression, we applied SEASTAR to an RNA-seq dataset that tracked the reprogramming of mouse fibroblasts into induced pluripotent stem cells. We observed dynamic temporal changes in the usage of alternative first exons, along with correlated changes in transcription factor expression. Using a combined sequence motif and gene set enrichment analysis we identified N-Myc as a regulator of alternative first exon usage in the pluripotent state. Our results demonstrate that SEASTAR can leverage the available RNA-seq data to gain insights into the control of gene expression and alternative transcript variation in eukaryotic transcriptomes.
Analysis of Transcriptomic Dose Response Data in the ...
Slide presentation at the HESI-HEALTH Canada-McGill Workshop on Transcriptomic Dose Response Data in the Context of Chemical Risk Assessment
How to design a single-cell RNA-sequencing experiment: pitfalls, challenges and perspectives.
Dal Molin, Alessandra; Di Camillo, Barbara
2018-01-31
The sequencing of the transcriptome of single cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types in heterogeneous cell populations or for the study of stochastic gene expression. In recent years, various experimental methods and computational tools for analysing single-cell RNA-sequencing data have been proposed. However, most of them are tailored to different experimental designs or biological questions, and in many cases, their performance has not been benchmarked yet, thus increasing the difficulty for a researcher to choose the optimal single-cell transcriptome sequencing (scRNA-seq) experiment and analysis workflow. In this review, we aim to provide an overview of the currently available experimental and computational methods developed to handle single-cell RNA-sequencing data and, based on their peculiarities, we suggest possible analysis frameworks depending on specific experimental designs. Together, we propose an evaluation of challenges and open questions and future perspectives in the field. In particular, we go through the different steps of scRNA-seq experimental protocols such as cell isolation, messenger RNA capture, reverse transcription, amplification and use of quantitative standards such as spike-ins and Unique Molecular Identifiers (UMIs). We then analyse the current methodological challenges related to preprocessing, alignment, quantification, normalization, batch effect correction and methods to control for confounding effects. © The Author(s) 2018. Published by Oxford University Press.
Developmental Transcriptome for a Facultatively Eusocial Bee, Megalopta genalis
Jones, Beryl M.; Wcislo, William T.; Robinson, Gene E.
2015-01-01
Transcriptomes provide excellent foundational resources for mechanistic and evolutionary analyses of complex traits. We present a developmental transcriptome for the facultatively eusocial bee Megalopta genalis, which represents a potential transition point in the evolution of eusociality. A de novo transcriptome assembly of Megalopta genalis was generated using paired-end Illumina sequencing and the Trinity assembler. Males and females of all life stages were aligned to this transcriptome for analysis of gene expression profiles throughout development. Gene Ontology analysis indicates that stage-specific genes are involved in ion transport, cell–cell signaling, and metabolism. A number of distinct biological processes are upregulated in each life stage, and transitions between life stages involve shifts in dominant functional processes, including shifts from transcriptional regulation in embryos to metabolism in larvae, and increased lipid metabolism in adults. We expect that this transcriptome will provide a useful resource for future analyses to better understand the molecular basis of the evolution of eusociality and, more generally, phenotypic plasticity. PMID:26276382
Developmental Transcriptome for a Facultatively Eusocial Bee, Megalopta genalis.
Jones, Beryl M; Wcislo, William T; Robinson, Gene E
2015-08-14
Transcriptomes provide excellent foundational resources for mechanistic and evolutionary analyses of complex traits. We present a developmental transcriptome for the facultatively eusocial bee Megalopta genalis, which represents a potential transition point in the evolution of eusociality. A de novo transcriptome assembly of Megalopta genalis was generated using paired-end Illumina sequencing and the Trinity assembler. Males and females of all life stages were aligned to this transcriptome for analysis of gene expression profiles throughout development. Gene Ontology analysis indicates that stage-specific genes are involved in ion transport, cell-cell signaling, and metabolism. A number of distinct biological processes are upregulated in each life stage, and transitions between life stages involve shifts in dominant functional processes, including shifts from transcriptional regulation in embryos to metabolism in larvae, and increased lipid metabolism in adults. We expect that this transcriptome will provide a useful resource for future analyses to better understand the molecular basis of the evolution of eusociality and, more generally, phenotypic plasticity. Copyright © 2015 Jones et al.
A survey of the sorghum transcriptome using single-molecule long reads
Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; ...
2016-06-24
Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. Lastly, these results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism.
A survey of the sorghum transcriptome using single-molecule long reads
Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; Ngam, Peter; Devitt, Nicholas; Schilkey, Faye; Ben-Hur, Asa; Reddy, Anireddy S. N.
2016-01-01
Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. These results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism. PMID:27339290
Lovatt, Ditte; Ruble, Brittani K.; Lee, Jaehee; Dueck, Hannah; Kim, Tae Kyung; Fisher, Stephen; Francis, Chantal; Spaethling, Jennifer M.; Wolf, John A.; Grady, M. Sean; Ulyanova, Alexandra V.; Yeldell, Sean B.; Griepenburg, Julianne C.; Buckley, Peter T.; Kim, Junhyong; Sul, Jai-Yoon; Dmochowski, Ivan J.; Eberwine, James
2014-01-01
Transcriptome profiling is an indispensable tool in advancing the understanding of single cell biology, but depends upon methods capable of isolating mRNA at the spatial resolution of a single cell. Current capture methods lack sufficient spatial resolution to isolate mRNA from individual in vivo resident cells without damaging adjacent tissue. Because of this limitation, it has been difficult to assess the influence of the microenvironment on the transcriptome of individual neurons. Here, we engineered a Transcriptome In Vivo Analysis (TIVA)-tag, which upon photoactivation enables mRNA capture from single cells in live tissue. Using the TIVA-tag in combination with RNA-seq to analyze transcriptome variance among single dispersed cells and in vivo resident mouse and human neurons, we show that the tissue microenvironment shapes the transcriptomic landscape of individual cells. The TIVA methodology provides the first noninvasive approach for capturing mRNA from single cells in their natural microenvironment. PMID:24412976
Linear Regression Links Transcriptomic Data and Cellular Raman Spectra.
Kobayashi-Kirschvink, Koseki J; Nakaoka, Hidenori; Oda, Arisa; Kamei, Ken-Ichiro F; Nosho, Kazuki; Fukushima, Hiroko; Kanesaki, Yu; Yajima, Shunsuke; Masaki, Haruhiko; Ohta, Kunihiro; Wakamoto, Yuichi
2018-06-08
Raman microscopy is an imaging technique that has been applied to assess molecular compositions of living cells to characterize cell types and states. However, owing to the diverse molecular species in cells and challenges of assigning peaks to specific molecules, it has not been clear how to interpret cellular Raman spectra. Here, we provide firm evidence that cellular Raman spectra and transcriptomic profiles of Schizosaccharomyces pombe and Escherichia coli can be computationally connected and thus interpreted. We find that the dimensions of high-dimensional Raman spectra and transcriptomes measured by RNA sequencing can be reduced and connected linearly through a shared low-dimensional subspace. Accordingly, we were able to predict global gene expression profiles by applying the calculated transformation matrix to Raman spectra, and vice versa. Highly expressed non-coding RNAs contributed to the Raman-transcriptome linear correspondence more significantly than mRNAs in S. pombe. This demonstration of correspondence between cellular Raman spectra and transcriptomes is a promising step toward establishing spectroscopic live-cell omics studies. Copyright © 2018 Elsevier Inc. All rights reserved.
PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms.
Gan, Ruei-Chi; Chen, Ting-Wen; Wu, Timothy H; Huang, Po-Jung; Lee, Chi-Ching; Yeh, Yuan-Ming; Chiu, Cheng-Hsun; Huang, Hsien-Da; Tang, Petrus
2016-12-22
Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interest. However, only a few organisms have reference genomic sequences, and even fewer have well-defined or curated annotations. For transcriptome studies focusing on organisms lacking proper reference genomes, the common strategy is de novo assembly followed by functional annotation. However, things become even more complicated when multiple transcriptomes are compared. Here, we propose a new analysis strategy and methods for quantifying expression levels, which not only generate a virtual reference from sequencing data but also provide comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled together for de novo assembly. The assembled contigs are searched against the NCBI NR database to find potential homologous sequences. Based on the search results, a set of virtual transcripts is generated and serves as a reference transcriptome. By using the same reference, normalized quantification values including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM) can be obtained that are comparable across transcriptome datasets. To demonstrate the feasibility of our strategy, we implemented it in the web service PARRoT. PARRoT stands for Pipeline for Analyzing RNA Reads of Transcriptomes. It analyzes gene expression profiles for two transcriptome sequencing datasets. For better understanding of the biological meaning of the comparison among transcriptomes, PARRoT further links these virtual transcripts to their potential functions by showing best hits in SwissProt and the NR database and by assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within just three hours.
In this study, we proposed and implemented a strategy to analyze transcriptomes from non-reference organisms, which offers the opportunity to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy effectively avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on the field of comparative genomics for non-model organisms. We have implemented PARRoT as a web service which is freely available at http://parrot.cgu.edu.tw .
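The normalized values mentioned in the PARRoT abstract build on the standard RPKM and TPM definitions, which can be sketched as follows. This is a minimal illustration of the textbook formulas only; the abstract does not spell out how the "estimated" variants (eRPKM, eTPM) are derived from the virtual transcripts, so the function names and the toy numbers below are assumptions for illustration, not PARRoT's implementation.

```python
def rpkm(read_counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads.

    Textbook definition: count * 1e9 / (total_reads * length_in_bp).
    """
    total_reads = sum(read_counts)
    return [c * 1e9 / (total_reads * length)
            for c, length in zip(read_counts, lengths_bp)]


def tpm(read_counts, lengths_bp):
    """Transcripts per million.

    Length-normalize each count first, then rescale so the values sum to 1e6.
    """
    rates = [c / length for c, length in zip(read_counts, lengths_bp)]
    denom = sum(rates)
    return [r * 1e6 / denom for r in rates]


# Hypothetical read counts (RC) and lengths for three virtual transcripts.
counts = [100, 200, 700]
lengths = [1000, 2000, 1000]
rpkm_values = rpkm(counts, lengths)
tpm_values = tpm(counts, lengths)
```

Because TPM values always sum to one million within a sample, they are generally easier to compare across datasets than RPKM values, which is consistent with the abstract's emphasis on comparability against a shared reference.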
Hu, Yongli; Hase, Takeshi; Li, Hui Peng; Prabhakar, Shyam; Kitano, Hiroaki; Ng, See Kiong; Ghosh, Samik; Wee, Lawrence Jin Kiat
2016-12-22
The ability to sequence the transcriptomes of single cells using single-cell RNA-seq technologies presents a shift in the scientific paradigm: scientists are now able to concurrently investigate the complex biology of a heterogeneous population of cells, one at a time. However, to date, there has not been a suitable computational methodology for the analysis of such an intricate deluge of data, in particular techniques that aid the identification of the unique transcriptomic profile differences between the different cellular subtypes. In this paper, we describe a novel methodology for the analysis of single-cell RNA-seq data, obtained from neocortical cells and neural progenitor cells, using machine learning algorithms (Support Vector Machine (SVM) and Random Forest (RF)). Thirty-eight key transcripts were identified, using the SVM-based recursive feature elimination (SVM-RFE) method of feature selection, to best differentiate developing neocortical cells from neural progenitor cells in the SVM and RF classifiers built. Also, these genes possessed a higher discriminative power (enhanced prediction accuracy) as compared with commonly used statistical techniques or geneset-based approaches. Further downstream network reconstruction analysis was carried out to unravel hidden general regulatory networks where novel interactions could be further validated in wet-lab experimentation and be useful candidates to be targeted for the treatment of neuronal developmental diseases. The approach reported here is able to identify transcripts, with reported neuronal involvement, which optimally differentiate neocortical cells and neural progenitor cells. It is believed to be extensible and applicable to other single-cell RNA-seq expression profiles, such as those from the study of cancer progression and treatment within a highly heterogeneous tumour.
Vitali, Francesca; Li, Qike; Schissler, A Grant; Berghout, Joanne; Kenost, Colleen; Lussier, Yves A
2017-12-18
The development of computational methods capable of analyzing -omics data at the individual level is critical for the success of precision medicine. Although unprecedented opportunities now exist to gather data on an individual's -omics profile ('personalome'), interpreting and extracting meaningful information from single-subject -omics remain underdeveloped, particularly for quantitative non-sequence measurements, including complete transcriptome or proteome expression and metabolite abundance. Conventional bioinformatics approaches have largely been designed for making population-level inferences about 'average' disease processes; thus, they may not adequately capture and describe individual variability. Novel approaches intended to exploit a variety of -omics data are required for identifying individualized signals for meaningful interpretation. In this review, intended for biomedical researchers, computational biologists and bioinformaticians, we survey emerging computational and translational informatics methods capable of constructing a single subject's 'personalome' for predicting clinical outcomes or therapeutic responses, with an emphasis on methods that provide interpretable readouts. Key points include: (i) the single-subject analytics of the transcriptome shows the greatest development to date, and (ii) the methods were all validated in simulations, cross-validations or independent retrospective data sets. This survey uncovers a growing field that offers numerous opportunities for the development of novel validation methods and opens the door for future studies focusing on the interpretation of comprehensive 'personalomes' through the integration of multiple -omics, providing valuable insights into individual patient outcomes and treatments. © The Author 2017. Published by Oxford University Press.
Integrated network analysis and effective tools in plant systems biology
Fukushima, Atsushi; Kanaya, Shigehiko; Nishida, Kozo
2014-01-01
One of the ultimate goals in plant systems biology is to elucidate the genotype-phenotype relationship in plant cellular systems. Integrated network analysis that combines omics data with mathematical models has received particular attention. Here we focus on the latest cutting-edge computational advances that facilitate their combination. We highlight (1) network visualization tools, (2) pathway analyses, (3) genome-scale metabolic reconstruction, and (4) the integration of high-throughput experimental data and mathematical models. Multi-omics data that contain the genome, transcriptome, proteome, and metabolome and mathematical models are expected to integrate and expand our knowledge of complex plant metabolisms. PMID:25408696
Digital transcriptome profiling using selective hexamer priming for cDNA synthesis.
Armour, Christopher D; Castle, John C; Chen, Ronghua; Babak, Tomas; Loerch, Patrick; Jackson, Stuart; Shah, Jyoti K; Dey, John; Rohl, Carol A; Johnson, Jason M; Raymond, Christopher K
2009-09-01
We developed a procedure for the preparation of whole transcriptome cDNA libraries depleted of ribosomal RNA from only 1 µg of total RNA. The method relies on a collection of short, computationally selected oligonucleotides, called 'not-so-random' (NSR) primers, to obtain full-length, strand-specific representation of nonribosomal RNA transcripts. In this study we validated the technique by profiling human whole brain and universal human reference RNA using ultra-high-throughput sequencing.
Insights into transcriptomes of Big and Low sagebrush
Mark D. Huynh; Justin T. Page; Bryce A. Richardson; Joshua A. Udall
2015-01-01
We report the sequencing and assembly of three transcriptomes from Big (Artemisia tridentata ssp. wyomingensis and A. tridentata ssp. tridentata) and Low (A. arbuscula ssp. arbuscula) sagebrush. The sequence reads are available in the Sequence Read Archive of NCBI. We demonstrate the utilities of these transcriptomes for gene discovery and phylogenomic analysis. An...
Decoding genes with coexpression networks and metabolomics - 'majority report by precogs'.
Saito, Kazuki; Hirai, Masami Y; Yonekura-Sakakibara, Keiko
2008-01-01
Following the sequencing of whole genomes of model plants, high-throughput decoding of gene function is a major challenge in modern plant biology. In view of remarkable technical advances in transcriptomics and metabolomics, integrated analysis of these 'omics' by data-mining informatics is an excellent tool for prediction and identification of gene function, particularly for genes involved in complicated metabolic pathways. The availability of Arabidopsis public transcriptome datasets containing data of >1000 microarrays reinforces the potential for prediction of gene function by transcriptome coexpression analysis. Here, we review the strategy of combining transcriptome and metabolome as a powerful technology for studying the functional genomics of model plants and also crop and medicinal plants.
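The coexpression analysis described in this abstract rests on correlating expression profiles across many arrays. A minimal sketch is given below, assuming Pearson correlation and a fixed cutoff; the abstract names neither, and the gene names, expression values and the 0.9 threshold are hypothetical choices for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed_pairs(expression, threshold=0.9):
    """Return (gene_a, gene_b, r) for every pair with r >= threshold."""
    genes = sorted(expression)
    pairs = []
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson(expression[gene_a], expression[gene_b])
            if r >= threshold:
                pairs.append((gene_a, gene_b, r))
    return pairs


# Hypothetical expression values across four arrays.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
pairs = coexpressed_pairs(expression)
```

In a guilt-by-association setting, a gene of unknown function that pairs tightly with genes of a known pathway becomes a candidate member of that pathway; the metabolome data mentioned in the abstract would then be used to test the prediction.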
A high-throughput approach to profile RNA structure.
Delli Ponti, Riccardo; Marti, Stefanie; Armaos, Alexandros; Tartaglia, Gian Gaetano
2017-03-17
Here we introduce the Computational Recognition of Secondary Structure (CROSS) method to calculate the structural profile of an RNA sequence (single- or double-stranded state) at single-nucleotide resolution and without sequence length restrictions. We trained CROSS using data from high-throughput experiments such as Selective 2′-Hydroxyl Acylation analyzed by Primer Extension (SHAPE; Mouse and HIV transcriptomes) and Parallel Analysis of RNA Structure (PARS; Human and Yeast transcriptomes) as well as high-quality NMR/X-ray structures (PDB database). The algorithm uses primary structure information alone to predict experimental structural profiles with >80% accuracy, showing high performance on large RNAs such as Xist (17 900 nucleotides; Area Under the ROC Curve AUC of 0.75 on dimethyl sulfate (DMS) experiments). We integrated CROSS in thermodynamics-based methods to predict secondary structure and observed an increase in their predictive power by up to 30%. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
International Standards for Genomes, Transcriptomes, and Metagenomes
Mason, Christopher E.; Afshinnekoo, Ebrahim; Tighe, Scott; Wu, Shixiu; Levy, Shawn
2017-01-01
Challenges and biases in preparing, characterizing, and sequencing DNA and RNA can have significant impacts on research in genomics across all kingdoms of life, including experiments in single-cells, RNA profiling, and metagenomics (across multiple genomes). Technical artifacts and contamination can arise at each point of sample manipulation, extraction, sequencing, and analysis. Thus, the measurement and benchmarking of these potential sources of error are of paramount importance as next-generation sequencing (NGS) projects become more global and ubiquitous. Fortunately, a variety of methods, standards, and technologies have recently emerged that improve measurements in genomics and sequencing, from the initial input material to the computational pipelines that process and annotate the data. Here we review current standards and their applications in genomics, including whole genomes, transcriptomes, mixed genomic samples (metagenomes), and the modified bases within each (epigenomes and epitranscriptomes). These standards, tools, and metrics are critical for quantifying the accuracy of NGS methods, which will be essential for robust approaches in clinical genomics and precision medicine. PMID:28337071
Liu, Na; Liu, Lin; Pan, Xinghua
2014-07-01
Cellular heterogeneity within a cell population is a common phenomenon in multicellular organisms, tissues, cultured cells, and even FACS-sorted subpopulations. Important information may be masked if the cells are studied as a mass. Transcriptome profiling is a parameter that has been intensively studied, and relatively easier to address than protein composition. To understand the basis and importance of heterogeneity and stochastic aspects of the cell function and its mechanisms, it is essential to examine transcriptomes of a panel of single cells. High-throughput technologies, starting from microarrays and now RNA-seq, provide a full view of the expression of transcriptomes but are limited by the amount of RNA for analysis. Recently, several new approaches for amplification and sequencing the transcriptome of single cells or a limited low number of cells have been developed and applied. In this review, we summarize these major strategies, such as PCR-based methods, IVT-based methods, phi29-DNA polymerase-based methods, and several other methods, including their principles, characteristics, advantages, and limitations, with representative applications in cancer stem cells, early development, and embryonic stem cells. The prospects for development of future technology and application of transcriptome analysis in a single cell are also discussed.
USDA-ARS's Scientific Manuscript database
Next Generation Sequencing is transforming the way scientists collect and measure an organism’s genetic background and gene dynamics, while bioinformatics and super-computing are merging to facilitate parallel sample computation and interpretation at unprecedented speeds. Analyzing the complete gene...
RAS oncogene-mediated deregulation of the transcriptome: from molecular signature to function.
Schäfer, Reinhold; Sers, Christine
2011-01-01
Transcriptome analysis of cancer cells has developed into a standard procedure to elucidate multiple features of the malignant process and to link gene expression to clinical properties. Gene expression profiling based on microarrays provides essentially correlative information and needs to be transferred to the functional level in order to understand the activity and contribution of individual genes or sets of genes as elements of the gene signature. To date, there exist significant gaps in the functional understanding of gene expression profiles. Moreover, the processes that drive the profound transcriptional alterations that characterize cancer cells remain mainly elusive. We have used pathway-restricted gene expression profiles derived from RAS oncogene-transformed cells and from RAS-expressing cancer cells to identify regulators downstream of the MAPK pathway. We describe the role of epigenetic regulation exemplified by the control of several immune genes in generic cell lines and colorectal cancer cells, particularly the functional interaction between signaling and DNA methylation. Moreover, we assess the role of the architectural transcription factor high mobility AT-hook 2 (HMGA2) as a regulator of the RAS-responsive transcriptome in ovarian epithelial cells. Finally, we describe an integrated approach combining pathway interference in colorectal cancer cells, gene expression profiling and computational analysis of regulatory elements of deregulated target genes. This strategy resulted in the identification of Y-box binding protein 1 (YBX1) as a regulator of MAPK-dependent proliferation and gene expression. The implications for a therapeutic application of HMGA2 gene silencing and the role of YBX1 as a prognostic factor are discussed.
Rai, Muhammad Farooq; Patra, Debabrata; Sandell, Linda J.; Brophy, Robert H.
2013-01-01
Objective: Meniscus tears are associated with a heightened risk for osteoarthritis. We aimed to advance our understanding of the metabolic state of human injured meniscus at the time of arthroscopic partial meniscectomy through transcriptome-wide analysis of gene expression in relation to patient age and degree of cartilage chondrosis. Methods: The degree of chondrosis of knee cartilage was recorded at the time of meniscectomy in symptomatic patients without radiographic osteoarthritis. RNA preparations from resected menisci (N=12) were subjected to transcriptome-wide microarray and QuantiGene Plex analyses. The relative changes in gene expression variation with age and chondrosis were analyzed and integrated biological processes were investigated computationally. Results: We identified a set of genes in torn meniscus that were differentially expressed with age and chondrosis. There were 866 genes differentially regulated (≥1.5-fold; P<0.05) with age and 49 with chondrosis. In older patients, genes associated with cartilage and skeletal development and extracellular matrix synthesis were repressed while those involved in immune response, inflammation, cell cycle, and cellular proliferation were stimulated. With chondrosis, genes representing cell catabolism (cAMP catabolic process) and tissue and endothelial cell development were repressed and those involved in T cell differentiation and apoptosis were elevated. Conclusion: Differences in age-related gene expression suggest that in older adults, meniscal cells might de-differentiate and initiate a proliferative phenotype. Conversely, meniscal cells in younger patients appear to respond to injury, but maintain the differentiated phenotype. Definitive molecular signatures identified in damaged meniscus could be segregated largely with age and, to a lesser extent, with chondrosis. PMID:23658108
Shen, Di; Wang, Haiping; Wu, Qingjun; Lu, Peng; Qiu, Yang; Song, Jiangping; Zhang, Youjun; Li, Xixiang
2013-01-01
Background: The diamondback moth (DBM, Plutella xylostella) is a crucifer-specific pest that causes significant crop losses worldwide. Barbarea vulgaris (Brassicaceae) can resist DBM and other herbivorous insects by producing feeding-deterrent triterpenoid saponins. Plant breeders have long aimed to transfer this insect resistance to other crops. However, a lack of knowledge on the biosynthetic pathways and regulatory networks of these insecticidal saponins has hindered their practical application. A pyrosequencing-based transcriptome analysis of B. vulgaris during DBM larval feeding was performed to identify genes and gene networks responsible for saponin biosynthesis and its regulation at the genome level. Principal Findings: Approximately 1.22, 1.19, 1.16, 1.23, 1.16, 1.20, and 2.39 giga base pairs of clean nucleotides were generated from B. vulgaris transcriptomes sampled 1, 4, 8, 12, 24, and 48 h after onset of P. xylostella feeding and from non-inoculated controls, respectively. De novo assembly using all data of the seven transcriptomes generated 39,531 unigenes. A total of 37,780 (95.57%) unigenes were annotated, 14,399 of which were assigned to one or more gene ontology terms and 19,620 of which were assigned to 126 known pathways. Expression profiles revealed 2,016–4,685 up-regulated and 557–5,188 down-regulated transcripts. Secondary metabolic pathways, such as those of terpenoids, glucosinolates, and phenylpropanoids, and their related regulators were elevated. Candidate genes for the triterpene saponin pathway were found in the transcriptome. Orthological analysis of the transcriptome with four other crucifer transcriptomes identified 592 B. vulgaris-specific gene families with a P-value cutoff of 1e−5. Conclusion: This study presents the first comprehensive transcriptome analysis of B. vulgaris subjected to a series of DBM feedings.
The biosynthetic and regulatory pathways of triterpenoid saponins and other DBM deterrent metabolites in this plant were classified. The results of this study will provide useful data for future investigations on pest-resistance phytochemistry and plant breeding. PMID:23696897
Bu, Dengpan; Bionaz, Massimo; Wang, Mengzhi; Nan, Xuemei; Ma, Lu; Wang, Jiaqi
2017-01-01
Liver and mammary gland are among the most important organs during lactation in dairy cows. With the purpose of understanding both the different and the complementary roles and the crosstalk of those two organs during lactation, a transcriptome analysis was performed on liver and mammary tissues of 10 primiparous dairy cows in mid-lactation. The analysis was performed using a 4×44K Bovine Agilent microarray chip. The transcriptome difference between the two tissues was analyzed using SAS JMP Genomics using ANOVA with a false discovery rate correction (FDR). The analysis uncovered >9,000 genes differentially expressed (DEG) between the two tissues with a FDR<0.001. The functional analysis of the DEG uncovered a larger metabolic (especially related to lipid) and inflammatory response capacity in liver compared with mammary tissue while the mammary tissue had a larger protein synthesis and secretion, proliferation/differentiation, signaling, and innate immune system capacity compared with the liver. A plethora of endogenous compounds, cytokines, and transcription factors were estimated to control the DEG between the two tissues. Compared with mammary tissue, the liver transcriptome appeared to be under control of a large array of ligand-dependent nuclear receptors and, among endogenous chemical, fatty acids and bacteria-derived compounds. Compared with liver, the transcriptome of the mammary tissue was potentially under control of a large number of growth factors and miRNA. The in silico crosstalk analysis between the two tissues revealed an overall large communication with a reciprocal control of lipid metabolism, innate immune system adaptation, and proliferation/differentiation. 
In summary the transcriptome analysis confirmed prior known differences between liver and mammary tissue, especially considering the indication of a larger metabolic activity in liver compared with the mammary tissue and the larger protein synthesis, communication, and proliferative capacity in mammary tissue compared with the liver. Relatively novel is the indication by the data that the transcriptome of the liver is highly regulated by dietary and bacteria-related compounds while the mammary transcriptome is more under control of hormones, growth factors, and miRNA. A large crosstalk between the two tissues with a reciprocal control of metabolism and innate immune-adaptation was indicated by the network analysis that allowed uncovering previously unknown crosstalk between liver and mammary tissue for several signaling molecules. PMID:28291785
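The false discovery rate correction mentioned in this abstract can be illustrated with the Benjamini-Hochberg procedure. This is a sketch under an assumption: the abstract does not say which FDR method was applied in JMP Genomics, so Benjamini-Hochberg is used here only because it is the most common choice, and the p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), then a
    running minimum from the largest rank downwards enforces monotonicity.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical per-gene p-values from a liver-versus-mammary comparison.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)

# Genes passing a hypothetical FDR cutoff of 0.03.
significant = [i for i, q in enumerate(adjusted_p) if q < 0.03]
```

With tens of thousands of probes on the array, applying a cutoff such as FDR<0.001 to adjusted values rather than raw p-values keeps the expected proportion of false positives among the reported differentially expressed genes small.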
USDA-ARS's Scientific Manuscript database
Using the Eimeria spp. population that infect chickens as a model for coccidian biology, we aimed to survey the transcriptome of E. maxima and contrast it to the two other Eimeria spp. for which transcriptome data are available, E. tenella and E. acervulina. Examining specifically the asexual intra...
Transcriptome profiling analysis of cultivar-specific apple fruit ripening and texture attributes
USDA-ARS's Scientific Manuscript database
Molecular events regulating cultivar-specific apple fruit ripening and sensory quality are largely unknown. Such knowledge is essential for genomic-assisted apple breeding and postharvest quality management. In this study, transcriptome profile analysis, scanning electron microscopic examination an...
Characterizing differential gene expression in polyploid grasses lacking a reference transcriptome
USDA-ARS's Scientific Manuscript database
Basal transcriptome characterization and differential gene expression in response to varying conditions are often addressed through next generation sequencing (NGS) and data analysis techniques. While these strategies are commonly used, there are countless tools, pipelines, data analysis methods an...
Deshmukh, Rupesh K.; Sonah, Humira; Bélanger, Richard R.
2016-01-01
Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. 
Finally, a detailed catalog of computational tools and analytical pipelines is offered as a resource for AQP research. PMID:28066459
Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie
2018-01-01
Translation is a key regulatory step linking the transcriptome and the proteome. The two major methods of translatome investigation are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database, TranslatomeDB (http://www.translatomedb.net/), which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of the same species and type, at both the transcriptome and translatome levels. The translation indices (translation ratio, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity. All datasets were analyzed using a unified, robust, accurate and experimentally verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and analyze them with the identical unified pipeline. We believe that TranslatomeDB is a comprehensive platform and knowledgebase for translatome and proteome research, freeing biologists from complex searching, analyzing and comparing of huge sequencing datasets without needing local computational power. PMID:29106630
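The abstract above names a translation ratio among its indices but does not give formulas. As a minimal sketch, assuming the ratio is the abundance of a gene in the translating-mRNA data divided by its abundance in the matching mRNA-seq data, it could be computed as follows. The function names, the RPKM inputs and the pseudocount are illustrative assumptions, not the database's documented pipeline.

```python
import math


def translation_ratio(translatome_rpkm, mrna_rpkm, pseudocount=1.0):
    """Assumed definition: translating-mRNA abundance over total mRNA abundance.

    The pseudocount (an assumption) avoids division by zero for unexpressed genes.
    """
    return (translatome_rpkm + pseudocount) / (mrna_rpkm + pseudocount)


def log2_fold_change(value_a, value_b):
    """Log2 ratio of the same index measured in two conditions."""
    return math.log2(value_a / value_b)
```

A gene with 30 RPKM in the translatome and 15 RPKM in the transcriptome gets a ratio of 2 when the pseudocount is set to zero; comparing such ratios between two conditions on a log2 scale shows whether translation changed independently of mRNA level.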
Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants.
Li, Xinguo; Wu, Harry X; Southerton, Simon G
2010-06-21
Background Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. Results The xylem transcriptome is highly conserved in conifers, but considerably divergent in angiosperms. The functional domains of genes in the xylem transcriptome are moderately to highly conserved in vascular plants, suggesting the existence of a common ancestral xylem transcriptome. Compared to the total transcriptome derived from a range of tissues, the xylem transcriptome is relatively conserved in vascular plants. Of the xylem transcriptome, cell wall genes, ancestral xylem genes, known proteins and transcription factors are relatively more conserved in vascular plants. A total of 527 putative xylem orthologs were identified, which are unevenly distributed across the Arabidopsis chromosomes with eight hot spots observed. Phylogenetic analysis revealed that evolution of the xylem transcriptome has paralleled plant evolution. We also identified 274 conifer-specific xylem unigenes, all of which are of unknown function. These xylem orthologs and conifer-specific unigenes are likely to have played a crucial role in xylem evolution. Conclusions Conifers have highly conserved xylem transcriptomes, while angiosperm xylem transcriptomes are relatively diversified. Vascular plants share a common ancestral xylem transcriptome. The xylem transcriptomes of vascular plants are more conserved than the total transcriptomes. Evolution of the xylem transcriptome has largely followed the trend of plant evolution. PMID:20565927
Matvienko, Marta; Kozik, Alexander; Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard
2013-01-01
Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce. PMID:23409088
Comparative transcriptomics of early dipteran development
2013-01-01
Background Modern sequencing technologies have massively increased the amount of data available for comparative genomics. Whole-transcriptome shotgun sequencing (RNA-seq) provides a powerful basis for comparative studies. In particular, this approach holds great promise for emerging model species in fields such as evolutionary developmental biology (evo-devo). Results We have sequenced early embryonic transcriptomes of two non-drosophilid dipteran species: the moth midge Clogmia albipunctata, and the scuttle fly Megaselia abdita. Our analysis includes a third, published, transcriptome for the hoverfly Episyrphus balteatus. These emerging models for comparative developmental studies close an important phylogenetic gap between Drosophila melanogaster and other insect model systems. In this paper, we provide a comparative analysis of early embryonic transcriptomes across species, and use our data for a phylogenomic re-evaluation of dipteran phylogenetic relationships. Conclusions We show how comparative transcriptomics can be used to create useful resources for evo-devo, and to investigate phylogenetic relationships. Our results demonstrate that de novo assembly of short (Illumina) reads yields high-quality, high-coverage transcriptomic data sets. We use these data to investigate deep dipteran phylogenetic relationships. Our results, based on a concatenation of 160 orthologous genes, provide support for the traditional view of Clogmia being the sister group of Brachycera (Megaselia, Episyrphus, Drosophila), rather than that of Culicomorpha (which includes mosquitoes and blackflies). PMID:23432914
BLIND ordering of large-scale transcriptomic developmental timecourses.
Anavy, Leon; Levin, Michal; Khair, Sally; Nakanishi, Nagayasu; Fernandez-Valverde, Selene L; Degnan, Bernard M; Yanai, Itai
2014-03-01
RNA-Seq enables the efficient transcriptome sequencing of many samples from small amounts of material, but the analysis of these data remains challenging. In particular, in developmental studies, RNA-Seq is challenged by the morphological staging of samples, such as embryos, since these often lack clear markers at any particular stage. In such cases, the automatic identification of the stage of a sample would enable previously infeasible experimental designs. Here we present the 'basic linear index determination of transcriptomes' (BLIND) method for ordering samples comprising different developmental stages. The method is an implementation of a traveling salesman algorithm to order the transcriptomes according to their inter-relationships as defined by principal components analysis. To establish the direction of the ordered samples, we show that an appropriate indicator is the entropy of transcriptomic gene expression levels, which increases over developmental time. Using BLIND, we correctly recover the annotated order of previously published embryonic transcriptomic timecourses for frog, mosquito, fly and zebrafish. We further demonstrate the efficacy of BLIND by collecting 59 embryos of the sponge Amphimedon queenslandica and ordering their transcriptomes according to developmental stage. BLIND is thus useful in establishing the temporal order of samples within large datasets and is of particular relevance to the study of organisms with asynchronous development and when morphological staging is difficult.
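The three ingredients named in the BLIND abstract (principal components analysis, a traveling-salesman-style ordering, and entropy to fix the direction) can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the use of two components, the exhaustive path search (practical only for a handful of samples), the Shannon entropy of normalized expression values, and all function names are choices made here.

```python
import itertools

import numpy as np


def pca_scores(expr, n_components=2):
    """Project samples (rows) onto their top principal components via SVD."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def shortest_path_order(points):
    """Exhaustively find the shortest open path through all points (small n only)."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    best_order, best_len = None, np.inf
    for perm in itertools.permutations(range(n)):
        if perm[0] > perm[-1]:  # a path and its mirror image have equal length
            continue
        length = sum(dist[a, b] for a, b in zip(perm, perm[1:]))
        if length < best_len:
            best_order, best_len = list(perm), length
    return best_order


def shannon_entropy(profile):
    """Entropy (bits) of one sample's expression values treated as a distribution."""
    p = profile / profile.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def blind_like_order(expr):
    """Order samples along the PCA path, oriented so entropy rises from start to end."""
    order = shortest_path_order(pca_scores(expr))
    if shannon_entropy(expr[order[0]]) > shannon_entropy(expr[order[-1]]):
        order.reverse()
    return order
```

Here `expr` is a samples-by-genes array of non-negative expression values. For realistic sample counts the exhaustive search would have to be replaced by a traveling-salesman heuristic.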
Transcriptome Analysis at the Single-Cell Level Using SMART Technology.
Fish, Rachel N; Bostick, Magnolia; Lehman, Alisa; Farmer, Andrew
2016-10-10
RNA sequencing (RNA-seq) is a powerful method for analyzing cell state, with minimal bias, and has broad applications within the biological sciences. However, transcriptome analysis of seemingly homogeneous cell populations may in fact overlook significant heterogeneity that can be uncovered at the single-cell level. The ultra-low amount of RNA contained in a single cell requires extraordinarily sensitive and reproducible transcriptome analysis methods. As next-generation sequencing (NGS) technologies mature, transcriptome profiling by RNA-seq is increasingly being used to decipher the molecular signature of individual cells. This unit describes an ultra-sensitive and reproducible protocol to generate cDNA and sequencing libraries directly from single cells or RNA inputs ranging from 10 pg to 10 ng. Important considerations for working with minute RNA inputs are given. © 2016 by John Wiley & Sons, Inc.
USDA-ARS's Scientific Manuscript database
In order to investigate the mechanisms of persistent foot-and-mouth disease virus (FMDV) infection in cattle, transcriptome alterations associated with the FMDV carrier state were characterized using a bovine whole-transcriptome microarray. Eighteen cattle (8 vaccinated with a recombinant FMDV A vac...
USDA-ARS's Scientific Manuscript database
Many species of mites and ticks are of agricultural and medical importance. Much can be learned from the study of transcriptomes of acarines which can generate DNA-sequence information of potential target genes for the control of acarine pests. High throughput transcriptome sequencing can also yie...
RNA-Seq Atlas of Glycine max: a guide to the soybean transcriptome
USDA-ARS's Scientific Manuscript database
A first analysis of the Glycine max (L.) Merr. (soybean) transcriptome using next generation sequencing technology and RNA-Sequencing (RNA-Seq) is presented. This analysis will provide an important resource for understanding transcription and gene co-regulatory networks in soybean, the most economic...
Swiecicka, Magdalena; Filipecki, Marcin; Lont, Dieuwertje; Van Vliet, Joke; Qin, Ling; Goverse, Aska; Bakker, Jaap; Helder, Johannes
2009-07-01
Plant parasitic nematodes infect roots and trigger the formation of specialized feeding sites by substantial reprogramming of the developmental process of root cells. In this article, we describe the dynamic changes in the tomato root transcriptome during early interactions with the potato cyst nematode Globodera rostochiensis. Using amplified fragment length polymorphism-based mRNA fingerprinting (cDNA-AFLP), we monitored 17 600 transcript-derived fragments (TDFs) in infected and uninfected tomato roots, 1-14 days after inoculation with nematode larvae. Six hundred and twenty-four TDFs (3.5%) showed significant differential expression on nematode infection. We employed GenEST, a computer program which links gene expression profiles generated by cDNA-AFLP and databases of cDNA sequences, to identify 135 tomato sequences. These sequences were grouped into eight functional categories based on the presence of genes involved in hormone regulation, plant pathogen defence response, cell cycle and cytoskeleton regulation, cell wall modification, cellular signalling, transcriptional regulation, primary metabolism and allocation. The presence of unclassified genes was also taken into consideration. This article describes the responsiveness of numerous tomato genes hitherto uncharacterized during infection with endoparasitic cyst nematodes. The analysis of transcriptome profiles allowed the sequential order of expression to be dissected for many groups of genes and the genes to be connected with the biological processes involved in compatible interactions between the plant and nematode.
Fasoli, Marianna; Dal Santo, Silvia; Zenoni, Sara; Tornielli, Giovanni Battista; Farina, Lorenzo; Zamboni, Anita; Porceddu, Andrea; Venturini, Luca; Bicego, Manuele; Murino, Vittorio; Ferrarini, Alberto; Delledonne, Massimo; Pezzotti, Mario
2012-09-01
We developed a genome-wide transcriptomic atlas of grapevine (Vitis vinifera) based on 54 samples representing green and woody tissues and organs at different developmental stages as well as specialized tissues such as pollen and senescent leaves. Together, these samples expressed ∼91% of the predicted grapevine genes. Pollen and senescent leaves had unique transcriptomes reflecting their specialized functions and physiological status. However, microarray and RNA-seq analysis grouped all the other samples into two major classes based on maturity rather than organ identity, namely, the vegetative/green and mature/woody categories. This division represents a fundamental transcriptomic reprogramming during the maturation process and was highlighted by three statistical approaches identifying the transcriptional relationships among samples (correlation analysis), putative biomarkers (O2PLS-DA approach), and sets of strongly and consistently expressed genes that define groups (topics) of similar samples (biclustering analysis). Gene coexpression analysis indicated that the mature/woody developmental program results from the reiterative coactivation of pathways that are largely inactive in vegetative/green tissues, often involving the coregulation of clusters of neighboring genes and global regulation based on codon preference. This global transcriptomic reprogramming during maturation has not been observed in herbaceous annual species and may be a defining characteristic of perennial woody plants.
Chan, Kuang-Lim; Rosli, Rozana; Tatarinova, Tatiana V; Hogan, Michael; Firdaus-Raih, Mohd; Low, Eng-Ti Leslie
2017-01-27
Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed by various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion. We present an automated gene prediction pipeline, Seqping, that uses self-training HMM models and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using the GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by the MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 for nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 for nucleotide structure). Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.
Hara, Yuichiro; Tatsumi, Kaori; Yoshida, Michio; Kajikawa, Eriko; Kiyonari, Hiroshi; Kuraku, Shigehiro
2015-11-18
RNA-seq enables gene expression profiling in selected spatiotemporal windows and yields massive sequence information with relatively low cost and time investment, even for non-model species. However, there remains considerable room for optimizing its workflow in order to take full advantage of continuously developing sequencing capacity. Transcriptome sequencing for three embryonic stages of Madagascar ground gecko (Paroedura picta) was performed with the Illumina platform. The output reads were assembled de novo for reconstructing transcript sequences. In order to evaluate the completeness of transcriptome assemblies, we prepared a reference gene set consisting of vertebrate one-to-one orthologs. To take advantage of increased read length of >150 nt, we demonstrated shortened RNA fragmentation time, which resulted in a dramatic shift of insert size distribution. To evaluate products of multiple de novo assembly runs incorporating reads with different RNA sources, read lengths, and insert sizes, we introduce a new reference gene set, core vertebrate genes (CVG), consisting of 233 genes that are shared as one-to-one orthologs by all vertebrate genomes examined (29 species). The completeness assessment performed by the computational pipelines CEGMA and BUSCO with reference to CVG demonstrated higher accuracy and resolution than with the gene set previously established for this purpose. As a result of the assessment with CVG, we have derived the most comprehensive transcript sequence set of the Madagascar ground gecko by means of assembling individual libraries followed by clustering the assembled sequences based on their overall similarities. Our results provide several insights into optimizing de novo RNA-seq workflow, including the coordination between library insert size and read length, which manifested in improved connectivity of assemblies. 
The approach and assembly assessment with CVG demonstrated here would be applicable to transcriptome analysis of other species as well as whole genome analyses.
Elucidating and mining the Tulipa and Lilium transcriptomes.
Moreno-Pachon, Natalia M; Leeggangers, Hendrika A C F; Nijveen, Harm; Severing, Edouard; Hilhorst, Henk; Immink, Richard G H
2016-10-01
Genome sequencing remains a challenge for species with large and complex genomes containing extensive repetitive sequences, of which the bulbous and monocotyledonous plants tulip and lily are examples. In such a case, sequencing of only the active part of the genome, represented by the transcriptome, is a good alternative to obtain information about gene content. In this study we aimed to generate a high quality transcriptome of tulip and lily and to make this data available as an open-access resource via a user-friendly web-based interface. The Illumina HiSeq 2000 platform was applied and the transcribed RNA was sequenced from a collection of different lily and tulip tissues, respectively. In order to obtain good transcriptome coverage and to facilitate effective data mining, assembly was done using different filtering parameters for clearing out contamination and noise of the RNAseq datasets. This analysis revealed limitations of commonly applied methods and parameter settings used in de novo transcriptome assembly. The final created transcriptomes are publicly available via a user friendly Transcriptome browser ( http://www.bioinformatics.nl/bulbs/db/species/index ). The usefulness of this resource has been exemplified by a search for all potential transcription factors in lily and tulip, with special focus on the TCP transcription factor family. This analysis and other quality parameters point out the quality of the transcriptomes, which can serve as a basis for further genomics studies in lily, tulip, and bulbous plants in general.
USDA-ARS's Scientific Manuscript database
Drought tolerance is a complex trait that is governed by multiple genes. To identify the potential candidate genes, comparative analysis of drought stress-responsive transcriptome between drought-tolerant (Triticum aestivum Cv. C306) and drought-sensitive (Triticum aestivum Cv. WL711) genotypes was ...
USDA-ARS's Scientific Manuscript database
Identification of genes with differential transcript abundance (GDTA) in seedless mutants may enhance understanding of seedless citrus development. Transcriptome analysis was conducted at three time points during early fruit development (Phase 1) of three seedy citrus genotypes: Fallglo [Bower citru...
Brown, Roger B; Madrid, Nathaniel J; Suzuki, Hideaki; Ness, Scott A
2017-01-01
RNA-sequencing (RNA-seq) has become the standard method for unbiased analysis of gene expression but also provides access to more complex transcriptome features, including alternative RNA splicing, RNA editing, and even detection of fusion transcripts formed through chromosomal translocations. However, differences in library methods can adversely affect the ability to recover these different types of transcriptome data. For example, some methods have bias for one end of transcripts or rely on low-efficiency steps that limit the complexity of the resulting library, making detection of rare transcripts less likely. We tested several commonly used methods of RNA-seq library preparation and found vast differences in the detection of advanced transcriptome features, such as alternatively spliced isoforms and RNA editing sites. By comparing several different protocols available for the Ion Proton sequencer and by utilizing detailed bioinformatics analysis tools, we were able to develop an optimized random primer based RNA-seq technique that is reliable at uncovering rare transcript isoforms and RNA editing features, as well as fusion reads from oncogenic chromosome rearrangements. The combination of optimized libraries and rapid Ion Proton sequencing provides a powerful platform for the transcriptome analysis of research and clinical samples.
Houshyani, Benyamin; van der Krol, Alexander R; Bino, Raoul J; Bouwmeester, Harro J
2014-06-19
Molecular characterization is an essential step of risk/safety assessment of genetically modified (GM) crops. Holistic approaches for molecular characterization using omics platforms can be used to confirm the intended impact of the genetic engineering, but can also reveal unintended changes at the omics level as a first assessment of potential risks. Omics platforms have rarely been used for risk assessment of GM crops because of the lack of a consensus reference and of statistical methods to judge the significance or importance of the pleiotropic changes in GM plants. Here we propose a meta-data analysis approach to the analysis of GM plants, by measuring the transcriptome distance to untransformed wild-types. In the statistical analysis of the transcriptome distance between GM and wild-type plants, values are compared with naturally occurring transcriptome distances in non-GM counterparts obtained from a database. Using this approach we show that the pleiotropic effect of genes involved in indirect insect defence traits is substantially equivalent to the variation in gene expression occurring naturally in Arabidopsis. Transcriptome distance is a useful screening method to obtain insight into the pleiotropic effects of genetic modification.
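The comparison described above, a GM-to-wild-type distance judged against naturally occurring distances, can be illustrated with a short sketch. The abstract does not state which distance measure or statistic the authors used, so the Euclidean distance on log2-transformed expression values, the empirical fraction used for the comparison, and both function names are assumptions made for illustration.

```python
import numpy as np


def transcriptome_distance(profile_a, profile_b):
    """Euclidean distance between two log2(x + 1) expression profiles (assumed metric)."""
    a = np.log2(np.asarray(profile_a, dtype=float) + 1.0)
    b = np.log2(np.asarray(profile_b, dtype=float) + 1.0)
    return float(np.linalg.norm(a - b))


def fraction_of_natural_at_least(gm_distance, natural_distances):
    """Share of natural (non-GM) distances that are at least as large as the GM distance."""
    natural = np.asarray(natural_distances, dtype=float)
    return float((natural >= gm_distance).mean())
```

A large fraction means that the GM line lies no further from its wild-type than non-GM plants commonly lie from one another, which is the sense of "substantially equivalent" that this sketch tries to capture.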
Computational Selection of Transcriptomics Experiments Improves Guilt-by-Association Analyses
Bhat, Prajwal; Yang, Haixuan; Bögre, László; Devoto, Alessandra; Paccanaro, Alberto
2012-01-01
The Guilt-by-Association (GBA) principle, according to which genes with similar expression profiles are functionally associated, is widely applied for functional analyses using large heterogeneous collections of transcriptomics data. However, the use of such large collections could hamper GBA functional analysis for genes whose expression is condition specific. In these cases a smaller set of condition related experiments should instead be used, but identifying such functionally relevant experiments from large collections based on literature knowledge alone is an impractical task. We begin this paper by analyzing, both from a mathematical and a biological point of view, why only condition specific experiments should be used in GBA functional analysis. We are able to show that this phenomenon is independent of the functional categorization scheme and of the organisms being analyzed. We then present a semi-supervised algorithm that can select functionally relevant experiments from large collections of transcriptomics experiments. Our algorithm is able to select experiments relevant to a given GO term, MIPS FunCat term or even KEGG pathways. We extensively test our algorithm on large dataset collections for yeast and Arabidopsis. We demonstrate that: using the selected experiments there is a statistically significant improvement in correlation between genes in the functional category of interest; the selected experiments improve GBA-based gene function prediction; the effectiveness of the selected experiments increases with annotation specificity; our algorithm can be successfully applied to GBA-based pathway reconstruction. Importantly, the set of experiments selected by the algorithm reflects the existing literature knowledge about the experiments. [A MATLAB implementation of the algorithm and all the data used in this paper can be downloaded from the paper website: http://www.paccanarolab.org/papers/CorrGene/]. PMID:22879875
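The central observation of this abstract, that genes in a functional category correlate more strongly over condition-relevant experiments than over a whole collection, can be checked with a small helper. This sketch only measures the effect; it is not the semi-supervised selection algorithm of the paper, and the function name and the genes-by-experiments matrix layout are assumptions.

```python
import numpy as np


def mean_pairwise_correlation(expr, gene_idx, experiment_idx=None):
    """Mean Pearson correlation among the given genes over the chosen experiments.

    expr is a genes-by-experiments array; experiment_idx=None uses all experiments.
    """
    sub = expr[np.asarray(gene_idx)]
    if experiment_idx is not None:
        sub = sub[:, np.asarray(experiment_idx)]
    corr = np.corrcoef(sub)
    upper = np.triu_indices(len(gene_idx), k=1)  # each gene pair counted once
    return float(corr[upper].mean())
```

Comparing the value obtained with a candidate subset of experiments against the value obtained with all experiments gives a simple score for how relevant that subset is to the category of interest.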
Choi, Sun Young; Park, Byeonghyeok; Choi, In-Geol; Sim, Sang Jun; Lee, Sun-Mi; Um, Youngsoon; Woo, Han Min
2016-01-01
The development of high-throughput technology using RNA-seq has allowed understanding of cellular mechanisms and regulations of bacterial transcription. In addition, transcriptome analysis with RNA-seq has been used to accelerate strain improvement through systems metabolic engineering. Synechococcus elongatus PCC 7942, a photosynthetic bacterium, has remarkable potential for biochemical and biofuel production due to photoautotrophic cell growth and direct CO2 conversion. Here, we performed a transcriptome analysis of S. elongatus PCC 7942 using RNA-seq to understand the changes of cellular metabolism and regulation for nitrogen starvation responses. As a result, differentially expressed genes (DEGs) were identified and functionally categorized. With mapping onto metabolic pathways, we probed transcriptional perturbation and regulation of carbon and nitrogen metabolisms relating to nitrogen starvation responses. Experimental evidence such as chlorophyll a and phycobilisome content and the measurement of CO2 uptake rate validated the transcriptome analysis. The analysis suggests that S. elongatus PCC 7942 reacts to nitrogen starvation by not only rearranging the cellular transport capacity involved in carbon and nitrogen assimilation pathways but also by reducing protein synthesis and photosynthesis activities. PMID:27488818
Transcriptomics of cortical gray matter thickness decline during normal aging
Kochunov, P; Charlesworth, J; Winkler, A; Hong, LE; Nichols, T; Curran, JE; Sprooten, E; Jahanshad, N; Thompson, PM; Johnson, MP; Kent, JW; Landman, BA; Mitchell, B; Cole, SA; Dyer, TD; Moses, EK; Goring, HHH; Almasy, L; Duggirala, R; Olvera, RL; Glahn, DC; Blangero, J
2013-01-01
Introduction We performed a whole-transcriptome correlation analysis, followed by pathway enrichment analysis and testing of innate immune response pathways, to evaluate the hypothesis that transcriptional activity can predict cortical gray matter thickness (GMT) variability during normal cerebral aging. Methods Transcriptome and GMT data were available for 379 individuals (age range=28–85), community-dwelling members of large extended Mexican-American families. Collection of transcriptome data preceded that of neuroimaging data by 17 years. Genome-wide gene transcriptome data consisted of 20,413 heritable lymphocyte-based transcripts. GMT measurements were performed from high-resolution (isotropic 800 µm) T1-weighted MRI. Transcriptome-wide and pathway enrichment analysis was used to classify genes correlated with GMT. Transcripts for sixty genes from seven innate immune pathways were tested as specific predictors of GMT variability. Results Transcripts for eight genes (IGFBP3, LRRN3, CRIP2, SCD, IDS, TCF4, GATA3, HN1) passed the transcriptome-wide significance threshold. Four orthogonal factors extracted from this set predicted 31.9% of the variability in the whole-brain and between 23.4 and 35% of regional GMT measurements. Pathway enrichment analysis identified six functional categories including cellular proliferation, aggregation, differentiation, viral infection, and metabolism. The integrin signaling pathway was significantly (p<10^-6) enriched with GMT. Finally, three innate immune pathways (complement signaling, toll-receptors and scavenger and immunoglobulins) were significantly associated with GMT. Conclusion Expression activity for the genes that regulate cellular proliferation, adhesion, differentiation and inflammation can explain a significant proportion of individual variability in cortical GMT. Our findings suggest that normal cerebral aging is the product of a progressive decline in regenerative capacity and increased neuroinflammation. PMID:23707588
Transcriptomics of cortical gray matter thickness decline during normal aging.
Kochunov, P; Charlesworth, J; Winkler, A; Hong, L E; Nichols, T E; Curran, J E; Sprooten, E; Jahanshad, N; Thompson, P M; Johnson, M P; Kent, J W; Landman, B A; Mitchell, B; Cole, S A; Dyer, T D; Moses, E K; Goring, H H H; Almasy, L; Duggirala, R; Olvera, R L; Glahn, D C; Blangero, J
2013-11-15
We performed a whole-transcriptome correlation analysis, followed by pathway enrichment analysis and testing of innate immune response pathways, to evaluate the hypothesis that transcriptional activity can predict cortical gray matter thickness (GMT) variability during normal cerebral aging. Transcriptome and GMT data were available for 379 individuals (age range=28-85), community-dwelling members of large extended Mexican American families. Collection of transcriptome data preceded that of neuroimaging data by 17 years. Genome-wide gene transcriptome data consisted of 20,413 heritable lymphocyte-based transcripts. GMT measurements were performed from high-resolution (isotropic 800 μm) T1-weighted MRI. Transcriptome-wide and pathway enrichment analysis was used to classify genes correlated with GMT. Transcripts for sixty genes from seven innate immune pathways were tested as specific predictors of GMT variability. Transcripts for eight genes (IGFBP3, LRRN3, CRIP2, SCD, IDS, TCF4, GATA3, and HN1) passed the transcriptome-wide significance threshold. Four orthogonal factors extracted from this set predicted 31.9% of the variability in the whole-brain and between 23.4 and 35% of regional GMT measurements. Pathway enrichment analysis identified six functional categories including cellular proliferation, aggregation, differentiation, viral infection, and metabolism. The integrin signaling pathway was significantly (p<10^-6) enriched with GMT. Finally, three innate immune pathways (complement signaling, toll-receptors and scavenger and immunoglobulins) were significantly associated with GMT. Expression activity for the genes that regulate cellular proliferation, adhesion, differentiation and inflammation can explain a significant proportion of individual variability in cortical GMT. Our findings suggest that normal cerebral aging is the product of a progressive decline in regenerative capacity and increased neuroinflammation. Copyright © 2013 Elsevier Inc. All rights reserved.
Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes.
Kumar, Vikas; Kutschera, Verena E; Nilsson, Maria A; Janke, Axel
2015-08-07
The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated and analyzed for signatures of positive selection. In addition, the data allowed for a phylogenetic analysis and divergence time estimate between the two fox species. The de novo assembly of reads resulted in more than 160,000 contigs/transcripts per individual. Approximately 17,000 homologous genes were identified using human and the non-redundant databases. Positive selection analyses revealed several genes involved in various metabolic and molecular processes such as energy metabolism, cardiac gene regulation, apoptosis and blood coagulation to be under positive selection in foxes. Branch site tests identified four genes to be under positive selection in the Arctic fox transcriptome, two of which are fat metabolism genes. In the red fox transcriptome eight genes are under positive selection, including molecular process genes, notably genes involved in ATP metabolism. Analysis of the three transcriptomes and five Sanger re-sequenced genes in additional individuals identified a lower genetic variability within Arctic foxes compared to red foxes, which is consistent with distribution range differences and demographic responses to past climatic fluctuations. A phylogenomic analysis estimated that the Arctic and red fox lineages diverged about three million years ago. Transcriptome data are an economic way to generate genomic resources for evolutionary studies. 
Despite not representing an entire genome, this transcriptome analysis identified numerous genes that are relevant to arctic adaptation in foxes. As in the polar bear, another arctic specialist, fat metabolism seems to play a central role in the adaptation of Arctic foxes to the cold climate.
DOGMA: domain-based transcriptome and proteome quality assessment.
Dohmen, Elias; Kremer, Lukas P M; Bornberg-Bauer, Erich; Kemena, Carsten
2016-09-01
Genome studies have become cheaper and easier than ever before, due to the decreased costs of high-throughput sequencing and the free availability of analysis software. However, the quality of genome or transcriptome assemblies can vary considerably. Quality assessment of assemblies and annotations is therefore a crucial aspect of genome analysis pipelines. We developed DOGMA, a program for fast and easy quality assessment of transcriptome and proteome data based on conserved protein domains. DOGMA measures the completeness of a given transcriptome or proteome and provides information about domain content for further analysis. DOGMA performs this quality assessment within seconds. DOGMA is implemented in Python and published under the GNU GPL v.3 license. The source code is available at https://ebbgit.uni-muenster.de/domainWorld/DOGMA/. Contact: e.dohmen@wwu.de or c.kemena@wwu.de. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.
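The idea of a domain-based completeness measure can be sketched as follows. This is only an illustration of the general principle (what fraction of an expected set of conserved domain arrangements is found in the annotated data); it is not DOGMA's actual scoring procedure or interface, and the domain names and the function `domain_completeness` are hypothetical.

```python
def domain_completeness(core_arrangements, observed_arrangements):
    """Return (fraction of core arrangements found, sorted list of missing ones).

    Each arrangement is an ordered sequence of domain identifiers.
    """
    core = {tuple(a) for a in core_arrangements}
    if not core:
        raise ValueError("core set must not be empty")
    observed = {tuple(a) for a in observed_arrangements}
    found = core & observed
    missing = sorted(core - observed)
    return len(found) / len(core), missing


# Hypothetical core set of conserved domain arrangements.
core_set = [
    ("Kinase",),
    ("SH3", "SH2", "Kinase"),
    ("Helicase_N", "Helicase_C"),
    ("RRM",),
]
# Hypothetical arrangements annotated in a transcriptome assembly.
annotated = [
    ("Kinase",),
    ("RRM",),
    ("SH3", "SH2", "Kinase"),
    ("Zinc_finger",),
]

score, missing = domain_completeness(core_set, annotated)
print(f"completeness: {score:.0%}; missing: {missing}")
```

Treating an arrangement as an ordered tuple means that a transcript carrying the same domains in a different order does not count as a match, which is one possible design choice among several.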
Novel Insights into the Transcriptome of Dirofilaria immitis
Zhang, Zhihe; Hou, Rong; Wu, Xuhang; Yang, Deying; Zhang, Runhui; Zheng, Wanpeng; Nie, Huaming; Xie, Yue; Yan, Ning; Yang, Zhi; Wang, Chengdong; Luo, Li; Liu, Li; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou
2012-01-01
Background The heartworm Dirofilaria immitis is the causal agent of cardiopulmonary dirofilariosis in dogs and cats, and also infects a wide range of wild mammals as well as humans. One bottleneck for the design of fundamentally new intervention and management strategies against D. immitis may be the currently limited knowledge of fundamental molecular aspects of D. immitis. Methodology/Principal Findings A next-generation sequencing platform combining computational approaches was employed to assess a global view of the heartworm transcriptome. A total of 20,810 unigenes (mean length = 1,270 bp) were assembled from 22.3 million clean reads. From these, 15,698 coding sequences (CDS) were inferred, and about 85% of the unigenes had orthologs/homologs in public databases. Comparative transcriptomic study uncovered 4,157 filarial-specific genes as well as 3,795 genes potentially involved in filarial-Wolbachia symbiosis. In addition, the potential intestine transcriptome of D. immitis (1,101 genes) was mined for the first time, which might help to discover ‘hidden antigens’. Conclusions/Significance This study provides novel insights into the transcriptome of D. immitis and sheds light on its molecular processes and survival mechanisms. Furthermore, it provides a platform to discover new vaccine candidates and potential targets for new drugs against dirofilariosis. PMID:22911833
Hsiang, Chien-Yun; Chen, Yueh-Sheng; Ho, Tin-Yun
2009-06-01
Establishment of a comprehensive platform for the assessment of host-biomaterial interaction in vivo is an important issue. Nuclear factor-kappaB (NF-kappaB) is an inducible transcription factor that is activated by numerous stimuli. In this study, the NF-kappaB-dependent luminescent signal in transgenic mice carrying the luciferase genes was therefore used as a guide to monitor the organs affected by biomaterials, and transcriptomic analysis was further applied to evaluate the complex host responses in the affected organs. In vivo imaging showed that genipin-cross-linked gelatin conduit (GGC) implantation evoked strong NF-kappaB activity at 6 h in the implanted region, and transcriptomic analysis showed that the expression of interleukin-6 (IL-6), IL-24, and the IL-1 family was up-regulated. A strong luminescent signal was observed in the spleen on day 14, suggesting that GGC implantation might elicit biological events in the spleen. Transcriptomic analysis of the spleen showed that 13 Kyoto Encyclopedia of Genes and Genomes pathways belonging to cell cycles, immune responses, and metabolism were significantly altered by GGC implants. Connectivity Map analysis suggested that the gene signatures of GGC were similar to those of compounds that affect lipid or glucose metabolism. GeneSetTest analysis further showed that host responses to GGC implants might be related to disease states, especially metabolic and cardiovascular diseases. In conclusion, our data provide the concept of a molecular imaging-guided transcriptomic platform for the evaluation and prediction of host-biomaterial interaction in vivo.
Necklace: combining reference and assembled transcriptomes for more comprehensive RNA-Seq analysis.
Davidson, Nadia M; Oshlack, Alicia
2018-05-01
RNA sequencing (RNA-seq) analyses can benefit from performing a genome-guided and de novo assembly, in particular for species where the reference genome or the annotation is incomplete. However, tools for integrating an assembled transcriptome with reference annotation are lacking. Necklace is a software pipeline that runs genome-guided and de novo assembly and combines the resulting transcriptomes with reference genome annotations. Necklace constructs a compact but comprehensive superTranscriptome out of the assembled and reference data. Reads are subsequently aligned and counted in preparation for differential expression testing. Necklace allows a comprehensive transcriptome to be built from a combination of assembled and annotated transcripts, which results in a more comprehensive transcriptome for the majority of organisms. In addition RNA-seq data are mapped back to this newly created superTranscript reference to enable differential expression testing with standard methods.
Perron, Gabrielle; Jandaghi, Pouria; Solanki, Shraddha; Safisamghabadi, Maryam; Storoz, Cristina; Karimzadeh, Mehran; Papadakis, Andreas I; Arseneault, Madeleine; Scelo, Ghislaine; Banks, Rosamonde E; Tost, Jorg; Lathrop, Mark; Tanguay, Simon; Brazma, Alvis; Huang, Sidong; Brimo, Fadi; Najafabadi, Hamed S; Riazalhosseini, Yasser
2018-05-08
Widespread remodeling of the transcriptome is a signature of cancer; however, little is known about the post-transcriptional regulatory factors, including RNA-binding proteins (RBPs) that regulate mRNA stability, and the extent to which RBPs contribute to cancer-associated pathways. Here, by modeling the global change in gene expression based on the effect of sequence-specific RBPs on mRNA stability, we show that RBP-mediated stability programs are recurrently deregulated in cancerous tissues. Particularly, we uncovered several RBPs that contribute to the abnormal transcriptome of renal cell carcinoma (RCC), including PCBP2, ESRP2, and MBNL2. Modulation of these proteins in cancer cell lines alters the expression of pathways that are central to the disease and highlights RBPs as driving master regulators of RCC transcriptome. This study presents a framework for the screening of RBP activities based on computational modeling of mRNA stability programs in cancer and highlights the role of post-transcriptional gene dysregulation in RCC. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
A transcriptome atlas of rabbit revealed by PacBio single-molecule long-read sequencing.
Chen, Shi-Yi; Deng, Feilong; Jia, Xianbo; Li, Cao; Lai, Song-Jia
2017-08-09
It is widely acknowledged that transcriptional diversity contributes substantially to biological regulation in eukaryotes. Since the advent of second-generation sequencing technologies, a large number of RNA sequencing studies have considerably improved our understanding of transcriptome complexity. However, obtaining full-length transcripts remains a major challenge because of the difficulties of short read-based assembly. In the present study we employ PacBio single-molecule long-read sequencing technology for whole-transcriptome profiling in rabbit (Oryctolagus cuniculus). In total, we obtain 36,186 high-confidence transcripts from 14,474 genic loci, among which more than 23% of genic loci and 66% of isoforms have not yet been annotated within the current reference genome. Furthermore, about 17% of transcripts are computationally predicted to be non-coding RNAs. Up to 24,797 alternative splicing (AS) and 11,184 alternative polyadenylation (APA) events are detected within this de novo constructed transcriptome. The results provide a comprehensive set of reference transcripts and hence contribute to the improved annotation of the rabbit genome.
Yao, Peng; Potdar, Alka A.; Arif, Abul; Ray, Partho Sarothi; Mukhopadhyay, Rupak; Willard, Belinda; Xu, Yichi; Yan, Jun; Saidel, Gerald M.; Fox, Paul L.
2012-01-01
SUMMARY Post-transcriptional regulatory mechanisms superimpose “fine-tuning” control upon “on-off” switches characteristic of gene transcription. We have exploited computational modeling with experimental validation to resolve an anomalous relationship between mRNA expression and protein synthesis. Differential GAIT (Gamma-interferon Activated Inhibitor of Translation) complex activation repressed VEGF-A synthesis to a low, constant rate despite high, variable VEGFA mRNA expression. Dynamic model simulations indicated the presence of an unidentified, inhibitory GAIT element-interacting factor. We discovered a truncated form of glutamyl-prolyl tRNA synthetase (EPRS), the GAIT constituent that binds the 3’-UTR GAIT element in target transcripts. The truncated protein, EPRSN1, prevents binding of functional GAIT complex. EPRSN1 mRNA is generated by a remarkable polyadenylation-directed conversion of a Tyr codon in the EPRS coding sequence to a stop codon (PAY*). By low-level protection of GAIT element-bearing transcripts, EPRSN1 imposes a robust “translational trickle” of target protein expression. Genome-wide analysis shows PAY* generates multiple truncated transcripts thereby contributing to transcriptome expansion. PMID:22386318
Stempler, Shiri; Yizhak, Keren; Ruppin, Eytan
2014-01-01
Accumulating evidence links numerous abnormalities in cerebral metabolism with the progression of Alzheimer's disease (AD), beginning in its early stages. Here, we integrate transcriptomic data from AD patients with a genome-scale computational human metabolic model to characterize the altered metabolism in AD, and employ state-of-the-art metabolic modelling methods to predict metabolic biomarkers and drug targets in AD. The metabolic descriptions derived are first tested and validated on a large scale versus existing AD proteomics and metabolomics data. Our analysis shows a significant decrease in the activity of several key metabolic pathways, including the carnitine shuttle, folate metabolism and mitochondrial transport. We predict several metabolic biomarkers of AD progression in the blood and the CSF, including succinate and prostaglandin D2. Vitamin D and steroid metabolism pathways are enriched with predicted drug targets that could mitigate the metabolic alterations observed. Taken together, this study provides the first network wide view of the metabolic alterations associated with AD progression. Most importantly, it offers a cohort of new metabolic leads for the diagnosis of AD and its treatment. PMID:25127241
Celedon, Jose M; Yuen, Macaire M S; Chiang, Angela; Henderson, Hannah; Reid, Karen E; Bohlmann, Jörg
2017-11-01
Plant defenses often involve specialized cells and tissues. In conifers, specialized cells of the bark are important for defense against insects and pathogens. Using laser microdissection, we characterized the transcriptomes of cortical resin duct cells, phenolic cells and phloem of white spruce (Picea glauca) bark under constitutive and methyl jasmonate (MeJa)-induced conditions, and we compared these transcriptomes with the transcriptome of the bark tissue complex. Overall, ~3700 bark transcripts were differentially expressed in response to MeJa. Approximately 25% of transcripts were expressed in only one cell type, revealing cell specialization at the transcriptome level. MeJa caused cell-type-specific transcriptome responses and changed the overall patterns of cell-type-specific transcript accumulation. Comparison of transcriptomes of the conifer bark tissue complex and specialized cells resolved a masking effect inherent to transcriptome analysis of complex tissues, and showed the actual cell-type-specific transcriptome signatures. Characterization of cell-type-specific transcriptomes is critical to reveal the dynamic patterns of spatial and temporal display of constitutive and induced defense systems in a complex plant tissue or organ. This was demonstrated with the improved resolution of spatially restricted expression of sets of genes of secondary metabolism in the specialized cell types. © 2017 The Authors The Plant Journal published by John Wiley & Sons Ltd and Society for Experimental Biology.
Cheng, Yunqing; Liu, Jianfeng; Zhang, Huidi; Wang, Ju; Zhao, Yixin; Geng, Wanting
2015-01-01
A high ratio of blank fruit in hazelnut (Corylus heterophylla Fisch) is a very common phenomenon that causes serious yield losses in northeast China. The development of blank fruit in the Corylus genus is known to be associated with embryo abortion. However, little is known about the molecular mechanisms responsible for embryo abortion during the nut development stage. Genomic information for C. heterophylla Fisch is not available; therefore, data related to transcriptome and gene expression profiling of developing and abortive ovules are needed. In this study, de novo transcriptome sequencing and RNA-seq analysis were conducted using short-read sequencing technology (Illumina HiSeq 2000). The results of the transcriptome assembly analysis revealed genetic information that was associated with the fruit development stage. Two digital gene expression libraries were constructed, one for a full (normally developing) ovule and one for an empty (abortive) ovule. Transcriptome sequencing and assembly results revealed 55,353 unigenes, including 18,751 clusters and 36,602 singletons. These results were annotated using the public databases NR, NT, Swiss-Prot, KEGG, COG, and GO. Using digital gene expression profiling, gene expression differences in developing and abortive ovules were identified. A total of 1,637 and 715 unigenes were significantly upregulated and downregulated, respectively, in abortive ovules, compared with developing ovules. Quantitative real-time polymerase chain reaction analysis was used in order to verify the differential expression of some genes. The transcriptome and digital gene expression profiling data of normally developing and abortive ovules in hazelnut provide exhaustive information that will improve our understanding of the molecular mechanisms of abortive ovule formation in hazelnut.
Cho, Byuri Angela; Yoo, Seong-Keun; Song, Young Shin; Kim, Su-jin; Lee, Kyu Eun; Shong, Minho
2018-01-01
Background: Elucidating aging-related transcriptomic changes in human organs is necessary to understand the physiology and mechanisms of aging, but little is known regarding the thyroid gland. We investigated aging-related transcriptomic alterations in the human thyroid gland and characterized the related molecular functions. Methods: Publicly available RNA sequencing data of 322 thyroid tissue samples from the Genotype-Tissue Expression project were analyzed. In addition, our own RNA sequencing data from 64 normal thyroid tissue samples were used as a validation set. To comprehensively evaluate the associations between aging and transcriptomic changes, we performed a weighted gene coexpression network analysis and pathway enrichment analysis. The thyroid differentiation score was then used for further analysis, defining the correlations between thyroid differentiation and aging. Results: The most significant aging-related transcriptomic change in thyroid was the downregulation of genes related to the mitochondrial and proteasomal functions (p = 3 × 10^-6). Moreover, genes that are associated with immune processes were significantly upregulated with age (p = 3 × 10^-4), and all of them overlapped with the upregulated genes in the thyroid glands affected by lymphocytic thyroiditis. Furthermore, these aging-related changes were not significantly different according to sex, but in terms of the thyroid differentiation, females were more susceptible to aging-related changes (p for trend = 0.03). Conclusions: Aging-related transcriptomic changes in the thyroid gland were associated with mitochondrial and proteasomal dysfunction, loss of differentiation, and activation of autoimmune processes. Our results provide clues to better understanding the age-related decline in thyroid function and higher susceptibility to autoimmune thyroid disease. PMID:29652618
Impact of Transcriptomics on Our Understanding of Pulmonary Fibrosis
Vukmirovic, Milica; Kaminski, Naftali
2018-01-01
Idiopathic pulmonary fibrosis (IPF) is a lethal fibrotic lung disease characterized by aberrant remodeling of the lung parenchyma with extensive changes to the phenotypes of all lung resident cells. The introduction of transcriptomics, genome scale profiling of thousands of RNA transcripts, caused a significant inversion in IPF research. Instead of generating hypotheses based on animal models of disease, or biological plausibility, with limited validation in humans, investigators were able to generate hypotheses based on unbiased molecular analysis of human samples and then use animal models of disease to test their hypotheses. In this review, we describe the insights made from transcriptomic analysis of human IPF samples. We describe how transcriptomic studies led to identification of novel genes and pathways involved in the human IPF lung such as: matrix metalloproteinases, WNT pathway, epithelial genes, role of microRNAs among others, as well as conceptual insights such as the involvement of developmental pathways and deep shifts in epithelial and fibroblast phenotypes. The impact of lung and transcriptomic studies on disease classification, endotype discovery, and reproducible biomarkers is also described in detail. Despite these impressive achievements, the impact of transcriptomic studies has been limited because they analyzed bulk tissue and did not address the cellular and spatial heterogeneity of the IPF lung. We discuss new emerging technologies and applications, such as single-cell RNAseq and microenvironment analysis that may address cellular and spatial heterogeneity. We end by making the point that most current tissue collections and resources are not amenable to analysis using the novel technologies. To take advantage of the new opportunities, we need new efforts of sample collections, this time focused on access to all the microenvironments and cells in the IPF lung. PMID:29670881
Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie; Zhang, Gong
2018-01-04
Translation is a key regulatory step, linking transcriptome and proteome. Two major methods of translatome investigations are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database TranslatomeDB (http://www.translatomedb.net/) which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes the analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of the same species and type, on both the transcriptome and translatome levels. Translation indices (translation ratio, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity. All datasets were analyzed using a unified, robust, accurate and experimentally verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and use the identical unified pipeline to analyze their data. We believe that TranslatomeDB is a comprehensive platform and knowledgebase for translatome and proteome research, freeing biologists from complex searching, analysis and comparison of huge sequencing datasets without needing local computational power. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
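Two of the translation indices named above are commonly computed as simple per-gene abundance ratios, and a minimal sketch follows. It assumes the widely used definitions (translation ratio as translating-mRNA abundance from RNC-seq divided by total mRNA abundance; translational efficiency as ribosome footprint abundance from Ribo-seq divided by mRNA abundance); whether TranslatomeDB uses exactly these formulas is an assumption, the elongation velocity index is not covered, and the gene names and abundance values are invented.

```python
def abundance_ratio(numerator, denominator):
    """Per-gene ratio of two abundance tables (e.g. in RPKM).

    Genes missing from either table, or with zero denominator, map to None.
    """
    ratios = {}
    for gene in sorted(set(numerator) | set(denominator)):
        num = numerator.get(gene)
        den = denominator.get(gene)
        if num is None or den is None or den == 0:
            ratios[gene] = None
        else:
            ratios[gene] = num / den
    return ratios


# Invented abundances (RPKM) for three hypothetical genes.
mrna = {"gene1": 100.0, "gene2": 50.0, "gene3": 0.0}
rnc_mrna = {"gene1": 80.0, "gene2": 10.0, "gene3": 5.0}    # RNC-seq
footprints = {"gene1": 200.0, "gene2": 25.0}               # Ribo-seq

translation_ratio = abundance_ratio(rnc_mrna, mrna)
translational_efficiency = abundance_ratio(footprints, mrna)
print(translation_ratio)
print(translational_efficiency)
```

Returning None for undefined ratios, rather than adding a pseudocount, keeps the sketch neutral about how lowly expressed genes should be handled.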
Phelix, C F; Feltus, F A
2015-01-01
Measuring biomarkers from plant tissue samples is challenging and expensive when the desire is to integrate transcriptomics, fluxomics, metabolomics, lipidomics, proteomics, physiomics and phenomics. We present a computational biology method where only the transcriptome needs to be measured and is used to derive a set of parameters for deterministic kinetic models of metabolic pathways. The technology is called Transcriptome-To-Metabolome (TTM) biosimulations, currently under commercial development, but available for non-commercial use by researchers. The simulated results on metabolites of 30 primary and secondary metabolic pathways in rice (Oryza sativa) were used as the biomarkers to predict whether the transcriptome was from a plant that had been under drought conditions. The rice transcriptomes were accessed from public archives and each individual plant was simulated. This unique quality of the TTM technology allows standard analyses on biomarker assessments, i.e. sensitivity, specificity, positive and negative predictive values, accuracy, receiver operator characteristics (ROC) curve and area under the ROC curve (AUC). Two validation methods were also used, the holdout and 10-fold cross validations. Initially 17 metabolites were identified as candidate biomarkers based on either statistical significance on binary phenotype when compared with control samples or recognition from the literature. The top three biomarkers based on AUC were gibberellic acid 12 (0.89), trehalose (0.80) and sn1-palmitate-sn2-oleic-phosphatidylglycerol (0.70). Neither heat map analyses of transcriptomes nor all 300 metabolites clustered the stressed and control groups effectively. The TTM technology allows the emergent properties of the integrated system to generate unique and useful 'Omics' information. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
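The standard biomarker assessments listed above (sensitivity, specificity, positive and negative predictive values, accuracy and AUC) can be computed from per-plant scores and binary labels as in the sketch below. The scores, labels and threshold are invented for illustration and are not taken from the study; the AUC is computed by the pairwise-ranking (Mann-Whitney) formulation.

```python
def confusion_metrics(scores, labels, threshold):
    """Classify score >= threshold as positive and return standard metrics."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(labels),
    }


def auc(scores, labels):
    """Area under the ROC curve: probability that a random positive
    outscores a random negative (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: simulated metabolite level per plant; 1 = drought, 0 = control.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

metrics = confusion_metrics(scores, labels, threshold=0.5)
area = auc(scores, labels)
print(metrics, area)
```

The metric functions assume that both classes are present and that the threshold yields at least one positive and one negative call; otherwise a division by zero would occur.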
Epigenetic transgenerational inheritance of somatic transcriptomes and epigenetic control regions
2012-01-01
Background Environmentally induced epigenetic transgenerational inheritance of adult onset disease involves a variety of phenotypic changes, suggesting a general alteration in genome activity. Results Investigation of different tissue transcriptomes in male and female F3 generation vinclozolin versus control lineage rats demonstrated all tissues examined had transgenerational transcriptomes. The microarrays from 11 different tissues were compared with a gene bionetwork analysis. Although each tissue transgenerational transcriptome was unique, common cellular pathways and processes were identified between the tissues. A cluster analysis identified gene modules with coordinated gene expression and each had unique gene networks regulating tissue-specific gene expression and function. A large number of statistically significant over-represented clusters of genes were identified in the genome for both males and females. These gene clusters ranged from 2-5 megabases in size, and a number of them corresponded to the epimutations previously identified in sperm that transmit the epigenetic transgenerational inheritance of disease phenotypes. Conclusions Combined observations demonstrate that all tissues derived from the epigenetically altered germ line develop transgenerational transcriptomes unique to the tissue, but common epigenetic control regions in the genome may coordinately regulate these tissue-specific transcriptomes. This systems biology approach provides insight into the molecular mechanisms involved in the epigenetic transgenerational inheritance of a variety of adult onset disease phenotypes. PMID:23034163
Global Transcriptome Analysis of Staphylococcus aureus Response to Hydrogen Peroxide†
Chang, Wook; Small, David A.; Toghrol, Freshteh; Bentley, William E.
2006-01-01
Staphylococcus aureus responds with protective strategies against phagocyte-derived reactive oxidants to infect humans. Herein, we report the transcriptome analysis of the cellular response of S. aureus to hydrogen peroxide-induced oxidative stress. The data indicate that the oxidative response includes the induction of genes involved in virulence, DNA repair, and notably, anaerobic metabolism. PMID:16452450
Mallik, Saurav; Zhao, Zhongming
2017-12-28
For transcriptomic analysis, there are numerous microarray-based genomic datasets, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it has advantages in finding cause-effect relationships between the transcripts. In this work, we introduce two new rule-based similarity measures, the weighted rank-based Jaccard and Cosine measures, and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers, which consists of both singular and complex markers, depends on the corresponding condensed gene sets in either the antecedent or the consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
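The abstract names weighted rank-based Jaccard and Cosine measures but does not define them, so the following is only a minimal sketch of the general idea under stated assumptions: each rule is represented as a mapping from gene to weight, and the hypothetical helper `rank_weights` assigns weight 1/rank to each gene. The gene symbols are illustrative placeholders, and the paper's actual weighting scheme may differ.

```python
import math


def rank_weights(ranked_genes):
    """Hypothetical weighting: the top-ranked gene gets 1, the next 1/2, then 1/3, ..."""
    return {gene: 1.0 / (i + 1) for i, gene in enumerate(ranked_genes)}


def weighted_jaccard(a, b):
    """Sum of per-gene minimum weights divided by sum of per-gene maximum weights."""
    genes = set(a) | set(b)
    num = sum(min(a.get(g, 0.0), b.get(g, 0.0)) for g in genes)
    den = sum(max(a.get(g, 0.0), b.get(g, 0.0)) for g in genes)
    return num / den if den else 0.0


def weighted_cosine(a, b):
    """Cosine of the angle between the two weight vectors over the union of genes."""
    genes = set(a) | set(b)
    dot = sum(a.get(g, 0.0) * b.get(g, 0.0) for g in genes)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Two illustrative rules, each given as a ranked gene list (placeholder symbols).
rule_a = rank_weights(["TP53", "EGFR", "KRAS"])
rule_b = rank_weights(["EGFR", "TP53", "MYC"])

jaccard_score = weighted_jaccard(rule_a, rule_b)
cosine_score = weighted_cosine(rule_a, rule_b)
```

Pairwise scores of this kind could then be combined and passed to a clustering step, which is the role the abstract assigns to the integrated similarity scores.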
Transcriptomic Analysis of Phenotypic Changes in Birch (Betula platyphylla) Autotetraploids
Mu, Huai-Zhi; Liu, Zi-Jia; Lin, Lin; Li, Hui-Yu; Jiang, Jing; Liu, Gui-Feng
2012-01-01
Plant breeders have focused much attention on polyploid trees because of their importance to forestry. To evaluate the impact of intraspecies genome duplication on the transcriptome, a series of Betula platyphylla autotetraploids and diploids were generated from four full-sib families. The phenotypes and transcriptomes of these autotetraploid individuals were compared with those of diploid trees. Autotetraploids were generally superior in breast-height diameter, volume, leaf, fruit and stoma and were generally inferior in height compared to diploids. Transcriptome data revealed numerous changes in gene expression attributable to autotetraploidization, which resulted in the upregulation of 7052 unigenes and the downregulation of 3658 unigenes. Pathway analysis revealed that the biosynthesis and signal transduction of indoleacetate (IAA) and ethylene were altered after genome duplication, which may have contributed to phenotypic changes. These results shed light on variations in birch autotetraploidization and help identify important genes for the genetic engineering of birch trees. PMID:23202935
De novo Assembly and Analysis of the Chilean Pencil Catfish Trichomycterus areolatus Transcriptome
Schulze, Thomas T.; Ali, Jonathan M.; Bartlett, Maggie L.; McFarland, Madalyn M.; Clement, Emalie J.; Won, Harim I.; Sanford, Austin G.; Monzingo, Elyssa B.; Martens, Matthew C.; Hemsley, Ryan M.; Kumar, Sidharta; Gouin, Nicolas; Kolok, Alan S.; Davis, Paul H.
2016-01-01
Trichomycterus areolatus is an endemic species of pencil catfish that inhabits the riffles and rapids of many freshwater ecosystems of Chile. Despite its unique adaptation to Chile's high gradient watersheds and therefore potential application in the investigation of ecosystem integrity and environmental contamination, relatively little is known regarding the molecular biology of this environmental sentinel. Here, we detail the assembly of the Trichomycterus areolatus transcriptome, a molecular resource for the study of this organism and its molecular response to the environment. RNA-Seq reads were obtained by next-generation sequencing with an Illumina® platform and processed using PRINSEQ. The transcriptome assembly was performed using the Trinity assembler. Transcriptome validation was performed by functional characterization with KOG, KEGG, and GO analyses. Additionally, differential expression analysis highlights sex-specific expression patterns, and a list of endocrine and oxidative stress related transcripts is included. PMID:27672404
Transcriptome analysis by strand-specific sequencing of complementary DNA
Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey
2009-01-01
High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online. PMID:19620212
Transcriptome analysis by strand-specific sequencing of complementary DNA.
Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey
2009-10-01
High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online.
Transcriptome-enabled marker discovery and mapping of plastochron-related genes in Petunia spp.
Guo, Yufang; Wiegert-Rininger, Krystle E; Vallejo, Veronica A; Barry, Cornelius S; Warner, Ryan M
2015-09-24
Petunia (Petunia × hybrida), derived from a hybrid between P. axillaris and P. integrifolia, is one of the most economically important bedding plant crops and Petunia spp. serve as model systems for investigating the mechanisms underlying diverse mating systems and pollination syndromes. In addition, we have previously described genetic variation and quantitative trait loci (QTL) related to petunia development rate and morphology, which represent important breeding targets for the floriculture industry to improve crop production and performance. Despite the importance of petunia as a crop, the floriculture industry has been slow to adopt marker assisted selection to facilitate breeding strategies and there remains a limited availability of sequences and molecular markers from the genus compared to other economically important members of the Solanaceae family such as tomato, potato and pepper. Here we report the de novo assembly, annotation and characterization of transcriptomes from P. axillaris, P. exserta and P. integrifolia. Each transcriptome assembly was derived from five tissue libraries (callus, 3-week old seedlings, shoot apices, flowers of mixed developmental stages, and trichomes). A total of 74,573, 54,913, and 104,739 assembled transcripts were recovered from P. axillaris, P. exserta and P. integrifolia, respectively and following removal of multiple isoforms, 32,994 P. axillaris, 30,225 P. exserta, and 33,540 P. integrifolia high quality representative transcripts were extracted for annotation and expression analysis. The transcriptome data was mined for single nucleotide polymorphisms (SNP) and simple sequence repeat (SSR) markers, yielding 89,007 high quality SNPs and 2949 SSRs, respectively. 15,701 SNPs were computationally converted into user-friendly cleaved amplified polymorphic sequence (CAPS) markers and a subset of SNP and CAPS markers were experimentally verified. CAPS markers developed from plastochron-related homologous transcripts from P. axillaris were mapped in an interspecific Petunia population and evaluated for co-localization with QTL for development rate. The high quality of the three Petunia spp. transcriptomes coupled with the utility of the SNP data will serve as a resource for further exploration of genetic diversity within the genus and will facilitate efforts to develop genetic and physical maps to aid the identification of QTL associated with traits of interest.
Todd, Shawn; Boyd, Victoria; Tachedjian, Mary; Klein, Reuben; Shiell, Brian; Dearnley, Megan; McAuley, Alexander J.; Woon, Amanda P.; Purcell, Anthony W.; Marsh, Glenn A.; Baker, Michelle L.
2017-01-01
ABSTRACT Ebolavirus and Marburgvirus comprise two genera of negative-sense single-stranded RNA viruses that cause severe hemorrhagic fevers in humans. Despite considerable research efforts, the molecular events following Ebola virus (EBOV) infection are poorly understood. With the view of identifying host factors that underpin EBOV pathogenesis, we compared the transcriptomes of EBOV-infected human, pig, and bat kidney cells using a transcriptome sequencing (RNA-seq) approach. Despite a significant difference in viral transcription/replication between the cell lines, all cells responded to EBOV infection through a robust induction of extracellular growth factors. Furthermore, a significant upregulation of activator protein 1 (AP1) transcription factor complex members FOS and JUN was observed in permissive cell lines. Functional studies focusing on human cells showed that EBOV infection induces protein expression, phosphorylation, and nuclear accumulation of JUN and, to a lesser degree, FOS. Using a luciferase-based reporter, we show that EBOV infection induces AP1 transactivation activity within human cells at 48 and 72 h postinfection. Finally, we show that JUN knockdown decreases EBOV-induced host gene expression. Taken together, our study highlights the role of AP1 in promoting the host gene expression profile that defines EBOV pathogenesis. IMPORTANCE Many questions remain about the molecular events that underpin filovirus pathophysiology. The rational design of new intervention strategies, such as postexposure therapeutics, will be significantly enhanced through an in-depth understanding of these molecular events. We believe that new insights into the molecular pathogenesis of EBOV may be possible by examining the transcriptomic response of taxonomically diverse cell lines (derived from human, pig, and bat). We first identified the responsive pathways using an RNA-seq-based transcriptomics approach. 
Further functional and computational analysis focusing on human cells highlighted an important role for the AP1 transcription factor in mediating the transcriptional response to EBOV infection. Our study sheds new light on how host transcription factors respond to and promote the transcriptional landscape that follows viral infection. PMID:28931675
APAtrap: identification and quantification of alternative polyadenylation sites from RNA-seq data.
Ye, Congting; Long, Yuqi; Ji, Guoli; Li, Qingshun Quinn; Wu, Xiaohui
2018-06-01
Alternative polyadenylation (APA) has been increasingly recognized as a crucial mechanism that contributes to transcriptome diversity and gene expression regulation. As RNA-seq has become a routine protocol for transcriptome analysis, it is of great interest to leverage such an unprecedented collection of RNA-seq data with new computational methods to extract and quantify APA dynamics in these transcriptomes. However, research progress in this area has been relatively limited. Conventional methods rely on either transcript assembly to determine transcript 3' ends or annotated poly(A) sites. Moreover, they can neither identify more than two poly(A) sites in a gene nor detect dynamic APA site usage considering more than two poly(A) sites. We developed an approach called APAtrap based on the mean squared error model to identify and quantify APA sites from RNA-seq data. APAtrap is capable of identifying novel 3' UTRs and 3' UTR extensions, which contributes to locating potential poly(A) sites in previously overlooked regions and improving genome annotations. APAtrap also aims to tally all potential poly(A) sites and detect genes with differential APA site usage between conditions. Extensive comparisons of APAtrap with two other recent methods, ChangePoint and DaPars, using various RNA-seq datasets from simulation studies, human and Arabidopsis demonstrate the efficacy and flexibility of APAtrap for any organism with an annotated genome. APAtrap is freely available for download at https://apatrap.sourceforge.io. Contact: liqq@xmu.edu.cn or xhuister@xmu.edu.cn. Supplementary data are available at Bioinformatics online.
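The abstract says only that APAtrap is "based on the mean squared error model". As a rough illustration of how a mean squared error criterion can locate a drop in read coverage along a 3' UTR, here is a minimal sketch that fits a two-level step function to a coverage vector and keeps the split with the lowest MSE. The function name `best_change_point` and the coverage values are assumptions for illustration; this is not APAtrap's implementation, which also handles more than two poly(A) sites.

```python
def best_change_point(coverage):
    """Return (index, mse) for the split of `coverage` into coverage[:index] and
    coverage[index:] that minimizes the MSE of a two-level step-function fit."""
    n = len(coverage)
    if n < 2:
        raise ValueError("need at least two positions to place a change point")
    best_index, best_mse = None, float("inf")
    for i in range(1, n):
        left, right = coverage[:i], coverage[i:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        # Sum of squared residuals around each segment's own mean.
        sse = sum((x - mean_left) ** 2 for x in left)
        sse += sum((x - mean_right) ** 2 for x in right)
        mse = sse / n
        if mse < best_mse:
            best_index, best_mse = i, mse
    return best_index, best_mse


# Illustrative per-position read coverage with a drop after the fourth position.
coverage = [50, 52, 49, 51, 20, 19, 21, 20]
index, mse = best_change_point(coverage)
```

In this toy vector the lowest-error split falls where coverage drops from about 50 to about 20, which is the kind of position a candidate poly(A) site would occupy.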
Wu, Qing-jun; Wang, Shao-li; Yang, Xin; Yang, Ni-na; Li, Ru-mei; Jiao, Xiao-guo; Pan, Hui-peng; Liu, Bai-ming; Su, Qi; Xu, Bao-yun; Hu, Song-nian; Zhou, Xu-guo; Zhang, You-jun
2012-01-01
Background Bemisia tabaci (Gennadius) is a phloem-feeding insect poised to become one of the major insect pests in open field and greenhouse production systems throughout the world. The high level of resistance to insecticides is a main factor that hinders continued use of insecticides for suppression of B. tabaci. Despite its prevalence, little is known about B. tabaci at the genome level. To fill this gap, an invasive B. tabaci B biotype was subjected to pyrosequencing-based transcriptome analysis to identify genes and gene networks putatively involved in various physiological and toxicological processes. Methodology and Principal Findings Using Roche 454 pyrosequencing, 857,205 reads containing approximately 340 megabases were obtained from the B. tabaci transcriptome. De novo assembly generated 178,669 unigenes including 30,980 from insects, 17,881 from bacteria, and 129,808 with no hit. A total of 50,835 (28.45%) unigenes showed similarity to the non-redundant database in GenBank with a cut-off E-value of 10^-5. Among them, 40,611 unigenes were assigned to one or more GO terms and 6,917 unigenes were assigned to 288 known pathways. De novo metatranscriptome analysis revealed highly diverse bacterial symbionts in B. tabaci, and demonstrated the host-symbiont cooperation in amino acid production. In-depth transcriptome analysis identified putative molecular markers, and genes potentially involved in insecticide resistance and nutrient digestion. The utility of this transcriptome was validated by a thiamethoxam resistance study, in which annotated cytochrome P450 genes were significantly overexpressed in the resistant B. tabaci in comparison to its susceptible counterparts. Conclusions This transcriptome/metatranscriptome analysis sheds light on the molecular understanding of symbiosis and insecticide resistance in an agriculturally important phloem-feeding insect pest, and lays the foundation for future functional genomics research of the B. tabaci complex. 
Moreover, current pyrosequencing effort greatly enriched the existing whitefly EST database, and makes RNAseq a viable option for future genomic analysis. PMID:22558125
Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich
2015-12-16
Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, the Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate AmpliSeq as a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variation. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
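Three of the agreement statistics named in this abstract (Pearson's r, RMSD and the Matthews correlation coefficient) follow standard textbook definitions. The sketch below implements them in plain Python; the log2 fold-change vectors and confusion counts are made-up illustrative values, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmsd(x, y):
    """Root-mean-square deviation between paired values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0


# Illustrative log2 fold changes for the same genes on two platforms.
platform_a = [1.0, 2.0, 3.0, 4.0]
platform_b = [1.1, 1.9, 3.2, 3.8]

r_value = pearson_r(platform_a, platform_b)
rmsd_value = rmsd(platform_a, platform_b)
# Illustrative counts of genes called differentially expressed or not.
mcc_value = mcc(tp=45, tn=40, fp=10, fn=5)
```

Pearson's r and RMSD compare continuous fold changes directly, whereas the MCC needs each gene to be classified first (for example, differentially expressed or not against a reference call).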
Bioinformatics of prokaryotic RNAs
Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F
2014-01-01
The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880
Mendes, Filipa; Sieuwerts, Sander; de Hulster, Erik; Almering, Marinka J. H.; Luttik, Marijke A. H.; Pronk, Jack T.; Smid, Eddy J.; Bron, Peter A.
2013-01-01
Mixed populations of Saccharomyces cerevisiae yeasts and lactic acid bacteria occur in many dairy, food, and beverage fermentations, but knowledge about their interactions is incomplete. In the present study, interactions between Saccharomyces cerevisiae and Lactobacillus delbrueckii subsp. bulgaricus, two microorganisms that co-occur in kefir fermentations, were studied during anaerobic growth on lactose. By combining physiological and transcriptome analysis of the two strains in the cocultures, five mechanisms of interaction were identified. (i) Lb. delbrueckii subsp. bulgaricus hydrolyzes lactose, which cannot be metabolized by S. cerevisiae, to galactose and glucose. Subsequently, galactose, which cannot be metabolized by Lb. delbrueckii subsp. bulgaricus, is excreted and provides a carbon source for yeast. (ii) In pure cultures, Lb. delbrueckii subsp. bulgaricus grows only in the presence of increased CO2 concentrations. In anaerobic mixed cultures, the yeast provides this CO2 via alcoholic fermentation. (iii) Analysis of amino acid consumption from the defined medium indicated that S. cerevisiae supplied alanine to the bacterium. (iv) A mild but significant low-iron response in the yeast transcriptome, identified by DNA microarray analysis, was consistent with the chelation of iron by the lactate produced by Lb. delbrueckii subsp. bulgaricus. (v) Transcriptome analysis of Lb. delbrueckii subsp. bulgaricus in mixed cultures showed an overrepresentation of transcripts involved in lipid metabolism, suggesting either a competition of the two microorganisms for fatty acids or a response to the ethanol produced by S. cerevisiae. This study demonstrates that chemostat-based transcriptome analysis is a powerful tool to investigate microbial interactions in mixed populations. PMID:23872557
ZHANG, YAFANG; CROFTON, ELIZABETH J.; FAN, XIUZHEN; LI, DINGGE; KONG, FANPING; SINHA, MALA; LUXON, BRUCE A.; SPRATT, HEIDI M.; LICHTI, CHERYL F.; GREEN, THOMAS A.
2016-01-01
Transcriptomic and proteomic approaches have separately proven effective at identifying novel mechanisms affecting addiction-related behavior; however, it is difficult to prioritize the many promising leads from each approach. A convergent secondary analysis of proteomic and transcriptomic results can glean additional information to help prioritize promising leads. The current study is a secondary analysis of the convergence of recently published separate transcriptomic and proteomic analyses of nucleus accumbens (NAc) tissue from rats subjected to environmental enrichment vs. isolation and cocaine self-administration vs. saline. Multiple bioinformatics approaches (e.g. Gene Ontology (GO) analysis, Ingenuity Pathway Analysis (IPA), and Gene Set Enrichment Analysis (GSEA)) were used to interrogate these rich data sets. Although there was little correspondence between mRNA vs. protein at the individual target level, good correspondence was found at the level of gene/protein sets, particularly for the environmental enrichment manipulation. These data identify gene sets where there is a positive relationship between changes in mRNA and protein (e.g. glycolysis, ATP synthesis, translation elongation factor activity, etc.) and gene sets where there is an inverse relationship (e.g. ribosomes, Rho GTPase signaling, protein ubiquitination, etc.). Overall environmental enrichment produced better correspondence than cocaine self-administration. The individual targets contributing to mRNA and protein effects were largely not overlapping. As a whole, these results confirm that robust transcriptomic and proteomic data sets can provide similar results at the gene/protein set level even when there is little correspondence at the individual target level and little overlap in the targets contributing to the effects. PMID:27717806
Mendes, Filipa; Sieuwerts, Sander; de Hulster, Erik; Almering, Marinka J H; Luttik, Marijke A H; Pronk, Jack T; Smid, Eddy J; Bron, Peter A; Daran-Lapujade, Pascale
2013-10-01
Mixed populations of Saccharomyces cerevisiae yeasts and lactic acid bacteria occur in many dairy, food, and beverage fermentations, but knowledge about their interactions is incomplete. In the present study, interactions between Saccharomyces cerevisiae and Lactobacillus delbrueckii subsp. bulgaricus, two microorganisms that co-occur in kefir fermentations, were studied during anaerobic growth on lactose. By combining physiological and transcriptome analysis of the two strains in the cocultures, five mechanisms of interaction were identified. (i) Lb. delbrueckii subsp. bulgaricus hydrolyzes lactose, which cannot be metabolized by S. cerevisiae, to galactose and glucose. Subsequently, galactose, which cannot be metabolized by Lb. delbrueckii subsp. bulgaricus, is excreted and provides a carbon source for yeast. (ii) In pure cultures, Lb. delbrueckii subsp. bulgaricus grows only in the presence of increased CO2 concentrations. In anaerobic mixed cultures, the yeast provides this CO2 via alcoholic fermentation. (iii) Analysis of amino acid consumption from the defined medium indicated that S. cerevisiae supplied alanine to the bacterium. (iv) A mild but significant low-iron response in the yeast transcriptome, identified by DNA microarray analysis, was consistent with the chelation of iron by the lactate produced by Lb. delbrueckii subsp. bulgaricus. (v) Transcriptome analysis of Lb. delbrueckii subsp. bulgaricus in mixed cultures showed an overrepresentation of transcripts involved in lipid metabolism, suggesting either a competition of the two microorganisms for fatty acids or a response to the ethanol produced by S. cerevisiae. This study demonstrates that chemostat-based transcriptome analysis is a powerful tool to investigate microbial interactions in mixed populations.
Microfluidic single-cell whole-transcriptome sequencing.
Streets, Aaron M; Zhang, Xiannian; Cao, Chen; Pang, Yuhong; Wu, Xinglong; Xiong, Liang; Yang, Lu; Fu, Yusi; Zhao, Liang; Tang, Fuchou; Huang, Yanyi
2014-05-13
Single-cell whole-transcriptome analysis is a powerful tool for quantifying gene expression heterogeneity in populations of cells. Many techniques have, thus, been recently developed to perform transcriptome sequencing (RNA-Seq) on individual cells. To probe subtle biological variation between samples with limiting amounts of RNA, more precise and sensitive methods are still required. We adapted a previously developed strategy for single-cell RNA-Seq that has shown promise for superior sensitivity and implemented the chemistry in a microfluidic platform for single-cell whole-transcriptome analysis. In this approach, single cells are captured and lysed in a microfluidic device, where mRNAs with poly(A) tails are reverse-transcribed into cDNA. Double-stranded cDNA is then collected and sequenced using a next generation sequencing platform. We prepared 94 libraries consisting of single mouse embryonic cells and technical replicates of extracted RNA and thoroughly characterized the performance of this technology. Microfluidic implementation increased mRNA detection sensitivity as well as improved measurement precision compared with tube-based protocols. With 0.2 M reads per cell, we were able to reconstruct a majority of the bulk transcriptome with 10 single cells. We also quantified variation between and within different types of mouse embryonic cells and found that enhanced measurement precision, detection sensitivity, and experimental throughput aided the distinction between biological variability and technical noise. With this work, we validated the advantages of an early approach to single-cell RNA-Seq and showed that the benefits of combining microfluidic technology with high-throughput sequencing will be valuable for large-scale efforts in single-cell transcriptome analysis.
Zeng, Fansuo; Sun, Fengkun; Li, Leilei; Liu, Kun; Zhan, Yaguang
2014-01-01
Evidence supporting nitric oxide (NO) as a mediator of plant biochemistry continues to grow, but its functions at the molecular level remain poorly understood and, in some cases, controversial. To study the role of NO at the transcriptional level in Betula platyphylla cells, we conducted a genome-scale transcriptome analysis of these cells. The transcriptomes of untreated birch cells and of cells treated with sodium nitroprusside (SNP) were analyzed using Solexa sequencing. Data were collected by sequencing cDNA libraries of birch cells, which had a long period to adapt to the suspension culture conditions before SNP-treated cells and untreated cells were sampled. Among the 34,100 UniGenes detected, BLASTX search revealed that 20,631 genes showed significant (E-values ≤ 10^-5) sequence similarity with proteins from the NR-database. Numerous expressed sequence tags (i.e., 1374) were identified as differentially expressed between the 12 h SNP-treated cells and control cell samples: 403 up-regulated and 971 down-regulated. From this, we specifically examined a core set of NO-related transcripts. The altered expression levels of several transcripts, as determined by transcriptome analysis, were confirmed by qRT-PCR. The results of transcriptome analysis, gene expression quantification, triterpenoid content and defensive enzyme activities indicate that NO has a significant effect on many processes including triterpenoid production, carbohydrate metabolism and cell wall biosynthesis. PMID:25551661
2011-01-01
Background Several tools have been developed to perform global gene expression profile data analysis, to search for specific chromosomal regions whose features meet defined criteria as well as to study neighbouring gene expression. However, most of these tools are tailored for a specific use in a particular context (e.g. they are species-specific, or limited to a particular data format) and they typically accept only gene lists as input. Results TRAM (Transcriptome Mapper) is a new general tool that allows the simple generation and analysis of quantitative transcriptome maps, starting from any source listing gene expression values for a given gene set (e.g. expression microarrays), implemented as a relational database. It includes a parser able to assign univocal and updated gene symbols to gene identifiers from different data sources. Moreover, TRAM is able to perform intra-sample and inter-sample data normalization, including an original variant of quantile normalization (scaled quantile), useful to normalize data from platforms with highly different numbers of investigated genes. When in 'Map' mode, the software generates a quantitative representation of the transcriptome of a sample (or of a pool of samples) and identifies if segments of defined lengths are over/under-expressed compared to the desired threshold. When in 'Cluster' mode, the software searches for a set of over/under-expressed consecutive genes. Statistical significance for all results is calculated with respect to genes localized on the same chromosome or to all genome genes. Transcriptome maps, showing differential expression between two sample groups, relative to two different biological conditions, may be easily generated. We present the results of a biological model test, based on a meta-analysis comparison between a sample pool of human CD34+ hematopoietic progenitor cells and a sample pool of megakaryocytic cells. 
Biologically relevant chromosomal segments and gene clusters with differential expression during the differentiation toward megakaryocyte were identified. Conclusions TRAM is designed to create, and statistically analyze, quantitative transcriptome maps, based on gene expression data from multiple sources. The release includes a FileMaker Pro database management runtime application and is freely available at http://apollo11.isto.unibo.it/software/, along with preconfigured implementations for mapping of human, mouse and zebrafish transcriptomes. PMID:21333005
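The TRAM abstract mentions inter-sample normalization with "an original variant of quantile normalization (scaled quantile)" but does not describe the variant. For orientation, the sketch below shows only classic quantile normalization for samples that measure the same number of genes; the function name `quantile_normalize` and the example values are assumptions, ties are broken by position for simplicity, and the scaling step that TRAM adds for platforms with different gene counts is not reproduced.

```python
def quantile_normalize(samples):
    """Classic quantile normalization.

    `samples` is a list of equal-length lists of expression values (one list
    per sample). Each value is replaced by the mean, across samples, of the
    values sharing its rank.
    """
    n_genes = len(samples[0])
    if any(len(s) != n_genes for s in samples):
        raise ValueError("all samples must contain the same number of genes")
    sorted_samples = [sorted(s) for s in samples]
    # Mean of the k-th smallest value across all samples, for each rank k.
    rank_means = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n_genes)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n_genes), key=lambda i: s[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two illustrative samples measuring the same three genes.
sample_1 = [5.0, 2.0, 3.0]
sample_2 = [4.0, 1.0, 9.0]
norm_1, norm_2 = quantile_normalize([sample_1, sample_2])
```

After normalization both samples share the same set of values, so differences between them reflect only changes in gene rank.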
Parreira, Valeria R; Russell, Kay; Athanasiadou, Spiridoula; Prescott, John F
2016-08-12
Necrotic enteritis (NE) caused by netB-positive type A Clostridium perfringens is an important bacterial disease of poultry. Through its complex regulatory system, C. perfringens orchestrates the expression of a collection of toxins and extracellular enzymes that are crucial for the development of the disease; environmental conditions play an important role in their regulation. In this study, and for the first time, global transcriptomic analysis was performed on ligated intestinal loops in chickens colonized with a netB-positive C. perfringens strain, as well as the same strain propagated in vitro under various nutritional and environmental conditions. Analysis of the respective pathogen transcriptomes revealed up to 673 genes that were significantly expressed in vivo. Gene expression profiles in vivo were most similar to those of C. perfringens grown in nutritionally-deprived conditions. Taken together, our results suggest bacterial transcriptome responses to the early stages of adaptation to, and colonization of, the chicken intestine. Our work also reveals how netB-positive C. perfringens reacts to different environmental conditions including those in the chicken intestine.
Srivastava, Smriti; Singh, Rajesh K.; Pathak, Garima; Goel, Ridhi; Asif, Mehar Hasan; Sane, Aniruddha P.; Sane, Vidhu A.
2016-01-01
Ripening in mango is under complex control by ethylene. In an effort to understand the complex spatio-temporal control of ripening we have made use of a popular N. Indian variety, “Dashehari”. This variety ripens from the stone inside towards the peel outside and forms jelly in the pulp in ripe fruits. Through a combination of 454 and Illumina sequencing, a transcriptomic analysis of gene expression from unripe and midripe stages has been performed in triplicate. Overall 74,312 unique transcripts with ≥1 FPKM were obtained. Transcripts related to 127 pathways were identified in the “Dashehari” mango transcriptome by KEGG analysis. These pathways ranged from detoxification and ethylene biosynthesis to carbon metabolism and aromatic amino acid degradation. The transcriptome study reveals differences not only in expression of softening associated genes but also those that govern ethylene biosynthesis and other nutritional characteristics. This study could help to develop ripening related markers for selective breeding to reduce the problems of excess jelly formation during softening in the “Dashehari” variety. PMID:27586495
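The "≥1 FPKM" cutoff in the mango abstract refers to fragments per kilobase of transcript per million mapped fragments. The sketch below shows the standard FPKM formula and a threshold filter; the transcript names, counts and lengths are hypothetical and are not taken from the study.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def expressed_transcripts(counts, lengths, total_mapped_fragments, threshold=1.0):
    """Return the names of transcripts whose FPKM is at or above the threshold."""
    return [
        name
        for name, fragments in counts.items()
        if fpkm(fragments, lengths[name], total_mapped_fragments) >= threshold
    ]


# Hypothetical example: two transcripts in a library of 10 million mapped fragments.
counts = {"transcript_a": 100, "transcript_b": 5}
lengths = {"transcript_a": 2000, "transcript_b": 1000}
kept = expressed_transcripts(counts, lengths, 10_000_000)
```

Because FPKM divides by transcript length, a long transcript needs proportionally more fragments than a short one to pass the same cutoff.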
PEA: an integrated R toolkit for plant epitranscriptome analysis.
Zhai, Jingjing; Song, Jie; Cheng, Qian; Tang, Yunjia; Ma, Chuang
2018-05-29
The epitranscriptome, also known as chemical modifications of RNA (CMRs), is a newly discovered layer of gene regulation, the biological importance of which emerged through analysis of only a small fraction of CMRs detected by high-throughput sequencing technologies. Understanding of the epitranscriptome is hampered by the absence of computational tools for the systematic analysis of epitranscriptome sequencing data. In addition, no tools have yet been designed for accurate prediction of CMRs in plants, or to extend epitranscriptome analysis from a fraction of the transcriptome to its entirety. Here, we introduce PEA, an integrated R toolkit to facilitate the analysis of plant epitranscriptome data. The PEA toolkit contains a comprehensive collection of functions required for read mapping, CMR calling, motif scanning and discovery, and gene functional enrichment analysis. PEA also takes advantage of machine learning technologies for transcriptome-scale CMR prediction, with high prediction accuracy, using the Positive Samples Only Learning algorithm, which addresses the two-class classification problem by using only positive samples (CMRs), in the absence of negative samples (non-CMRs). Hence PEA is a versatile epitranscriptome analysis pipeline covering CMR calling, prediction, and annotation, and we describe its application to predict N6-methyladenosine (m6A) modifications in Arabidopsis thaliana. Experimental results demonstrate that the toolkit achieved 71.6% sensitivity and 73.7% specificity, which is superior to existing m6A predictors. PEA is potentially broadly applicable to the in-depth study of epitranscriptomics. The PEA Docker image is available at https://hub.docker.com/r/malab/pea; source code and the user manual are available at https://github.com/cma2015/PEA. Contact: chuangma2006@gmail.com. Supplementary data are available at Bioinformatics online.
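The sensitivity and specificity figures quoted for PEA follow the usual confusion-matrix definitions. The sketch below computes them in Python for illustration only (PEA itself is an R toolkit); the counts are hypothetical values chosen to reproduce the reported percentages and are not data from the paper.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real modification sites that the predictor recovers."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of unmodified sites that the predictor correctly rejects."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts for 1000 modified and 1000 unmodified test sites.
sens = sensitivity(true_positives=716, false_negatives=284)
spec = specificity(true_negatives=737, false_positives=263)
```

Reporting both values matters here because a predictor trained on positive samples only could otherwise reach high sensitivity simply by labelling most sites as modified.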
Brownian model of transcriptome evolution and phylogenetic network visualization between tissues.
Gu, Xun; Ruan, Hang; Su, Zhixi; Zou, Yangyun
2017-09-01
While phylogenetic analysis of transcriptomes of the same tissue is usually congruent with the species tree, controversy emerges when multiple tissues are included, that is, whether species from the same tissue are clustered together, or different tissues from the same species are clustered together. Recent studies have suggested that a phylogenetic network approach may shed some light on our understanding of multi-tissue transcriptome evolution; yet the underlying evolutionary mechanism remains unclear. In this paper we develop a Brownian-based model of transcriptome evolution under the phylogenetic network that can statistically distinguish between the patterns of species-clustering and tissue-clustering. Our model can be used as a null hypothesis (neutral transcriptome evolution) for testing any correlation in tissue evolution, can be applied to cancer transcriptome evolution to study whether two tumors of an individual appeared independently or via metastasis, and can be useful to detect convergent evolution at the transcriptional level. Copyright © 2017. Published by Elsevier Inc.
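As background for the Brownian model named in this abstract, the sketch below uses textbook Brownian motion on a simple tree, not the authors' phylogenetic network model, whose details are not given here. Under that textbook model the variance of a trait grows linearly with elapsed time, and the covariance between two tips is proportional to the history they share. The function and parameter names are hypothetical.

```python
def expected_squared_distance(sigma2, depth_i, depth_j, shared_depth):
    """Expected squared expression difference between two tips under Brownian motion.

    sigma2: rate of the Brownian process (variance gained per unit time).
    depth_i, depth_j: time from the root to each tip.
    shared_depth: time from the root to the most recent common ancestor.
    Uses Var(x_i) = sigma2 * depth_i and Cov(x_i, x_j) = sigma2 * shared_depth.
    """
    if shared_depth > min(depth_i, depth_j):
        raise ValueError("shared history cannot exceed either tip's depth")
    return sigma2 * (depth_i + depth_j - 2 * shared_depth)


# Hypothetical comparison: samples sharing more history are expected to be closer.
recent_split = expected_squared_distance(0.5, 10.0, 10.0, 8.0)
ancient_split = expected_squared_distance(0.5, 10.0, 10.0, 4.0)
```

In this simplified picture, whether samples cluster by species or by tissue depends on which pairs share the longer common history.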
Jiménez-Guerrero, Irene; Acosta-Jurado, Sebastián; Navarro-Gómez, Pilar; López-Baena, Francisco Javier; Ollero, Francisco Javier
2017-01-01
Simultaneous quantification of transcripts of the whole bacterial genome allows the analysis of the global transcriptional response under changing conditions. RNA-seq and microarrays are the most used techniques to measure these transcriptomic changes, and both complement each other in transcriptome profiling. In this review, we exhaustively compiled the symbiosis-related transcriptomic reports (microarrays and RNA sequencing) carried out hitherto in rhizobia. This review focuses especially on the transcriptomic changes that take place when five rhizobial species, Bradyrhizobium japonicum (=diazoefficiens) USDA 110, Rhizobium leguminosarum biovar viciae 3841, Rhizobium tropici CIAT 899, Sinorhizobium (=Ensifer) meliloti 1021 and S. fredii HH103, recognize inducing flavonoids, plant-exuded phenolic compounds that activate the biosynthesis and export of Nod factors (NF) in all analysed rhizobia. Interestingly, our global transcriptomic comparison also indicates that each rhizobial species possesses its own arsenal of molecular weapons accompanying the set of NF in order to establish a successful interaction with host legumes. PMID:29267254
Ubrihien, Rodney P; Ezaz, Tariq; Taylor, Anne M; Stevens, Mark M; Krikowa, Frank; Foster, Simon; Maher, William A
2017-04-01
This study describes the transcriptomic response of the Australian endemic freshwater gastropod Isidorella newcombi exposed to 80 ± 1 μg/L of copper for 3 days. Analyses of copper tissue concentration, lysosomal membrane destabilisation and RNA-seq were conducted. Copper tissue concentrations confirmed that copper was bioaccumulated by the snails. Increased lysosomal membrane destabilisation in the copper-exposed snails indicated that the snails were stressed as a result of the exposure. Both copper tissue concentrations and lysosomal destabilisation were significantly greater in snails exposed to copper. In order to interpret the RNA-seq data from an ecotoxicological perspective an integrated biological response model was developed that grouped transcriptomic responses into those associated with copper transport and storage, survival mechanisms and cell death. A conceptual model of expected transcriptomic changes resulting from the copper exposure was developed as a basis to assess transcriptomic responses. Transcriptomic changes were evident at all three levels of the integrated biological response model. Despite lacking statistical significance, increased expression of the gene encoding copper transporting ATPase provided an indication of increased internal transport of copper. Increased expression of genes associated with endocytosis is associated with increased transport of copper to the lysosome for storage in a detoxified form. Survival mechanisms included metabolic depression and processes associated with cellular repair and recycling. There was transcriptomic evidence of increased cell death by apoptosis in the copper-exposed organisms. Increased apoptosis is supported by the increase in lysosomal membrane destabilisation in the copper-exposed snails. Transcriptomic changes relating to apoptosis, phagocytosis, protein degradation and the lysosome were evident and these processes can be linked to the degradation of post-apoptotic debris.
The study identified contaminant specific transcriptomic markers as well as markers of general stress. From an ecotoxicological perspective, the use of a framework to group transcriptomic responses into those associated with copper transport, survival and cell death assisted with the complex process of interpretation of RNA-seq data. The broad adoption of such a framework in ecotoxicology studies would assist in comparison between studies and the identification of reliable transcriptomic markers of contaminant exposure and response. Copyright © 2017 Elsevier B.V. All rights reserved.
Rai, Amit; Yamazaki, Mami; Takahashi, Hiroki; Nakamura, Michimi; Kojoma, Mareshige; Suzuki, Hideyuki; Saito, Kazuki
2016-01-01
The Panax genus, of which Panax japonicus is an important species, has been a source of natural medicine benefitting human health over the ages. Our understanding of several key pathways and enzymes involved in the biosynthesis of ginsenosides, a pharmacologically active class of metabolites and major chemical constituents of the rhizome extracts from the Panax species, is limited. Limited genomic information and a lack of studies on comparative transcriptomics across the Panax species have restricted our understanding of the biosynthetic mechanisms of these and many other important classes of phytochemicals. Herein, we describe Illumina based RNA sequencing analysis to characterize the transcriptome and expression profiles of genes expressed in five tissues of P. japonicus, and its comparison with other Panax species. RNA sequencing and de novo transcriptome assembly for P. japonicus resulted in a total of 135,235 unigenes with 78,794 (58.24%) unigenes being annotated using the NCBI-nr database. Transcriptome profiling and gene ontology enrichment analysis for five tissues of P. japonicus showed that, although overall processes were evenly conserved across all tissues, each tissue was characterized by several unique unigenes, with the leaves showing the most unique unigenes among the tissues studied. A comparative analysis of the P. japonicus transcriptome assembly with publicly available transcripts from other Panax species, namely, P. ginseng, P. notoginseng, and P. quinquefolius, also displayed high sequence similarity across all Panax species, with P. japonicus showing highest similarity with P. ginseng. Annotation of the P. japonicus transcriptome resulted in the identification of putative genes encoding all enzymes from the triterpene backbone biosynthetic pathways, and identified 24 and 48 unigenes annotated as cytochrome P450 (CYP) and glycosyltransferases (GT), respectively.
These CYP- and GT-annotated unigenes were conserved across all Panax species and co-expressed with the other transcripts involved in the triterpenoid backbone biosynthesis pathways. Unigenes identified in this study represent strong candidates for being involved in triterpenoid saponin biosynthesis, and can serve as a basis for future validation studies. PMID:27148308
Transcriptomic analysis of Arabidopsis developing stems: a close-up on cell wall genes
Minic, Zoran; Jamet, Elisabeth; San-Clemente, Hélène; Pelletier, Sandra; Renou, Jean-Pierre; Rihouey, Christophe; Okinyo, Denis PO; Proux, Caroline; Lerouge, Patrice; Jouanin, Lise
2009-01-01
Background Different strategies (genetics, biochemistry, and proteomics) can be used to study proteins involved in cell biogenesis. The availability of the complete sequences of several plant genomes allowed the development of transcriptomic studies. Although the expression patterns of some Arabidopsis thaliana genes involved in cell wall biogenesis were identified at different physiological stages, detailed microarray analysis of plant cell wall genes has not been performed on any plant tissues. Using transcriptomic and bioinformatic tools, we studied the regulation of cell wall genes in Arabidopsis stems, i.e. genes encoding proteins involved in cell wall biogenesis and genes encoding secreted proteins. Results Transcriptomic analyses of stems were performed at three different developmental stages, i.e., young stems, intermediate stage, and mature stems. Many genes involved in the synthesis of cell wall components such as polysaccharides and monolignols were identified. A total of 345 genes encoding predicted secreted proteins with moderate or high levels of transcripts were analyzed in detail. The encoded proteins were distributed into 8 classes, based on the presence of predicted functional domains. Proteins acting on carbohydrates and proteins of unknown function constituted the two most abundant classes. Other proteins were proteases, oxido-reductases, proteins with interacting domains, proteins involved in signalling, and structural proteins. Particularly high levels of expression were established for genes encoding pectin methylesterases, germin-like proteins, arabinogalactan proteins, fasciclin-like arabinogalactan proteins, and structural proteins. Finally, the results of these transcriptomic analyses were compared with those obtained through a cell wall proteomic analysis from the same material. Only a small proportion of genes identified by previous proteomic analyses were identified by transcriptomics.
Conversely, only a few proteins encoded by genes having moderate or high level of transcripts were identified by proteomics. Conclusion Analysis of the genes predicted to encode cell wall proteins revealed that about 345 genes had moderate or high levels of transcripts. Among them, we identified many new genes possibly involved in cell wall biogenesis. The discrepancies observed between results of this transcriptomic study and a previous proteomic study on the same material revealed post-transcriptional mechanisms of regulation of expression of genes encoding cell wall proteins. PMID:19149885
Bar-Yaacov, Dan; Bouskila, Amos; Mishmar, Dan
2013-01-01
Recently, we found dramatic mitochondrial DNA divergence of Israeli Chamaeleo chamaeleon populations into two geographically distinct groups. We aimed to examine whether the same pattern of divergence could be found in nuclear genes. However, no genomic resource is available for any chameleon species. Here we present the first chameleon transcriptome, obtained using deep sequencing (SOLiD). Our analysis identified 164,000 sequence contigs of which 19,000 yielded unique BlastX hits. To test the efficacy of our sequencing effort, we examined whether the chameleon and other available reptilian transcriptomes harbored complete sets of genes comprising known biochemical pathways, focusing on the nDNA-encoded oxidative phosphorylation (OXPHOS) genes as a model. As a reference for the screen, we used the human 86 (including isoforms) known structural nDNA-encoded OXPHOS subunits. Analysis of 34 publicly available vertebrate transcriptomes revealed orthologs for most human OXPHOS genes. However, OXPHOS subunit COX8 (Cytochrome C oxidase subunit 8), including all its known isoforms, was consistently absent in transcriptomes of iguanian lizards, implying loss of this subunit during the radiation of this suborder. The lack of COX8 in the suborder Iguania is intriguing, since it is important for cellular respiration and ATP production. Our sequencing effort added a new resource for comparative genomic studies, and shed new light on the evolutionary dynamics of the OXPHOS system. PMID:24009133
Aging-like Changes in the Transcriptome of Irradiated Microglia
Li, Matthew D.; Burns, Terry C.; Kumar, Sunny; Morgan, Alexander A.; Sloan, Steven A.; Palmer, Theo D.
2014-01-01
Whole brain irradiation remains important in the management of brain tumors. Although necessary for improving survival outcomes, cranial irradiation also results in cognitive decline in long-term survivors. A chronic inflammatory state characterized by microglial activation has been implicated in radiation-induced brain injury. We here provide the first comprehensive transcriptional profile of irradiated microglia. Fluorescence-activated cell sorting (FACS) was used to isolate CD11b+ microglia from the hippocampi of C57BL/6 and Balb/c mice 1 month after 10 Gy cranial irradiation. Affymetrix gene expression profiles were evaluated using linear modeling and rank product analyses. One month after irradiation, a conserved irradiation signature across strains was identified, comprising 448 and 85 differentially up- and down-regulated genes, respectively. Gene set enrichment analysis (GSEA) demonstrated enrichment for inflammation, including M1 macrophage-associated genes, but also an unexpected enrichment for extracellular matrix and blood coagulation-related gene sets, in contrast to previously described microglial states. Weighted gene co-expression network analysis (WGCNA) confirmed these findings and further revealed alterations in mitochondrial function. The RNA-seq transcriptome of microglia 24 h post-radiation proved similar to the 1-month transcriptome, but additionally featured alterations in apoptotic and lysosomal gene expression. Re-analysis of published aging mouse microglia transcriptome data demonstrated striking similarity to the 1-month irradiated microglia transcriptome, suggesting that shared mechanisms may underlie aging and chronic irradiation-induced cognitive decline. PMID:25690519
Use of prior knowledge for the analysis of high-throughput transcriptomics and metabolomics data
2014-01-01
Background High-throughput omics technologies have enabled the measurement of many genes or metabolites simultaneously. The resulting high dimensional experimental data pose significant challenges to transcriptomics and metabolomics data analysis methods, which may lead to spurious instead of biologically relevant results. One strategy to improve the results is the incorporation of prior biological knowledge in the analysis. This strategy is used to reduce the solution space and/or to focus the analysis on biologically meaningful regions. In this article, we review a selection of these methods used in transcriptomics and metabolomics. We divide the reviewed methods into three groups based on the underlying mathematical model: exploratory methods, supervised methods and estimation of the covariance matrix. We discuss which prior knowledge has been used, how it is incorporated and how it modifies the mathematical properties of the underlying methods. PMID:25033193
Costa, Fabrizio; Alba, Rob; Schouten, Henk; Soglio, Valeria; Gianfranceschi, Luca; Serra, Sara; Musacchi, Stefano; Sansavini, Silviero; Costa, Guglielmo; Fei, Zhangjun; Giovannoni, James
2010-10-25
Fruit development, maturation and ripening consist of a complex series of biochemical and physiological changes that in climacteric fruits, including apple and tomato, are coordinated by the gaseous hormone ethylene. These changes lead to final fruit quality, and understanding of the functional machinery underlying these processes is of both biological and practical importance. To date many reports have been made on the analysis of gene expression in apple. In this study we focused our investigation on the role of ethylene during apple maturation, specifically comparing transcriptomics of normal ripening with changes resulting from application of the hormone receptor competitor 1-methylcyclopropene. To gain insight into the molecular process regulating ripening in apple, and to compare to tomato (model species for ripening studies), we utilized both homologous and heterologous (tomato) microarrays to profile transcriptome dynamics of genes involved in fruit development and ripening, emphasizing those which are ethylene regulated. The use of both types of microarrays facilitated transcriptome comparison between apple and tomato (for the latter using data previously published and available at the TED: tomato expression database) and highlighted genes conserved during ripening of both species, which in turn represent a foundation for further comparative genomic studies. The cross-species analysis had the secondary aim of examining the efficiency of heterologous (specifically tomato) microarray hybridization for candidate gene identification as related to the ripening process. The resulting transcriptomics data revealed coordinated gene expression during fruit ripening of a subset of ripening-related and ethylene responsive genes, further facilitating the analysis of ethylene response during fruit maturation and ripening.
Our combined strategy based on microarray hybridization enabled transcriptome characterization during normal climacteric apple ripening, as well as definition of ethylene-dependent transcriptome changes. Comparison with tomato fruit maturation and ethylene responsive transcriptome activity facilitated identification of putative conserved orthologous ripening-related genes, which serve as an initial set of candidates for assessing conservation of gene activity across genomes of fruit bearing plant species.
Transcriptomic analysis of flower development in tea (Camellia sinensis (L.)).
Liu, Feng; Wang, Yu; Ding, Zhaotang; Zhao, Lei; Xiao, Jun; Wang, Linjun; Ding, Shibo
2017-10-05
Flowering is a critical and complicated process in plant development, involving interactions of numerous endogenous and environmental factors, but little is known about the complex network regulating flower development in tea plants. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. The transcriptomic analysis assembles information on genes involved in reproductive growth of C. sinensis. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicated that metabolic pathways, biosynthesis of secondary metabolites, and plant hormone signal transduction were enriched among the DEGs. Furthermore, 207 flowering-associated unigenes were identified from our database. Some transcription factors, such as WRKY, ERF, bHLH, MYB and MADS-box, were shown to be up-regulated in floral transition and might play a role in the progression of flowering. In addition, 14 genes were selected for confirmation of expression levels using quantitative real-time PCR (qRT-PCR). The comprehensive transcriptomic analysis presents fundamental information on the genes and pathways which are involved in flower development in C. sinensis. Our data also provide a useful database for further research on tea and other plant species. Copyright © 2017 Elsevier B.V. All rights reserved.
Moskalev, Alexey А; Kudryavtseva, Anna V; Graphodatsky, Alexander S; Beklemisheva, Violetta R; Serdyukova, Natalya A; Krutovsky, Konstantin V; Sharov, Vadim V; Kulakovskiy, Ivan V; Lando, Andrey S; Kasianov, Artem S; Kuzmin, Dmitry A; Putintseva, Yuliya A; Feranchuk, Sergey I; Shaposhnikov, Mikhail V; Fraifeld, Vadim E; Toren, Dmitri; Snezhkina, Anastasia V; Sitnik, Vasily V
2017-12-28
The gray whale, Eschrichtius robustus (E. robustus), is the sole member of the family Eschrichtiidae, which is considered to be the most primitive in the class Cetacea. The gray whale is often described as a "living fossil". It is adapted to extreme marine conditions and has a high life expectancy (77 years). The assembly of a gray whale genome and transcriptome will enable further studies of whale evolution, longevity, and resistance to extreme environments. In this work, we report the first de novo assembly and primary analysis of the E. robustus genome and transcriptome based on kidney and liver samples. The presented draft genome assembly is 55% complete in terms of total genome length, but only 24% complete in terms of BUSCO complete gene groups, although 10,895 genes were identified. Transcriptome annotation and comparison with other whale species revealed robust expression of DNA repair and hypoxia-response genes, which is expected for whales. This preliminary study of the gray whale genome and transcriptome provides new data to better understand whale evolution and the mechanisms of adaptation to hypoxic conditions.
Ponce, Dalia; Brinkman, Diane L; Potriquet, Jeremy; Mulvenna, Jason
2016-04-05
Jellyfish venoms are rich sources of toxins designed to capture prey or deter predators, but they can also elicit harmful effects in humans. In this study, an integrated transcriptomic and proteomic approach was used to identify putative toxins and their potential role in the venom of the scyphozoan jellyfish Chrysaora fuscescens. A de novo tentacle transcriptome, containing more than 23,000 contigs, was constructed and used in proteomic analysis of C. fuscescens venom to identify potential toxins. From a total of 163 proteins identified in the venom proteome, 27 were classified as putative toxins and grouped into six protein families: proteinases, venom allergens, C-type lectins, pore-forming toxins, glycoside hydrolases and enzyme inhibitors. Other putative toxins identified in the transcriptome, but not the proteome, included additional proteinases as well as lipases and deoxyribonucleases. Sequence analysis also revealed the presence of ShKT domains in two putative venom proteins from the proteome and an additional 15 from the transcriptome, suggesting potential ion channel blockade or modulatory activities. Comparison of these potential toxins to those from other cnidarians provided insight into their possible roles in C. fuscescens venom and an overview of the diversity of potential toxin families in cnidarian venoms.
Diray-Arce, Joann; Clement, Mark; Gul, Bilquees; Khan, M Ajmal; Nielsen, Brent L
2015-05-06
Improvement of crop production is needed to feed the growing world population as the amount and quality of agricultural land decreases and soil salinity increases. This has stimulated research on salt tolerance in plants. Most crops tolerate a limited amount of salt to survive and produce biomass, while halophytes (salt-tolerant plants) have the ability to grow with saline water utilizing specific biochemical mechanisms. However, little is known about the genes involved in salt tolerance. We have characterized the transcriptome of Suaeda fruticosa, a halophyte that has the ability to sequester salts in its leaves. Suaeda fruticosa is an annual shrub in the family Chenopodiaceae found in coastal and inland regions of Pakistan and Mediterranean shores. This plant is an obligate halophyte that grows optimally at 200-400 mM NaCl and can grow at up to 1000 mM NaCl. High-throughput sequencing was used to provide understanding of genes involved in the salt tolerance mechanism. De novo assembly of the transcriptome and analysis has allowed identification of differentially expressed and unique genes present in this non-conventional crop. Twelve sequencing libraries prepared from control (0 mM NaCl treated) and optimum (300 mM NaCl treated) plants were sequenced using Illumina Hiseq 2000 to investigate differential gene expression between shoots and roots of Suaeda fruticosa. The transcriptome was assembled de novo using Velvet and Oases k-45 and clustered using CDHIT-EST. There are 54,526 unigenes; among these 475 genes are downregulated and 44 are upregulated when samples from plants grown under optimal salt are compared with those grown without salt. BLAST analysis identified the differentially expressed genes, which were categorized in gene ontology terms and their pathways. This work has identified potential genes involved in salt tolerance in Suaeda fruticosa, and has provided an outline of tools to use for de novo transcriptome analysis.
The assemblies that were used provide coverage of a considerable proportion of the transcriptome, which allows analysis of differential gene expression and identification of genes that may be involved in salt tolerance. The transcriptome may serve as a reference sequence for study of other succulent halophytes.
Analysis, annotation, and profiling of the oat seed transcriptome
USDA-ARS's Scientific Manuscript database
Novel high-throughput next generation sequencing (NGS) technologies are providing opportunities to explore genomes and transcriptomes in a cost-effective manner. To construct a gene expression atlas of developing oat (Avena sativa) seeds, two software packages specifically designed for RNA-seq (Trin...
A comprehensive analysis of the human placenta transcriptome
USDA-ARS's Scientific Manuscript database
As the conduit for nutrients and growth signals, the placenta is critical to establishing an environment sufficient for fetal growth and development. To better understand the mechanisms regulating placental development and gene expression, we characterized the transcriptome of term placenta from 20 ...
Genome-wide transcriptome and expression profile analysis of Phalaenopsis during explant browning.
Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei
2015-01-01
Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at the genome-wide level. We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect secondary metabolism, such as the phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity were found to be changed at the transcriptional level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning.
Genome-Wide Transcriptome and Expression Profile Analysis of Phalaenopsis during Explant Browning
Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei
2015-01-01
Background: Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at the genome-wide level. Methodology/Principal Findings: We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Conclusions/Significance: Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect secondary metabolism, such as the phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity were found to be changed at the transcription level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning. PMID:25874455
Ball, Robyn L; Fujiwara, Yasuhiro; Sun, Fengyun; Hu, Jianjun; Hibbs, Matthew A; Handel, Mary Ann; Carter, Gregory W
2016-08-12
The continuous and non-synchronous nature of postnatal male germ-cell development has impeded stage-specific resolution of molecular events of mammalian meiotic prophase in the testis. Here the juvenile onset of spermatogenesis in mice is analyzed by combining cytological and transcriptomic data in a novel computational analysis that allows decomposition of the transcriptional programs of spermatogonia and meiotic prophase substages. Germ cells from testes of individual mice were obtained at two-day intervals from 8 to 18 days post-partum (dpp), prepared as surface-spread chromatin and immunolabeled for meiotic stage-specific protein markers (STRA8, SYCP3, phosphorylated H2AFX, and HISTH1T). Eight stages were discriminated cytologically by combinatorial antibody labeling, and RNA-seq was performed on the same samples. Independent principal component analyses of cytological and transcriptomic data yielded similar patterns for both data types, providing strong evidence for substage-specific gene expression signatures. A novel permutation-based maximum covariance analysis (PMCA) was developed to map co-expressed transcripts to one or more of the eight meiotic prophase substages, thereby linking distinct molecular programs to cytologically defined cell states. Expression of meiosis-specific genes is not substage-limited, suggesting regulation of substage transitions at other levels. This integrated analysis provides a general method for resolving complex cell populations. Here it revealed not only features of meiotic substage-specific gene expression, but also a network of substage-specific transcription factors and relationships to potential target genes.
Lu, Taofeng; Sun, Yujiao; Ma, Qin; Zhu, Minghao; Liu, Dan; Ma, Jianzhang; Ma, Yuehui; Chen, Hongyan; Guan, Weijun
2016-12-01
The Siberian tiger, Panthera tigris altaica, is an endangered species, and much more work is needed to protect this species, which is still vulnerable to extinction. Conservation efforts may be supported by the genetic assessment of wild populations, for which highly specific microsatellite markers are required. However, only a limited amount of genetic sequence data is available for this species. To identify the genes involved in the lung transcriptome and to develop additional simple sequence repeat (SSR) markers for the Siberian tiger, we used high-throughput RNA-Seq to characterize the Siberian tiger transcriptome in lung tissue (designated 'PTA-lung') and a pooled tissue sample (designated 'PTA'). Approximately 47.5% (33,187/69,836) of the lung transcriptome was annotated in four public databases (Nr, Swiss-Prot, KEGG, and COG). The annotated genes formed a potential pool for gene identification in the tiger. An analysis of the genes differentially expressed between the PTA-lung and PTA samples revealed that the tiger may have suffered a series of diseases before death. In total, 1062 non-redundant SSRs were identified in the Siberian tiger transcriptome. Forty-three primer pairs were randomly selected for amplification reactions, and 26 of the 43 pairs were also used to evaluate the levels of genetic polymorphism. Fourteen primer pairs (32.56%) amplified products that were polymorphic in size in P. tigris altaica. In conclusion, the transcriptome sequences will provide a valuable genomic resource for genetic research, and these new SSR markers comprise a reasonable number of loci for the genetic analysis of wild and captive populations of P. tigris altaica.
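The SSR mining step described in the abstract above can be illustrated with a small regular-expression scan. This is a minimal sketch, not the authors' pipeline: the abstract does not name the tool or thresholds used, so the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration.

```python
import re

# Hypothetical thresholds: minimum number of repeats for each motif length (1-6 nt).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # ([ACGT]{size}) captures one motif; \1{n,} demands at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA" or "ACAC").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Toy sequence, invented for illustration: (AC)7 at position 2, (AGC)5 at position 18.
toy = "GG" + "AC" * 7 + "TT" + "AGC" * 5 + "G"
print(find_ssrs(toy))
```

Start positions are 0-based. Real SSR miners also report compound and imperfect repeats, which this sketch ignores.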
Divina, Petr; Vlcek, Cestmír; Strnad, Petr; Paces, Václav; Forejt, Jirí
2005-03-05
We generated the gene expression profile of the total testis from adult C57BL/6J male mice using serial analysis of gene expression (SAGE). Two high-quality SAGE libraries containing a total of 76 854 tags were constructed. An extensive bioinformatic analysis and comparison of SAGE transcriptomes of the total testis, testicular somatic cells and other mouse tissues were performed and the theory of male-biased gene accumulation on the X chromosome was tested. We sorted out 829 genes predominantly expressed from the germinal part and 944 genes from the somatic part of the testis. The genes preferentially and specifically expressed in total testis and testicular somatic cells were identified by comparing the testis SAGE transcriptomes to the available transcriptomes of seven non-testis tissues. We uncovered chromosomal clusters of adjacent genes with preferential expression in total testis and testicular somatic cells by a genome-wide search and found that the clusters encompassed a significantly higher number of genes than expected by chance. We observed a significant 3.2-fold enrichment of the proportion of X-linked genes specific for testicular somatic cells, while the proportions of X-linked genes specific for total testis and for other tissues were comparable. In contrast to the tissue-specific genes, an under-representation of X-linked genes in the total testis transcriptome but not in the transcriptomes of testicular somatic cells and other tissues was detected. Our results provide new evidence in favor of the theory of male-biased gene accumulation on the X chromosome in testicular somatic cells and indicate the opposite action of the meiotic X-inactivation in testicular germ cells.
Divina, Petr; Vlček, Čestmír; Strnad, Petr; Pačes, Václav; Forejt, Jiří
2005-01-01
Background: We generated the gene expression profile of the total testis from adult C57BL/6J male mice using serial analysis of gene expression (SAGE). Two high-quality SAGE libraries containing a total of 76 854 tags were constructed. An extensive bioinformatic analysis and comparison of SAGE transcriptomes of the total testis, testicular somatic cells and other mouse tissues were performed and the theory of male-biased gene accumulation on the X chromosome was tested. Results: We sorted out 829 genes predominantly expressed from the germinal part and 944 genes from the somatic part of the testis. The genes preferentially and specifically expressed in total testis and testicular somatic cells were identified by comparing the testis SAGE transcriptomes to the available transcriptomes of seven non-testis tissues. We uncovered chromosomal clusters of adjacent genes with preferential expression in total testis and testicular somatic cells by a genome-wide search and found that the clusters encompassed a significantly higher number of genes than expected by chance. We observed a significant 3.2-fold enrichment of the proportion of X-linked genes specific for testicular somatic cells, while the proportions of X-linked genes specific for total testis and for other tissues were comparable. In contrast to the tissue-specific genes, an under-representation of X-linked genes in the total testis transcriptome but not in the transcriptomes of testicular somatic cells and other tissues was detected. Conclusion: Our results provide new evidence in favor of the theory of male-biased gene accumulation on the X chromosome in testicular somatic cells and indicate the opposite action of the meiotic X-inactivation in testicular germ cells. PMID:15748293
Selenium supplementation prevents metabolic and transcriptomic responses to cadmium in mouse lung.
Hu, Xin; Chandler, Joshua D; Fernandes, Jolyn; Orr, Michael L; Hao, Li; Uppal, Karan; Neujahr, David C; Jones, Dean P; Go, Young-Mi
2018-04-12
The protective effect of selenium (Se) on cadmium (Cd) toxicity is well documented, but underlying mechanisms are unclear. Male mice fed standard diet were given Cd (CdCl2, 18 μmol/L) in drinking water with or without Se (Na2SeO4, 20 μmol/L) for 16 weeks. Lungs were analyzed for Cd concentration, transcriptomics and metabolomics. Data were analyzed with biostatistics, bioinformatics, pathway enrichment analysis, and combined transcriptome-metabolome-wide association study. Mice treated with Cd had higher lung Cd content (1.7 ± 0.4 pmol/mg protein) than control mice (0.8 ± 0.3 pmol/mg protein) or mice treated with Cd and Se (0.4 ± 0.1 pmol/mg protein). Gene set enrichment analysis of transcriptomics data showed that Se prevented Cd effects on inflammatory and myogenesis genes and diminished Cd effects on several other pathways. Similarly, Se prevented Cd-disrupted metabolic pathways in amino acid metabolism and urea cycle. Integrated transcriptome and metabolome network analysis showed that Cd treatment had a network structure with fewer gene-metabolite clusters compared to control. Centrality measurements showed that Se counteracted changes in a group of Cd-responsive genes, including Zdhhc11 (protein-cysteine S-palmitoyltransferase) and Ighg1 (immunoglobulin heavy constant gamma-1), and associated changes in metabolite concentrations. Co-administration of Se with Cd prevented the Cd increase in lung and prevented Cd-associated pathway and network responses of the transcriptome and metabolome. Se protection against Cd toxicity in lung involves complex systems responses. Environmental Cd stimulates proinflammatory and profibrotic signaling. The present results indicate that dietary or supplemental Se could be useful to mitigate Cd toxicity. Published by Elsevier B.V.
Figueroa-Montiel, Andrea; Ramos, Marco A; Mares, Rosa E; Dueñas, Salvador; Pimienta, Genaro; Ortiz, Ernesto; Possani, Lourival D; Licea-Navarro, Alexei F
2016-01-01
Small peptides isolated from the venom of the marine snails belonging to the genus Conus have been widely studied because of their therapeutic value. These peptides can be classified into two groups. The largest one is composed of peptides rich in disulfide bonds, referred to as conotoxins. Despite the importance of conotoxins given their pharmacological value, little is known about the protein disulfide isomerase (PDI) enzymes that are required to catalyze their correct folding. To discover the PDIs that may participate in the folding and structural maturation of conotoxins, the transcriptomes of the venom duct of four different species of Conus from the peninsula of Baja California (Mexico) were assembled. Complementary DNA (cDNA) libraries were constructed for each species and sequenced using an Illumina Genome Analyzer platform. The raw RNA-seq data were converted into transcript sequences using Trinity, a de novo assembler that allows the grouping of reads into contigs without a reference genome. An N50 value of 605 was established as a reference for future assemblies of Conus transcriptomes using this software. Transdecoder was used to extract likely coding sequences from Trinity transcripts, and the PDI-specific sequence motif "APWCGHCK" was used to capture potential PDIs. An in silico analysis was performed to characterize the group of PDI protein sequences encoded by the duct transcriptome of each species. The computational approach entailed a structural homology characterization, based on the presence of functional thioredoxin-like domains. Four different PDI families were characterized, which are constituted by a total of 41 different gene sequences. The sequences had an average of 65% identity with other PDIs. Homology-based three-dimensional structure prediction of a subset of the reported sequences with MODELLER 9.14 showed the expected thioredoxin fold, which was confirmed by a "simulated annealing" method.
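The motif-capture step mentioned above (screening predicted proteins for "APWCGHCK") amounts to an exact substring search. The sketch below shows that idea only; the function name `find_motif` and the toy sequences are hypothetical, and the abstract does not say how the authors implemented the search.

```python
def find_motif(proteins, motif="APWCGHCK"):
    """Map each protein ID to the 0-based positions where the motif occurs."""
    hits = {}
    for name, seq in proteins.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            # Resume one residue later so overlapping occurrences are not missed.
            start = seq.find(motif, start + 1)
        if positions:
            hits[name] = positions
    return hits


# Toy sequences, invented for illustration only.
toy = {
    "orf1": "MKLLAPWCGHCKALAPE",
    "orf2": "MSTNPQRLV",
    "orf3": "APWCGHCKGGGAPWCGHCK",
}
print(find_motif(toy))
```

Proteins without the motif are simply absent from the result, so the returned keys are the candidate set for downstream domain analysis.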
Next-Generation Genomics Facility at C-CAMP: Accelerating Genomic Research in India
S, Chandana; Russiachand, Heikham; H, Pradeep; S, Shilpa; M, Ashwini; S, Sahana; B, Jayanth; Atla, Goutham; Jain, Smita; Arunkumar, Nandini; Gowda, Malali
2014-01-01
Next-Generation Sequencing (NGS; http://www.genome.gov/12513162) is a recent life-sciences technological revolution that allows scientists to decode genomes or transcriptomes at a much faster rate and a lower cost. Genomics-based studies are proceeding at a relatively slow pace in India due to the non-availability of genomics experts, trained personnel and dedicated service providers. NGS offers a lot of potential to study India's national diversity (of all kinds). We at the Centre for Cellular and Molecular Platforms (C-CAMP) have launched the Next Generation Genomics Facility (NGGF) to provide genomics services to scientists, to train researchers and also to work on national and international genomic projects. We have a HiSeq1000 from Illumina and a GS-FLX Plus from Roche454. Combining the long reads from the GS FLX Plus with the high sequence depth from the HiSeq1000 is an ideal hybrid approach for de novo sequencing and re-sequencing of genomes and transcriptomes. At our facility, we have sequenced around 70 different organisms comprising more than 388 genomes and 615 transcriptomes, spanning prokaryotes and eukaryotes (fungi, plants and animals). In addition, we have optimized other unique applications such as small RNA (miRNA, siRNA etc), long Mate-pair sequencing (2 to 20 Kb), Coding sequences (Exome), Methylome (ChIP-Seq), Restriction Mapping (RAD-Seq), Human Leukocyte Antigen (HLA) typing, mixed genomes (metagenomes) and target amplicons, etc. Translating DNA sequence data from an NGS sequencer into meaningful information is an important exercise. Under NGGF, we have bioinformatics experts and high-end computing resources to dissect NGS data through genome assembly and annotation, gene expression, target enrichment, variant calling (SSR or SNP), comparative analysis etc. Our services (sequencing and bioinformatics) have been utilized by more than 45 organizations (academia and industry) both within India and outside, resulting in several publications in peer-reviewed journals, and several genomic/transcriptomic datasets are available at NCBI.
Transcriptome of interstitial cells of Cajal reveals unique and selective gene signatures
Park, Paul J.; Fuchs, Robert; Wei, Lai; Jorgensen, Brian G.; Redelman, Doug; Ward, Sean M.; Sanders, Kenton M.
2017-01-01
Transcriptome-scale data can reveal essential clues into understanding the underlying molecular mechanisms behind specific cellular functions and biological processes. Transcriptomics is a continually growing field of research utilized in biomarker discovery. The transcriptomic profile of interstitial cells of Cajal (ICC), which serve as slow-wave electrical pacemakers for gastrointestinal (GI) smooth muscle, has yet to be uncovered. Using copGFP-labeled ICC mice and flow cytometry, we isolated ICC populations from the murine small intestine and colon and obtained their transcriptomes. In analyzing the transcriptome, we identified a unique set of ICC-restricted markers including transcription factors, epigenetic enzymes/regulators, growth factors, receptors, protein kinases/phosphatases, and ion channels/transporters. This analysis provides new and unique insights into the cellular and biological functions of ICC in GI physiology. Additionally, we constructed an interactive ICC genome browser (http://med.unr.edu/physio/transcriptome) based on the UCSC genome database. To our knowledge, this is the first online resource that provides a comprehensive library of all known genetic transcripts expressed in primary ICC. Our genome browser offers a new perspective into the alternative expression of genes in ICC and provides a valuable reference for future functional studies. PMID:28426719
Musser, Jacob M; Wagner, Günter P
2015-11-01
We elaborate a framework for investigating the evolutionary history of morphological characters. We argue that morphological character trees generated by phylogenetic analysis of transcriptomes provide a useful tool for identifying causal gene expression differences underlying the development and evolution of morphological characters. They also enable rigorous testing of different models of morphological character evolution and origination, including the hypothesis that characters originate via divergence of repeated ancestral characters. Finally, morphological character trees provide evidence that character transcriptomes undergo concerted evolution. We argue that concerted evolution of transcriptomes can explain the so-called "species signal" found in several recent comparative transcriptome studies. The species signal is the phenomenon that transcriptomes cluster by species rather than character type, even though the characters are older than the respective species. We suggest the species signal is a natural consequence of concerted gene expression evolution resulting from mutations that alter gene regulatory network interactions shared by the characters under comparison. Thus, character trees generated from transcriptomes allow us to investigate the variational independence, or individuation, of morphological characters at the level of genetic programs. © 2015 Wiley Periodicals, Inc.
Xu, Zhifeng; Zhu, Wenyi; Liu, Yanchao; Liu, Xing; Chen, Qiushuang; Peng, Miao; Wang, Xiangzun; Shen, Guangmao; He, Lin
2014-01-01
The carmine spider mite (CSM), Tetranychus cinnabarinus, is an important pest mite in agriculture, because it can develop insecticide resistance easily. To gain valuable gene information and a molecular basis for the future insecticide resistance study of CSM, the first transcriptome analysis of CSM was conducted. A total of 45,016 contigs and 25,519 unigenes were generated from the de novo transcriptome assembly, and 15,167 unigenes were annotated via BLAST querying against current databases, including nr, SwissProt, the Clusters of Orthologous Groups (COGs), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO). When the transcripts were aligned to the Tetranychus urticae genome, 19,255 (75.45%) of the transcripts had significant (e-value < 10^-5) matches to the T. urticae genome, and 19,111 sequences matched the T. urticae proteome with an average protein length coverage of 42.55%. Core Eukaryotic Genes Mapping Approach (CEGMA) analysis identified 435 core eukaryotic genes (CEGs) in the CSM dataset, corresponding to 95% coverage. Ten gene categories that relate to insecticide resistance in arthropods were generated from the CSM transcriptome, including 53 P450-, 22 GSTs-, 23 CarEs-, 1 AChE-, 7 GluCls-, 9 nAChRs-, 8 GABA receptor-, 1 sodium channel-, 6 ATPase- and 12 Cyt b genes. We developed significant molecular resources for T. cinnabarinus putatively involved in insecticide resistance. The transcriptome assembly analysis will significantly facilitate our study on the mechanism of adaptation to environmental stress (including insecticide) in CSM at the molecular level, and will be very important for developing new control strategies against this pest mite.
Rai, Amit; Nakaya, Taiki; Shimizu, Yohei; Rai, Megha; Nakamura, Michimi; Suzuki, Hideyuki; Saito, Kazuki; Yamazaki, Mami
2018-05-29
Lithospermum officinale is a valuable source of bioactive metabolites with medicinal and industrial values. However, little is known about genes involved in the biosynthesis of these metabolites, primarily due to the lack of genome or transcriptome resources. This study presents the first effort to establish and characterize a de novo transcriptome assembly resource for L. officinale and expression analysis for three of its tissues, namely leaf, stem, and root. Using over 4 Gbp of RNA-sequencing datasets, we obtained a de novo transcriptome assembly of L. officinale, consisting of 77,047 unigenes with an assembly N50 value of 1,524 bp. Based on transcriptome annotation and functional classification, 52,766 unigenes were assigned putative gene functions, gene ontology terms, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. KEGG pathway and gene ontology enrichment analysis using highly expressed unigenes across three tissues and targeted metabolome analysis showed active secondary metabolic processes enriched specifically in the root of L. officinale. Using co-expression analysis, we also identified 20 and 48 unigenes representing different enzymes of lithospermic/chlorogenic acid and shikonin biosynthesis pathways, respectively. We further identified 15 candidate unigenes annotated as cytochrome P450 with the highest expression in the root of L. officinale as novel genes with a role in key biochemical reactions toward shikonin biosynthesis. Thus, through this study, we not only generated a high-quality genomic resource for L. officinale but also proposed candidate genes involved in shikonin biosynthesis pathways for further functional characterization. Georg Thieme Verlag KG Stuttgart · New York.
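Several abstracts in this listing report an assembly N50 (for example, 1,524 bp above). As a reminder of what that statistic measures, here is a minimal sketch under the usual definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the summed assembly length. The function name `n50` is illustrative and not taken from any of the cited tools.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total to half the assembly is the N50.
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


# Toy contig lengths, invented for illustration: total 54, half 27, 10 + 9 + 8 = 27.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 depends on how an assembler splits and filters contigs, values from different pipelines are only loosely comparable.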
Perigone Lobe Transcriptome Analysis Provides Insights into Rafflesia cantleyi Flower Development.
Lee, Xin-Wei; Mat-Isa, Mohd-Noor; Mohd-Elias, Nur-Atiqah; Aizat-Juhari, Mohd Afiq; Goh, Hoe-Han; Dear, Paul H; Chow, Keng-See; Haji Adam, Jumaat; Mohamed, Rahmah; Firdaus-Raih, Mohd; Wan, Kiew-Lian
2016-01-01
Rafflesia is a biologically enigmatic species that is very rare in occurrence and possesses an extraordinary morphology. This parasitic plant produces a gigantic flower up to one metre in diameter with no leaves, stem or roots. However, little is known about the floral biology of this species especially at the molecular level. In an effort to address this issue, we have generated and characterised the transcriptome of the Rafflesia cantleyi flower, and performed a comparison with the transcriptome of its floral bud to predict genes that are expressed and regulated during flower development. Approximately 40 million sequencing reads were generated and assembled de novo into 18,053 transcripts with an average length of 641 bp. Of these, more than 79% of the transcripts had significant matches to annotated sequences in the public protein database. A total of 11,756 and 7,891 transcripts were assigned to Gene Ontology categories and clusters of orthologous groups respectively. In addition, 6,019 transcripts could be mapped to 129 pathways in Kyoto Encyclopaedia of Genes and Genomes Pathway database. Digital abundance analysis identified 52 transcripts with very high expression in the flower transcriptome of R. cantleyi. Subsequently, analysis of differential expression between developing flower and the floral bud revealed a set of 105 transcripts with potential role in flower development. Our work presents a deep transcriptome resource analysis for the developing flower of R. cantleyi. Genes potentially involved in the growth and development of the R. cantleyi flower were identified and provide insights into biological processes that occur during flower development.
2014-01-01
Background: Clinically useful biomarkers for patient stratification and monitoring of disease progression and drug response are in high demand in drug development and for addressing potential safety concerns. Many diseases influence the frequency and phenotype of cells found in the peripheral blood and the transcriptome of blood cells. Changes in cell type composition influence whole blood gene expression analysis results and thus the discovery of true transcript level changes remains a challenge. We propose a robust and reproducible procedure, which includes whole transcriptome gene expression profiling of major subsets of immune cells directly sorted from whole blood. Methods: Target cells were enriched using magnetic microbeads and an autoMACS® Pro Separator (Miltenyi Biotec). Flow cytometric analysis for purity was performed before and after magnetic cell sorting. Total RNA was hybridized on HGU133 Plus 2.0 expression microarrays (Affymetrix, USA). CEL file signal intensity values were condensed using RMA and a custom CDF file (EntrezGene-based). Results: Positive selection by use of MACS® Technology coupled to transcriptomics was assessed for eight different peripheral blood cell types: CD14+ monocytes, CD3+, CD4+, or CD8+ T cells, CD15+ granulocytes, CD19+ B cells, CD56+ NK cells, and CD45+ pan leukocytes. RNA quality from enriched cells was above a RIN of eight. GeneChip analysis confirmed cell type specific transcriptome profiles. Storing whole blood collected in an EDTA Vacutainer® tube at 4°C followed by MACS does not activate sorted cells. Gene expression analysis supports cell enrichment measurements by MACS. Conclusions: The proposed workflow generates reproducible cell-type specific transcriptome data which can be translated to clinical settings and used to identify clinically relevant gene expression biomarkers from whole blood samples. This procedure enables the integration of transcriptomics of relevant immune cell subsets sorted directly from whole blood in clinical trial protocols. PMID:25984272
Transcriptome profiling reveals regulatory mechanisms underlying Corolla Senescence in Petunia
USDA-ARS's Scientific Manuscript database
The genetic regulatory mechanisms that govern natural petal senescence in petunia are complicated and unclear. To identify key genes and pathways that regulate the process, we initiated a transcriptome analysis in petunia petals at four developmental time points, including petal opening without anthesis ...
Placental transcriptome co-expression analysis reveals conserved regulatory program across gestation
USDA-ARS's Scientific Manuscript database
Mammalian development in utero is absolutely dependent on proper placental development, which is ultimately regulated by the placental genome. The regulation of the placental genome can be directly studied by exploring the underlying organization of the placental transcriptome through a systematic a...
Won, Harim I.; Schulze, Thomas T.; Clement, Emalie J.; Watson, Gabrielle F.; Watson, Sean M.; Warner, Rosalie C.; Ramler, Elizabeth A. M.; Witte, Elias J.; Schoenbeck, Mark A.; Rauter, Claudia M.; Davis, Paul H.
2018-01-01
Burying beetles (Nicrophorus spp.) are among the relatively few insects that provide parental care while not belonging to the eusocial insects such as ants or bees. This behavior incurs energy costs as evidenced by immune deficits and shorter life-spans in reproducing beetles. In the absence of an assembled transcriptome, relatively little is known concerning the molecular biology of these beetles. This work details the assembly and analysis of the Nicrophorus orbicollis transcriptome at multiple developmental stages. RNA-Seq reads were obtained by next-generation sequencing and the transcriptome was assembled using the Trinity assembler. Validation of the assembly was performed by functional characterization using Gene Ontology (GO), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. Differential expression analysis highlights developmental stage-specific expression patterns, and immunity-related transcripts are discussed. The data presented provides a valuable molecular resource to aid further investigation into immunocompetence throughout this organism's sexual development. PMID:29707046
Petukhov, Viktor; Guo, Jimin; Baryawno, Ninib; Severe, Nicolas; Scadden, David T; Samsonova, Maria G; Kharchenko, Peter V
2018-06-19
Recent single-cell RNA-seq protocols based on droplet microfluidics use massively multiplexed barcoding to enable simultaneous measurements of transcriptomes for thousands of individual cells. The increasing complexity of such data creates challenges for subsequent computational processing and troubleshooting of these experiments, with few software options currently available. Here, we describe a flexible pipeline for processing droplet-based transcriptome data that implements barcode corrections, classification of cell quality, and diagnostic information about the droplet libraries. We introduce advanced methods for correcting composition bias and sequencing errors affecting cellular and molecular barcodes to provide more accurate estimates of molecular counts in individual cells.
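To make the idea of molecular barcode (UMI) error correction concrete, the following is a generic greedy sketch: UMIs observed for one gene in one cell are visited from most to least abundant, and a UMI that differs by a single base from an already kept UMI is folded into it. This is an assumption-laden illustration of the general technique, not the correction method implemented in the pipeline described above; the names `hamming` and `collapse_umis` are hypothetical, and all UMIs are assumed to have equal length.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def collapse_umis(umi_counts):
    """Merge each UMI into a more abundant kept UMI at Hamming distance 1, if any."""
    kept = {}
    # Visit UMIs from most to least abundant; ties are broken alphabetically.
    for umi, n in sorted(umi_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        parent = next((k for k in kept if hamming(k, umi) == 1), None)
        if parent is None:
            kept[umi] = n
        else:
            kept[parent] += n
    return kept


# Toy read counts per UMI, invented for illustration.
toy = {"AAAA": 10, "AAAT": 1, "CCCC": 5, "CCCG": 1, "GGGG": 2}
corrected = collapse_umis(toy)
print(corrected, "molecules:", len(corrected))
```

The number of keys in the result is the corrected molecule count for that gene and cell. Greedy merging can over-collapse when true UMIs happen to be one base apart, which is one reason published methods use more careful statistical models.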
Transcriptional profiling of CD31(+) cells isolated from murine embryonic stem cells.
Mariappan, Devi; Winkler, Johannes; Chen, Shuhua; Schulz, Herbert; Hescheler, Jürgen; Sachinidis, Agapios
2009-02-01
Identification of genes involved in endothelial differentiation is of great interest for the understanding of the cellular and molecular mechanisms involved in the development of new blood vessels. Mouse embryonic stem (mES) cells serve as a potential source of endothelial cells for transcriptomic analysis. We isolated endothelial cells from 8-day-old embryoid bodies by immuno-magnetic separation using platelet endothelial cell adhesion molecule-1 (also known as CD31), which is expressed on both early and mature endothelial cells. CD31(+) cells exhibit endothelial-like behavior by incorporating DiI-labeled acetylated low-density lipoprotein and forming tubular structures on matrigel. Quantitative and semi-quantitative PCR analysis further demonstrated the increased expression of endothelial transcripts. To ascertain the specific transcriptomic identity of the CD31(+) cells, large-scale microarray analysis was carried out. Comparative bioinformatic analysis reveals an enrichment of the gene ontology categories angiogenesis, blood vessel morphogenesis, vasculogenesis and blood coagulation in the CD31(+) cell population. Based on the transcriptomic signatures of the CD31(+) cells, we conclude that this ES cell-derived population contains endothelial-like cells that express the mesodermal marker BMP2 and possess angiogenic potential. The transcriptomic characterization of CD31(+) cells enables an in vitro functional genomic model to identify genes required for angiogenesis.
Ching, Travers; Zhu, Xun; Garmire, Lana X
2018-04-01
Artificial neural networks (ANN) are computing architectures with many interconnections of simple neural-inspired computing elements, and have been applied to biomedical fields such as imaging analysis and diagnosis. We have developed a new ANN framework called Cox-nnet to predict patient prognosis from high throughput transcriptomics data. In 10 TCGA RNA-Seq data sets, Cox-nnet achieves the same or better predictive accuracy compared to other methods, including Cox-proportional hazards regression (with LASSO, ridge, and minimax concave penalty), Random Forests Survival and CoxBoost. Cox-nnet also reveals richer biological information, at both the pathway and gene levels. The outputs from the hidden layer node provide an alternative approach for survival-sensitive dimension reduction. In summary, we have developed a new method for accurate and efficient prognosis prediction on high throughput data, with functional biological insights. The source code is freely available at https://github.com/lanagarmire/cox-nnet.
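Cox-type survival models, including the Cox proportional hazards regression named above, are usually fitted by minimizing the negative log partial likelihood. The sketch below evaluates that quantity for given risk scores; it assumes no special handling of tied event times, and it is not taken from the Cox-nnet source code. The function and argument names are illustrative. In a neural-network setting, `risk` would be the network's output for each patient.

```python
import math


def neg_log_partial_likelihood(risk, time, event):
    """Negative Cox log partial likelihood for per-patient risk scores.

    risk  -- log-hazard score for each patient (e.g. a model output)
    time  -- follow-up time for each patient
    event -- 1 if the event was observed, 0 if the patient was censored
    """
    total = 0.0
    for i in range(len(risk)):
        if not event[i]:
            continue  # censored patients contribute only through the risk sets
        # Risk set: everyone still under observation at patient i's event time.
        at_risk = sum(math.exp(risk[j]) for j in range(len(risk)) if time[j] >= time[i])
        total -= risk[i] - math.log(at_risk)
    return total


# With equal scores, each event contributes log(size of its risk set).
print(neg_log_partial_likelihood([0.0, 0.0, 0.0], [1, 2, 3], [1, 1, 1]))
```

For three patients with identical scores and all events observed, the risk sets have sizes 3, 2 and 1, so the result is log 3 + log 2 + log 1 = log 6.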
Morey, Jeanine S; Burek Huntington, Kathy A; Campbell, Michelle; Clauss, Tonya M; Goertz, Caroline E; Hobbs, Roderick C; Lunardi, Denise; Moors, Amanda J; Neely, Marion G; Schwacke, Lori H; Van Dolah, Frances M
2017-10-01
Assessing the health of marine mammal sentinel species is crucial to understanding the impacts of environmental perturbations on marine ecosystems and human health. In Arctic regions, beluga whales, Delphinapterus leucas, are upper level predators that may serve as a sentinel species, potentially forecasting impacts on human health. While gene expression profiling from blood transcriptomes has widely been used to assess health status and environmental exposures in human and veterinary medicine, its use in wildlife has been limited due to the lack of available genomes and baseline data. To this end we constructed the first beluga whale blood transcriptome de novo from samples collected during annual health assessments of the healthy Bristol Bay, AK stock during 2012-2014 to establish baseline information on the content and variation of the beluga whale blood transcriptome. The Trinity transcriptome assembly from beluga comprised 91,325 transcripts that represented a wide array of cellular functions and processes and was extremely similar in content to the blood transcriptome of another cetacean, the bottlenose dolphin. Expression of hemoglobin transcripts was much lower in beluga (25.6% of TPM, transcripts per million) than has been observed in many other mammals. A T12A amino acid substitution in the HBB sequence of beluga whales, but not bottlenose dolphins, was identified and may play a role in low temperature adaptation. The beluga blood transcriptome was extremely stable between sex and year, with no apparent clustering of samples by principal component analysis and <4% of genes differentially expressed (EBseq, FDR<0.05). While the impacts of season, sexual maturity, disease, and geography on the beluga blood transcriptome must be established, the presence of transcripts involved in stress, detoxification, and immune functions indicates that blood gene expression analyses may provide information on health status and exposure.
This study provides a wealth of transcriptomic data on beluga whales and provides a sizeable pool of preliminary data for comparison with other studies in beluga whale. Copyright © 2017 Elsevier B.V. All rights reserved.
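The beluga abstract reports hemoglobin expression as a percentage of TPM (transcripts per million). The following minimal sketch shows how TPM is conventionally derived from read counts and transcript lengths, and how the share of a transcript group could then be computed. The function names and the toy numbers in the usage note are hypothetical and are not taken from the study.

```python
import numpy as np

def tpm(counts, lengths_bp):
    """Convert read counts to transcripts per million (TPM).

    counts     : reads (or fragments) assigned to each transcript
    lengths_bp : transcript lengths in base pairs
    """
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float) / 1000.0
    rate = counts / lengths_kb          # length-normalised abundance
    return rate / rate.sum() * 1e6      # rescale so the values sum to one million

def percent_of_tpm(tpm_values, mask):
    """Percentage of total TPM contributed by the transcripts selected by mask."""
    tpm_values = np.asarray(tpm_values, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    return float(tpm_values[mask].sum() / tpm_values.sum() * 100.0)
```

For example, counts of 100, 200 and 300 reads on transcripts of 1000, 2000 and 1000 bp give TPM values of 200,000, 200,000 and 600,000, so the third transcript alone accounts for 60% of TPM.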
Chauhan, Pallavi; Hansson, Bengt; Kraaijeveld, Ken; de Knijff, Peter; Svensson, Erik I; Wellenreuther, Maren
2014-09-22
There is growing interest in odonates (damselflies and dragonflies) as model organisms in ecology and evolutionary biology but the development of genomic resources has been slow. So far only one draft genome (Ladona fulva) and one transcriptome assembly (Enallagma hageni) have been published. Odonates have some of the most advanced visual systems among insects and several species are colour polymorphic, and genomic and transcriptomic data would allow studying the genomic architecture of these interesting traits and make detailed comparative studies between related species possible. Here, we present a comprehensive de novo transcriptome assembly for the blue-tailed damselfly Ischnura elegans (Odonata: Coenagrionidae) built from short-read RNA-seq data. The transcriptome analysis in this paper provides a first step towards identifying genes and pathways underlying the visual and colour systems in this insect group. Illumina RNA sequencing performed on tissues from the head, thorax and abdomen generated 428,744,100 paired-end reads amounting to 110 Gb of sequence data, which was assembled de novo with Trinity. A transcriptome was produced after filtering and quality checking yielding a final set of 60,232 high quality transcripts for analysis. CEGMA software identified 247 out of 248 ultra-conserved core proteins as 'complete' in the transcriptome assembly, yielding a completeness of 99.6%. BLASTX and InterProScan annotated 55% of the assembled transcripts and showed that the three tissue types differed both qualitatively and quantitatively in I. elegans. Differential expression analysis identified 8,625 transcripts as differentially expressed among head, thorax and abdomen. Targeted analyses of vision and colour functional pathways identified the presence of four different opsin types and three pigmentation pathways. We also identified transcripts involved in temperature sensitivity, thermoregulation and olfaction.
All these traits and their associated transcripts are of considerable ecological and evolutionary interest for this and other insect orders. Our work presents a comprehensive transcriptome resource for the ancient insect order Odonata and provides insight into their biology and physiology. The transcriptomic resource can provide a foundation for future investigations into this diverse group, including the evolution of colour, vision, olfaction and thermal adaptation.
Deep sequencing reveals cell-type-specific patterns of single-cell transcriptome variation.
Dueck, Hannah; Khaladkar, Mugdha; Kim, Tae Kyung; Spaethling, Jennifer M; Francis, Chantal; Suresh, Sangita; Fisher, Stephen A; Seale, Patrick; Beck, Sheryl G; Bartfai, Tamas; Kuhn, Bernhard; Eberwine, James; Kim, Junhyong
2015-06-09
Differentiation of metazoan cells requires execution of different gene expression programs but recent single-cell transcriptome profiling has revealed considerable variation within cells of seemingly identical phenotype. This brings into question the relationship between transcriptome states and cell phenotypes. Additionally, single-cell transcriptomics presents unique analysis challenges that need to be addressed to answer this question. We present high quality deep read-depth single-cell RNA sequencing for 91 cells from five mouse tissues and 18 cells from two rat tissues, along with 30 control samples of bulk RNA diluted to single-cell levels. We find that transcriptomes differ globally across tissues with regard to the number of genes expressed, the average expression patterns, and within-cell-type variation patterns. We develop methods to filter genes for reliable quantification and to calibrate biological variation. All cell types include genes with high variability in expression, in a tissue-specific manner. We also find evidence that single-cell variability of neuronal genes in mice is correlated with that in rats, consistent with the hypothesis that levels of variation may be conserved. Single-cell RNA-sequencing data provide a unique view of transcriptome function; however, careful analysis is required in order to use single-cell RNA-sequencing measurements for this purpose. Technical variation must be considered in single-cell RNA-sequencing studies of expression variation. For a subset of genes, biological variability within each cell type appears to be regulated in order to perform dynamic functions, rather than reflecting solely molecular noise.
Transcriptomics provides unique solutions for understanding the impact of complex mixtures and their components on aquatic systems. Here we describe the application of transcriptomics analysis of in situ fathead minnow exposures for assessing biological impacts of wastewater trea...
USDA-ARS's Scientific Manuscript database
Natural rubber biosynthesis in guayule (Parthenium argentatum) is associated with moderately cold night temperatures. To begin to dissect the molecular events triggered by cold temperatures that govern rubber synthesis induction in guayule, the transcriptome of bark tissue, where rubber is produced...
Mining a human transcriptome database for Nrf2 modulators
Nuclear factor erythroid-2 related factor 2 (Nrf2) is a key transcription factor important in the protection against oxidative stress. We developed computational procedures to enable the identification of chemical, genetic and environmental modulators of Nrf2 in a large database ...
Alu elements shape the primate transcriptome by cis-regulation of RNA editing
2014-01-01
Background: RNA editing by adenosine to inosine deamination is a widespread phenomenon, particularly frequent in the human transcriptome, largely due to the presence of inverted Alu repeats and their ability to form double-stranded structures – a requisite for ADAR editing. While several hundred thousand editing sites have been identified within these primate-specific repeats, the function of Alu-editing has yet to be elucidated. Results: We show that inverted Alu repeats, expressed in the primate brain, can induce site-selective editing in cis on sites located several hundred nucleotides from the Alu elements. Furthermore, a computational analysis, based on available RNA-seq data, finds that site-selective editing occurs significantly closer to edited Alu elements than expected. These targets are poorly edited upon deletion of the editing inducers, as well as in homologous transcripts from organisms lacking Alus. Sequences surrounding sites near edited Alus in UTRs have been subjected to a lesser extent of evolutionary selection than those far from edited Alus, indicating that their editing generally depends on cis-acting Alus. Interestingly, we find an enrichment of primate-specific editing within encoded sequence or the UTRs of zinc finger-containing transcription factors. Conclusions: We propose a model whereby primate-specific editing is induced by adjacent Alu elements that function as recruitment elements for the ADAR editing enzymes. The enrichment of site-selective editing with potentially functional consequences on the expression of transcription factors indicates that editing contributes more profoundly to the transcriptomic regulation and repertoire in primates than previously thought. PMID:24485196
Genome-wide inference of regulatory networks in Streptomyces coelicolor.
Castro-Melchor, Marlene; Charaniya, Salim; Karypis, George; Takano, Eriko; Hu, Wei-Shou
2010-10-18
The onset of antibiotic production in Streptomyces species is co-ordinated with differentiation events. An understanding of the genetic circuits that regulate these coupled biological phenomena is essential to discover and engineer the pharmacologically important natural products made by these species. The availability of genomic tools and access to a large warehouse of transcriptome data for the model organism, Streptomyces coelicolor, provides incentive to decipher the intricacies of the regulatory cascades and develop biologically meaningful hypotheses. In this study, more than 500 samples of genome-wide temporal transcriptome data, comprising wild-type and more than 25 regulatory gene mutants of Streptomyces coelicolor probed across multiple stress and medium conditions, were investigated. Information based on transcript and functional similarity was used to update a previously-predicted whole-genome operon map and further applied to predict transcriptional networks constituting modules enriched in diverse functions such as secondary metabolism and sigma factor. The predicted network displays a scale-free architecture with a small-world property observed in many biological networks. The networks were further investigated to identify functionally-relevant modules that exhibit functional coherence and a consensus motif in the promoter elements indicative of DNA-binding elements. Despite the enormous experimental as well as computational challenges, a systems approach for integrating diverse genome-scale datasets to elucidate complex regulatory networks is beginning to emerge. We present an integrated analysis of transcriptome data and genomic features to refine a whole-genome operon map and to construct regulatory networks at the cistron level in Streptomyces coelicolor. The functionally-relevant modules identified in this study are potential targets for further studies and verification.
Pal, Tarun; Malhotra, Nikhil; Chanumolu, Sree Krishna; Chauhan, Rajinder Singh
2015-07-01
The transcriptomes of Aconitum heterophyllum were assembled and characterized for the first time to decipher molecular components contributing to biosynthesis and accumulation of metabolites in tuberous roots. Aconitum heterophyllum Wall., popularly known as Atis, is a high-value medicinal herb of North-Western Himalayas. No information exists as of today on genetic factors contributing to the biosynthesis of secondary metabolites accumulating in tuberous roots, thereby, limiting genetic interventions towards genetic improvement of A. heterophyllum. Illumina paired-end sequencing followed by de novo assembly yielded 75,548 transcripts for root transcriptome and 39,100 transcripts for shoot transcriptome with minimum length of 200 bp. Biological role analysis of root versus shoot transcriptomes assigned 27,596 and 16,604 root transcripts; 12,340 and 9398 shoot transcripts into gene ontology and clusters of orthologous group, respectively. KEGG pathway mapping assigned 37 and 31 transcripts onto starch-sucrose metabolism while 329 and 341 KEGG orthologies associated with transcripts were found to be involved in biosynthesis of various secondary metabolites for root and shoot transcriptomes, respectively. In silico expression profiling of the mevalonate/2-C-methyl-D-erythritol 4-phosphate (non-mevalonate) pathway genes for aconites biosynthesis revealed 4 genes HMGR (3-hydroxy-3-methylglutaryl-CoA reductase), MVK (mevalonate kinase), MVDD (mevalonate diphosphate decarboxylase) and HDS (1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase) with higher expression in root transcriptome compared to shoot transcriptome suggesting their key role in biosynthesis of aconite alkaloids. Five genes, GMPase (geranyl diphosphate mannose pyrophosphorylase), SHAGGY, RBX1 (RING-box protein 1), SRF receptor kinases and β-amylase, implicated in tuberous root formation in other plant species showed higher levels of expression in tuberous roots compared to shoots. 
A total of 15,487 transcription factors belonging to bHLH, MYB, bZIP families and 399 ABC transporters which regulate biosynthesis and accumulation of bioactive compounds were identified in root and shoot transcriptomes. The expression of 5 ABC transporters involved in tuberous root development was validated by quantitative PCR analysis. Network connectivity diagrams were drawn for starch-sucrose metabolism and isoquinoline alkaloid biosynthesis associated with tuberous root growth and secondary metabolism, respectively, in root transcriptome of A. heterophyllum. The current endeavor will be of practical importance in planning a suitable genetic intervention strategy for the improvement of A. heterophyllum.
A large-scale full-length cDNA analysis to explore the budding yeast transcriptome
Miura, Fumihito; Kawaguchi, Noriko; Sese, Jun; Toyoda, Atsushi; Hattori, Masahira; Morishita, Shinichi; Ito, Takashi
2006-01-01
We performed a large-scale cDNA analysis to explore the transcriptome of the budding yeast Saccharomyces cerevisiae. We sequenced two cDNA libraries, one from the cells exponentially growing in a minimal medium and the other from meiotic cells. Both libraries were generated by using a vector-capping method that allows the accurate mapping of transcription start sites (TSSs). Consequently, we identified 11,575 TSSs associated with 3,638 annotated genomic features, including 3,599 ORFs, to suggest that most yeast genes have two or more TSSs. In addition, we identified 45 previously undescribed introns, including those affecting current ORF annotations and those spliced alternatively. Furthermore, the analysis revealed 667 transcription units in the intergenic regions and transcripts derived from antisense strands of 367 known features. We also found that 348 ORFs carry TSSs in their 3′-halves to generate sense transcripts starting from inside the ORFs. These results indicate that the budding yeast transcriptome is considerably more complex than previously thought, and it shares many recently revealed characteristics with the transcriptomes of mammals and other higher eukaryotes. Thus, the genome-wide active transcription that generates novel classes of transcripts appears to be an intrinsic feature of the eukaryotic cells. The budding yeast will serve as a versatile model for the studies on these aspects of transcriptome, and the full-length cDNA clones can function as an invaluable resource in such studies. PMID:17101987
Inferring Molecular Processes Heterogeneity from Transcriptional Data.
Gogolewski, Krzysztof; Wronowska, Weronika; Lech, Agnieszka; Lesyng, Bogdan; Gambin, Anna
2017-01-01
RNA microarrays and RNA-seq are nowadays standard technologies to study the transcriptional activity of cells. Most studies focus on tracking transcriptional changes caused by specific experimental conditions. Information on gene up- and downregulation is obtained by analyzing the behaviour of a relatively large population of cells and averaging its properties. However, even assuming perfect sample homogeneity, different subpopulations of cells can exhibit diverse transcriptomic profiles, as they may follow different regulatory/signaling pathways. The purpose of this study is to provide a novel methodological scheme to account for possible internal, functional heterogeneity in homogeneous cell lines, including cancer ones. We propose a novel computational method to infer the proportion between subpopulations of cells that manifest various functional behaviour in a given sample. Our method was validated using two datasets from RNA microarray experiments. Both experiments aimed to examine cell viability in specific experimental conditions. The presented methodology can be easily extended to RNA-seq data as well as other molecular processes. Moreover, it complements standard tools for identifying the most important networks from transcriptomic data and in particular could be useful in the analysis of cancer cell lines affected by biologically active compounds or drugs.
Inferring Molecular Processes Heterogeneity from Transcriptional Data
Wronowska, Weronika; Lesyng, Bogdan; Gambin, Anna
2017-01-01
RNA microarrays and RNA-seq are nowadays standard technologies to study the transcriptional activity of cells. Most studies focus on tracking transcriptional changes caused by specific experimental conditions. Information on gene up- and downregulation is obtained by analyzing the behaviour of a relatively large population of cells and averaging its properties. However, even assuming perfect sample homogeneity, different subpopulations of cells can exhibit diverse transcriptomic profiles, as they may follow different regulatory/signaling pathways. The purpose of this study is to provide a novel methodological scheme to account for possible internal, functional heterogeneity in homogeneous cell lines, including cancer ones. We propose a novel computational method to infer the proportion between subpopulations of cells that manifest various functional behaviour in a given sample. Our method was validated using two datasets from RNA microarray experiments. Both experiments aimed to examine cell viability in specific experimental conditions. The presented methodology can be easily extended to RNA-seq data as well as other molecular processes. Moreover, it complements standard tools for identifying the most important networks from transcriptomic data and in particular could be useful in the analysis of cancer cell lines affected by biologically active compounds or drugs. PMID:29362714
Systems biology approaches to understand the effects of nutrition and promote health.
Badimon, Lina; Vilahur, Gemma; Padro, Teresa
2017-01-01
In recent years, the implementation of systems biology in nutritional research has emerged as a powerful tool to understand the mechanisms by which dietary components promote health and prevent disease as well as to identify the biologically active molecules involved in such effects. Systems biology, by combining several '-omics' disciplines (mainly genomics/transcriptomics, proteomics and metabolomics), creates large data sets that upon computational integration provide in silico predictive networks that allow a more extensive analysis of the individual response to a nutritional intervention and provide a more global comprehensive understanding of how diet may influence health and disease. Numerous studies have demonstrated that diet and particularly bioactive food components play a pivotal role in helping to counteract environmental-related oxidative damage. Oxidative stress is considered to be strongly implicated in ageing and the pathophysiology of numerous diseases including neurodegenerative disease, cancers, metabolic disorders and cardiovascular diseases. In the following review we will provide insights into the role of systems biology in nutritional research and focus on transcriptomic, proteomic and metabolomic studies that have demonstrated the ability of functional foods and their bioactive components to fight against oxidative damage and contribute to health benefits. © 2016 The British Pharmacological Society.
Kim, Taewook; Park, June Hyun; Lee, Sang-Gil; Kim, Soyoung; Kim, Jihyun; Lee, Jungho; Shin, Chanseok
2017-08-01
MicroRNAs (miRNAs) are essential small RNA molecules that regulate the expression of target mRNAs in plants and animals. Here, we aimed to identify miRNAs and their putative targets in Hibiscus syriacus, the national flower of South Korea. We employed high-throughput sequencing of small RNAs obtained from four different tissues (i.e., leaf, root, flower, and ovary) and identified 33 conserved and 30 novel miRNA families, many of which showed differential, tissue-specific expression. In addition, we computationally predicted novel targets of miRNAs and validated some of them using 5' rapid amplification of cDNA ends analysis. One of the validated novel targets of miR477 was a terpene synthase, the primary gene involved in the formation of disease-resistant terpene metabolites such as sterols and phytoalexins. In addition, a predicted target of the conserved miRNA miR396 is SHORT VEGETATIVE PHASE, which is involved in flower initiation and is duplicated in H. syriacus. Collectively, this study provides the first reliable draft of the H. syriacus miRNA transcriptome that should constitute a basis for understanding the biological roles of miRNAs in H. syriacus.
Ranninger, Christina; Rurik, Marc; Limonciel, Alice; Ruzek, Silke; Reischl, Roland; Wilmes, Anja; Jennings, Paul; Hewitt, Philip; Dekant, Wolfgang; Kohlbacher, Oliver; Huber, Christian G.
2015-01-01
Untargeted metabolomics has the potential to improve the predictivity of in vitro toxicity models and therefore may aid the replacement of expensive and laborious animal models. Here we describe a long term repeat dose nephrotoxicity study conducted on the human renal proximal tubular epithelial cell line, RPTEC/TERT1, treated with 10 and 35 μmol·liter−1 of chloroacetaldehyde, a metabolite of the anti-cancer drug ifosfamide. Our study outlines the establishment of an automated and easy to use untargeted metabolomics workflow for HPLC-high resolution mass spectrometry data. Automated data analysis workflows based on open source software (OpenMS, KNIME) enabled a comprehensive and reproducible analysis of the complex and voluminous metabolomics data produced by the profiling approach. Time- and concentration-dependent responses were clearly evident in the metabolomic profiles. To obtain a more comprehensive picture of the mode of action, transcriptomics and proteomics data were also integrated. For toxicity profiling of chloroacetaldehyde, 428 and 317 metabolite features were detectable in positive and negative modes, respectively, after stringent removal of chemical noise and unstable signals. Changes upon treatment were explored using principal component analysis, and statistically significant differences were identified using linear models for microarray assays. The analysis revealed toxic effects only for the treatment with 35 μmol·liter−1 for 3 and 14 days. The most regulated metabolites were glutathione and metabolites related to the oxidative stress response of the cells. These findings are corroborated by proteomics and transcriptomics data, which show, among other things, an activation of the Nrf2 and ATF4 pathways. PMID:26055719
Ponce, Dalia; Brinkman, Diane L.; Potriquet, Jeremy; Mulvenna, Jason
2016-01-01
Jellyfish venoms are rich sources of toxins designed to capture prey or deter predators, but they can also elicit harmful effects in humans. In this study, an integrated transcriptomic and proteomic approach was used to identify putative toxins and their potential role in the venom of the scyphozoan jellyfish Chrysaora fuscescens. A de novo tentacle transcriptome, containing more than 23,000 contigs, was constructed and used in proteomic analysis of C. fuscescens venom to identify potential toxins. From a total of 163 proteins identified in the venom proteome, 27 were classified as putative toxins and grouped into six protein families: proteinases, venom allergens, C-type lectins, pore-forming toxins, glycoside hydrolases and enzyme inhibitors. Other putative toxins identified in the transcriptome, but not the proteome, included additional proteinases as well as lipases and deoxyribonucleases. Sequence analysis also revealed the presence of ShKT domains in two putative venom proteins from the proteome and an additional 15 from the transcriptome, suggesting potential ion channel blockade or modulatory activities. Comparison of these potential toxins to those from other cnidarians provided insight into their possible roles in C. fuscescens venom and an overview of the diversity of potential toxin families in cnidarian venoms. PMID:27058558
Fathead minnow and zebrafish are among the most intensively studied fish species in environmental toxicogenomics. To aid the assessment and interpretation of subtle transcriptomic effects from treatment conditions of interest, there needs to be a better characterization and unde...
USDA-ARS's Scientific Manuscript database
Sclerotinia sclerotiorum and S. trifoliorum are two closely related devastating plant pathogens. Extensive research has been conducted on S. sclerotiorum and its genome sequences are available. To take advantages of the genomic information of S. sclerotiorum, we compared the transcriptome of S. tr...
Transcriptome analysis of Pseudomonas syringae identifies new genes, ncRNAs, and antisense activity
USDA-ARS's Scientific Manuscript database
To fully understand how bacteria respond to their environment, it is essential to assess genome-wide transcriptional activity. New high throughput sequencing technologies make it possible to query the transcriptome of an organism in an efficient unbiased manner. We applied a strand-specific method t...
Performance of Arma chinensis reared on an artificial diet formulated using transcriptomic methods
USDA-ARS's Scientific Manuscript database
An artificial diet formulated for continuous rearing of the predator Arma chinensis was inferior to natural prey when evaluated using life history parameters. A transcriptome analysis identified differentially expressed genes in diet-fed and prey-fed A. chinensis that were suggestive of molecular me...
USDA-ARS's Scientific Manuscript database
To analyze transcriptome response to virus infection, we have assembled currently available microarray data on changes in gene expression levels in compatible Arabidopsis-virus interactions. We used the mean r (Pearson’s correlation coefficient) for neighboring pairs to estimate pairwise local simil...
USDA-ARS's Scientific Manuscript database
Aspergillus flavus and aflatoxin contamination in the field are known to be influenced by numerous stress factors, particularly drought and heat stress. However, the purpose of aflatoxin production is unknown. Here, we report transcriptome analyses comprised of 282.6 Gb of sequencing data describing...
USDA-ARS's Scientific Manuscript database
Alternative splicing is a well-known phenomenon that dramatically increases eukaryotic transcriptome diversity. The extent of mRNA isoform diversity among porcine tissues was assessed using Pacific Biosciences single-molecule long-read isoform sequencing (Iso-Seq) and Illumina short read sequencing ...
USDA-ARS's Scientific Manuscript database
Understanding the molecular and genetic mechanisms underlying variation in seed composition and contents among different genotypes is important for soybean oil quality improvement. We designed a bioinformatics approach to compare seed transcriptomes of 9 soybean genotypes varying in oil composition ...
Lepoivre, Cyrille; Bergon, Aurélie; Lopez, Fabrice; Perumal, Narayanan B; Nguyen, Catherine; Imbert, Jean; Puthier, Denis
2012-01-31
Deciphering gene regulatory networks by in silico approaches is a crucial step in the study of the molecular perturbations that occur in diseases. The development of regulatory maps is a tedious process requiring the comprehensive integration of diverse evidence scattered over biological databases. Thus, the research community would greatly benefit from having a unified database storing known and predicted molecular interactions. Furthermore, given the intrinsic complexity of the data, the development of new tools offering integrated and meaningful visualizations of molecular interactions is necessary to help users draw new hypotheses without being overwhelmed by the density of the resulting graph. We extend the previously developed TranscriptomeBrowser database with a set of tables containing 1,594,978 human and mouse molecular interactions. The database includes: (i) predicted regulatory interactions (computed by scanning vertebrate alignments with a set of 1,213 position weight matrices), (ii) potential regulatory interactions inferred from systematic analysis of ChIP-seq experiments, (iii) regulatory interactions curated from the literature, (iv) predicted post-transcriptional regulation by micro-RNA, (v) protein kinase-substrate interactions and (vi) physical protein-protein interactions. In order to easily retrieve and efficiently analyze these interactions, we developed InteractomeBrowser, a graph-based knowledge browser that comes as a plug-in for TranscriptomeBrowser. The first objective of InteractomeBrowser is to provide a user-friendly tool to get new insight into any gene list by providing a context-specific display of putative regulatory and physical interactions. To achieve this, InteractomeBrowser relies on a "cell compartments-based layout" that makes use of a subset of the Gene Ontology to map gene products onto relevant cell compartments.
This layout is particularly powerful for visual integration of heterogeneous biological information and is a productive avenue in generating new hypotheses. The second objective of InteractomeBrowser is to fill the gap between interaction databases and dynamic modeling. It is thus compatible with the network analysis software Cytoscape and with the Gene Interaction Network simulation software (GINsim). We provide examples illustrating the benefits of this visualization tool for large gene set analysis related to thymocyte differentiation. The InteractomeBrowser plugin is a powerful tool to get quick access to a knowledge database that includes both predicted and validated molecular interactions. InteractomeBrowser is available through the TranscriptomeBrowser framework and can be found at: http://tagc.univ-mrs.fr/tbrowser/. Our database is updated on a regular basis.
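The InteractomeBrowser abstract mentions predicted regulatory interactions obtained by scanning with position weight matrices (PWMs). The sketch below illustrates the general idea on a single DNA sequence: a count matrix is converted to log-odds scores and every window is scored against it. This is a simplified illustration under stated assumptions (uniform background, a fixed pseudocount, one strand, one sequence rather than vertebrate alignments); the function names, the toy motif and the threshold are hypothetical and do not reproduce the authors' pipeline.

```python
import math

BASES = "ACGT"

def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log2-odds position weight matrix.

    counts: list of dicts, one per motif position, mapping base -> count.
    """
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy usage: a hypothetical "TATA" motif built from 10 aligned sites.
toy_counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
toy_pwm = pwm_from_counts(toy_counts)
toy_hits = scan("GGTATAGG", toy_pwm, threshold=5.0)
```

In this toy case only the window starting at index 2 matches the motif at all four positions, so it is the only hit above the threshold.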
Single-cell entropy for accurate estimation of differentiation potency from a cell's transcriptome
NASA Astrophysics Data System (ADS)
Teschendorff, Andrew E.; Enver, Tariq
2017-06-01
The ability to quantify differentiation potential of single cells is a task of critical importance. Here we demonstrate, using over 7,000 single-cell RNA-Seq profiles, that differentiation potency of a single cell can be approximated by computing the signalling promiscuity, or entropy, of a cell's transcriptome in the context of an interaction network, without the need for feature selection. We show that signalling entropy provides a more accurate and robust potency estimate than other entropy-based measures, driven in part by a subtle positive correlation between the transcriptome and connectome. Signalling entropy identifies known cell subpopulations of varying potency and drug resistant cancer stem-cell phenotypes, including those derived from circulating tumour cells. It further reveals that expression heterogeneity within single-cell populations is regulated. In summary, signalling entropy allows in silico estimation of the differentiation potency and plasticity of single cells and bulk samples, providing a means to identify normal and cancer stem-cell phenotypes.
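The abstract above describes signalling entropy as the entropy of a cell's transcriptome in the context of an interaction network. One common way to formalise such a quantity is the entropy rate of a random walk on the network whose transition probabilities are weighted by the expression of neighbouring genes. The sketch below implements that generic formulation; it is an assumption about the form of the measure rather than the authors' exact definition or normalisation, and the function name is hypothetical. It assumes a symmetric adjacency matrix, strictly positive expression values and no isolated nodes.

```python
import numpy as np

def signalling_entropy_rate(adjacency, expression):
    """Entropy rate of an expression-weighted random walk on a network.

    adjacency  : symmetric 0/1 matrix of the interaction network (assumed)
    expression : strictly positive expression value per gene (assumed)
    """
    A = np.asarray(adjacency, dtype=float)
    x = np.asarray(expression, dtype=float)

    # Transition probability i -> j is proportional to A_ij * x_j.
    weights = A * x[np.newaxis, :]
    row_sums = weights.sum(axis=1)
    P = weights / row_sums[:, np.newaxis]

    # For symmetric A, detailed balance gives pi_i proportional to x_i * (A x)_i.
    pi = x * row_sums
    pi = pi / pi.sum()

    # Local entropy of each node's outgoing distribution (0 * log 0 treated as 0).
    with np.errstate(divide="ignore", invalid="ignore"):
        log_P = np.where(P > 0, np.log(P), 0.0)
    local_entropy = -(P * log_P).sum(axis=1)

    return float(pi @ local_entropy)
```

On a fully connected three-gene network with uniform expression, each gene signals equally to its two neighbours and the entropy rate is log(2); making one gene far more highly expressed concentrates the walk's transitions and lowers the value.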
Single-cell entropy for accurate estimation of differentiation potency from a cell's transcriptome
Teschendorff, Andrew E.; Enver, Tariq
2017-01-01
The ability to quantify differentiation potential of single cells is a task of critical importance. Here we demonstrate, using over 7,000 single-cell RNA-Seq profiles, that differentiation potency of a single cell can be approximated by computing the signalling promiscuity, or entropy, of a cell's transcriptome in the context of an interaction network, without the need for feature selection. We show that signalling entropy provides a more accurate and robust potency estimate than other entropy-based measures, driven in part by a subtle positive correlation between the transcriptome and connectome. Signalling entropy identifies known cell subpopulations of varying potency and drug resistant cancer stem-cell phenotypes, including those derived from circulating tumour cells. It further reveals that expression heterogeneity within single-cell populations is regulated. In summary, signalling entropy allows in silico estimation of the differentiation potency and plasticity of single cells and bulk samples, providing a means to identify normal and cancer stem-cell phenotypes. PMID:28569836
Sager, Monica; Yeat, Nai Chien; Pajaro-Van der Stadt, Stefan; Lin, Charlotte; Ren, Qiuyin; Lin, Jimmy
2015-01-01
Transcriptomic technologies are evolving to diagnose cancer earlier and more accurately to provide greater predictive and prognostic utility to oncologists and patients. Digital techniques such as RNA sequencing are replacing still-imaging techniques to provide more detailed analysis of the transcriptome and aberrant expression that causes oncogenesis, while companion diagnostics are developing to determine the likely effectiveness of targeted treatments. This article examines recent advancements in molecular profiling research and technology as applied to cancer diagnosis, clinical applications and predictions for the future of personalized medicine in oncology.
Kang, Yun; McMillan, Ian; Norris, Michael H; Hoang, Tung T
2015-07-01
Until recently, transcriptome analyses of single cells have been confined to eukaryotes. The information obtained from single-cell transcripts can provide detailed insight into spatiotemporal gene expression, and it could be even more valuable if expanded to prokaryotic cells. Transcriptome analysis of single prokaryotic cells is a recently developed and powerful tool. Here we describe a procedure that allows amplification of the total transcript of a single prokaryotic cell for in-depth analysis. This is performed by using a laser-capture microdissection instrument for single-cell isolation, followed by reverse transcription via Moloney murine leukemia virus, degradation of chromosomal DNA with McrBC and DpnI restriction enzymes, single-stranded cDNA (ss-cDNA) ligation using T4 polynucleotide kinase and CircLigase, and polymerization of ss-cDNA to double-stranded cDNA (ds-cDNA) by Φ29 polymerase. This procedure takes ∼5 d, and sufficient amounts of ds-cDNA can be obtained from single-cell RNA template for further microarray analysis.
Oh, Dong-Ha; Barkla, Bronwyn J; Vera-Estrella, Rosario; Pantoja, Omar; Lee, Sang-Yeol; Bohnert, Hans J; Dassanayake, Maheshi
2015-08-01
Mesembryanthemum crystallinum (ice plant) exhibits extreme tolerance to salt. Epidermal bladder cells (EBCs), developing on the surface of aerial tissues and specialized in sodium sequestration and other protective functions, are critical for the plant's stress adaptation. We present the first transcriptome analysis of EBCs isolated from intact plants, to investigate cell type-specific responses during plant salt adaptation. We developed a de novo assembled, nonredundant EBC reference transcriptome. Using RNAseq, we compared the expression patterns of the EBC-specific transcriptome between control and salt-treated plants. The EBC reference transcriptome consists of 37 341 transcript-contigs, of which 7% showed significantly different expression between salt-treated and control samples. We identified significant changes in ion transport, metabolism related to energy generation and osmolyte accumulation, stress signalling, and organelle functions, as well as a number of lineage-specific genes of unknown function, in response to salt treatment. The salinity-induced EBC transcriptome includes active transcript clusters, refuting the view of EBCs as passive storage compartments in the whole-plant stress response. EBC transcriptomes, differing from those of whole plants or leaf tissue, exemplify the importance of cell type-specific resolution in understanding stress adaptive mechanisms.
Xu, Hai-Ming; Kong, Xiang-Dong; Chen, Fei; Huang, Ji-Xiang; Lou, Xiang-Yang; Zhao, Jian-Yi
2015-10-24
Brassica napus is an important oilseed crop. Dissection of the genetic architecture underlying oil-related biological processes will greatly facilitate the genetic improvement of rapeseed. Differential gene expression during pod development offers a snapshot of the genes responsible for oil accumulation. To identify candidate genes in the linkage peaks reported previously, we used RNA sequencing (RNA-Seq) to analyze the pod transcriptomes of the German cultivar Sollux and the Chinese inbred line Gaoyou. RNA samples were collected for RNA-Seq at 5-7, 15-17 and 25-27 days after flowering (DAF). Bioinformatics analysis was performed to investigate differentially expressed genes (DEGs). Gene annotation analysis was integrated with QTL mapping and Brassica napus pod transcriptome profiling to detect potential candidate genes in oilseed. In total, 465 candidate DEGs were identified between the two varieties at the same stages, and 2,114 across different periods within each variety. Then, by combining those DEGs with the quantitative trait locus (QTL) mapping results, 33 DEGs between Sollux and Gaoyou were identified as candidate genes affecting seed oil content, of which one was found to be homologous to Arabidopsis thaliana lipid-related genes. Intervarietal DEGs of lipid pathways in QTL regions represent important candidate genes for oil-related traits. Integrated analysis of transcriptome profiling, QTL mapping and comparative genomics with related species leads to efficient identification of the most plausible functional genes underlying oil-content-related characters, offering valuable resources for improving Brassica napus breeding programs. This study provides a comprehensive overview of the pod transcriptomes of two varieties with different oil contents at three developmental stages.
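Combining DEGs with QTL mapping results, as described above, amounts to an interval-overlap filter: keep the DEGs whose genomic position falls inside a QTL region. The sketch below illustrates that idea only; the gene names, chromosome labels and coordinates are hypothetical, and the abstract does not say how the authors implemented this step.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


def overlaps(a, b):
    """True when two intervals lie on the same chromosome and share a base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def degs_in_qtl(deg_positions, qtl_intervals):
    """Return the sorted names of DEGs that overlap any QTL interval."""
    return sorted(
        gene
        for gene, location in deg_positions.items()
        if any(overlaps(location, qtl) for qtl in qtl_intervals)
    )


# Hypothetical inputs, for illustration only.
deg_positions = {
    "geneA": Interval("A01", 1_200, 2_400),
    "geneB": Interval("A01", 9_000, 9_800),
    "geneC": Interval("C03", 1_500, 2_000),
}
qtl_intervals = [Interval("A01", 2_000, 5_000), Interval("C05", 0, 10_000)]
candidates = degs_in_qtl(deg_positions, qtl_intervals)
```

Here only geneA is retained: geneB lies on the right chromosome but outside the QTL interval, and geneC lies on a chromosome with no QTL.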
Computational Biology Methods for Characterization of Pluripotent Cells.
Araúzo-Bravo, Marcos J
2016-01-01
Pluripotent cells are a powerful tool for regenerative medicine and drug discovery. Several techniques have been developed to induce pluripotency, or to extract pluripotent cells from different tissues and biological fluids. However, the characterization of pluripotency requires tedious, expensive, time-consuming, and not always reliable wet-lab experiments; thus, an easy, standard quality-control protocol for pluripotency assessment remains to be established. High-throughput techniques can help here, in particular gene expression microarrays, which have become a complementary technique for cellular characterization. Research has shown that transcriptomic comparison with a reference Embryonic Stem Cell (ESC) is a good approach to assessing pluripotency. Under the premise that the best protocol is computer source code, here I propose and explain line by line a software protocol coded in R-Bioconductor for pluripotency assessment based on the comparison of transcriptomics data of pluripotent cells with a reference ESC. I provide advice on experimental design, warnings about possible pitfalls, and guidance for interpreting results.
Maron, Jill L.; Hwang, Jooyeon S.; Pathak, Subash; Ruthazer, Robin; Russell, Ruby L.; Alterovitz, Gil
2014-01-01
Objective To combine mathematical modeling of salivary gene expression microarray data and systems biology annotation with RT-qPCR amplification to identify (phase I) and validate (phase II) salivary biomarker analysis for the prediction of oral feeding readiness in preterm infants. Study design Comparative whole transcriptome microarray analysis from 12 preterm newborns pre- and post-oral feeding success was used for computational modeling and systems biology analysis to identify potential salivary transcripts associated with oral feeding success (phase I). Selected gene expression biomarkers (15 from computational modeling; 6 evidence-based; and 3 reference) were evaluated by RT-qPCR amplification on 400 salivary samples from successful (n=200) and unsuccessful (n=200) oral feeders (phase II). Genes, alone and in combination, were evaluated by a multivariate analysis controlling for sex and post-conceptional age (PCA) to determine the probability that newborns achieved successful oral feeding. Results Advancing post-conceptional age (p < 0.001) and female sex (p = 0.05) positively predicted an infant’s ability to feed orally. A combination of five genes, NPY2R (hunger signaling), AMPK (energy homeostasis), PLXNA1 (olfactory neurogenesis), NPHP4 (visual behavior) and WNT3 (facial development), in addition to PCA and sex, demonstrated good accuracy for determining feeding success (AUROC = 0.78). Conclusions We have identified objective and biologically relevant salivary biomarkers that noninvasively assess a newborn’s developing brain, sensory and facial development as they relate to oral feeding success. Understanding the mechanisms that underlie the development of oral feeding readiness through translational and computational methods may improve clinical decision making while decreasing morbidities and health care costs. PMID:25620512
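The AUROC of 0.78 reported above summarises how often the model assigns a higher score to a successful feeder than to an unsuccessful one. As a minimal sketch of that statistic (not of the authors' multivariate model), the function below computes AUROC directly from its pairwise definition; the scores and labels are made up for illustration.

```python
def auroc(scores, labels):
    """Area under the ROC curve from its pairwise (Mann-Whitney) definition.

    scores: predicted probabilities or risk scores.
    labels: 1 for the positive class, 0 for the negative class.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: two positives (first two) and two negatives.
example = auroc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

The quadratic loop is fine for a few hundred samples, as in the study; for larger data a rank-based formulation would be the usual choice.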
2010-01-01
Background Fruit development, maturation and ripening consist of a complex series of biochemical and physiological changes that in climacteric fruits, including apple and tomato, are coordinated by the gaseous hormone ethylene. These changes lead to final fruit quality and understanding of the functional machinery underlying these processes is of both biological and practical importance. To date many reports have been made on the analysis of gene expression in apple. In this study we focused our investigation on the role of ethylene during apple maturation, specifically comparing transcriptomics of normal ripening with changes resulting from application of the hormone receptor competitor 1-Methylcyclopropene. Results To gain insight into the molecular process regulating ripening in apple, and to compare to tomato (model species for ripening studies), we utilized both homologous and heterologous (tomato) microarray to profile transcriptome dynamics of genes involved in fruit development and ripening, emphasizing those which are ethylene regulated. The use of both types of microarrays facilitated transcriptome comparison between apple and tomato (for the latter using data previously published and available at the TED: tomato expression database) and highlighted genes conserved during ripening of both species, which in turn represent a foundation for further comparative genomic studies. The cross-species analysis had the secondary aim of examining the efficiency of heterologous (specifically tomato) microarray hybridization for candidate gene identification as related to the ripening process. The resulting transcriptomics data revealed coordinated gene expression during fruit ripening of a subset of ripening-related and ethylene responsive genes, further facilitating the analysis of ethylene response during fruit maturation and ripening.
Conclusion Our combined strategy based on microarray hybridization enabled transcriptome characterization during normal climacteric apple ripening, as well as definition of ethylene-dependent transcriptome changes. Comparison with tomato fruit maturation and ethylene responsive transcriptome activity facilitated identification of putative conserved orthologous ripening-related genes, which serve as an initial set of candidates for assessing conservation of gene activity across genomes of fruit bearing plant species. PMID:20973957
Genomics and transcriptomics in drug discovery.
Dopazo, Joaquin
2014-02-01
The popularization of genomic high-throughput technologies is causing a revolution in biomedical research and, particularly, is transforming the field of drug discovery. Systems biology offers a framework to understand the extensive human genetic heterogeneity revealed by genomic sequencing in the context of the network of functional, regulatory and physical protein-drug interactions. Thus, approaches to find biomarkers and therapeutic targets will have to take into account the complex system nature of the relationships of the proteins with the disease. Pharmaceutical companies will have to reorient their drug discovery strategies considering the human genetic heterogeneity. Consequently, modeling and computational data analysis will have an increasingly important role in drug discovery.
Yassour, Moran; Grabherr, Manfred; Blood, Philip D.; Bowden, Joshua; Couger, Matthew Brian; Eccles, David; Li, Bo; Lieber, Matthias; MacManes, Matthew D.; Ott, Michael; Orvis, Joshua; Pochet, Nathalie; Strozzi, Francesco; Weeks, Nathan; Westerman, Rick; William, Thomas; Dewey, Colin N.; Henschel, Robert; LeDuc, Richard D.; Friedman, Nir; Regev, Aviv
2013-01-01
De novo assembly of RNA-Seq data allows us to study transcriptomes without the need for a genome sequence, such as in non-model organisms of ecological and evolutionary importance, cancer samples, or the microbiome. In this protocol, we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-Seq data in non-model organisms. We also present Trinity’s supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples, and approaches to identify protein coding genes. In an included tutorial we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sf.net. PMID:23845962
Li, Qinghong; Freeman, Lisa M; Rush, John E; Huggins, Gordon S; Kennedy, Adam D; Labuda, Jeffrey A; Laflamme, Dorothy P; Hannah, Steven S
2015-08-01
Canine degenerative mitral valve disease (DMVD) is the most common form of heart disease in dogs. The objective of this study was to identify cellular and metabolic pathways that play a role in DMVD by performing metabolomics and transcriptomics analyses on serum and tissue (mitral valve and left ventricle) samples previously collected from dogs with DMVD or healthy hearts. Gas or liquid chromatography followed by mass spectrometry was used to identify metabolites in serum. Transcriptomics analysis of tissue samples was completed using RNA-seq, and selected targets were confirmed by RT-qPCR. Random Forest analysis was used to classify the metabolites that best predicted the presence of DMVD. Results identified 41 known and 13 unknown serum metabolites that were significantly different between healthy and DMVD dogs, representing alterations in fat and glucose energy metabolism, oxidative stress, and other pathways. The three metabolites with the greatest single effect in the Random Forest analysis were γ-glutamylmethionine, oxidized glutathione, and asymmetric dimethylarginine. Transcriptomics analysis identified 812 differentially expressed transcripts in left ventricle samples and 263 in mitral valve samples, representing changes in energy metabolism, antioxidant function, nitric oxide signaling, and extracellular matrix homeostasis pathways. Many of the identified alterations may benefit from nutritional or medical management. Our study provides evidence of the growing importance of integrative approaches in multi-omics research in veterinary and nutritional sciences.
Comparative de novo transcriptome analysis of male and female Sea buckthorn.
Bansal, Ankush; Salaria, Mehul; Sharma, Tashil; Stobdan, Tsering; Kant, Anil
2018-02-01
Sea buckthorn is a dioecious medicinal plant found at high altitude. The plant has both male and female reproductive organs in separate individuals. In this article, whole transcriptome de novo assemblies of male and female flower bud samples were carried out using Illumina NextSeq 500 platform to determine the role of the genes involved in sex determination. Moreover, genes with differential expression in male and female transcriptomes were identified to understand the underlying sex determination mechanism. The current study showed 63,904 and 62,272 coding sequences (CDS) in female and male transcriptome data sets, respectively. 16,831 common CDS were screened out from both transcriptomes, out of which 625 were upregulated and 491 were found to be downregulated. To understand the potential regulatory roles of differentially expressed genes in metabolic networks and biosynthetic pathways: KEGG mapping, gene ontology, and co-expression network analysis were performed. Comparison with Flowering Interactive Database (FLOR-ID) resulted in eight differentially expressed genes viz. CHD3-type chromatin-remodeling factor PICKLE ( PKL ), phytochrome-associated serine/threonine-protein phosphatase ( FYPP ), protein TOPLESS ( TPL ), sensitive to freezing 6 ( SFR6 ), lysine-specific histone demethylase 1 homolog 1 ( LDL1 ), pre-mRNA-processing-splicing factor 8A ( PRP8A ), sucrose synthase 4 ( SUS4 ), ubiquitin carboxyl-terminal hydrolase 12 ( UBP12 ), known to be broadly involved in flowering, photoperiodism, embryo development, and cold response pathways. Male and female flower bud transcriptome data of Sea buckthorn may provide comprehensive information at genomic level for the identification of genetic regulation involved in sex determination.
USDA-ARS's Scientific Manuscript database
The soybean transcriptome displays strong variation along the day in optimal growth conditions and also in response to adverse circumstances, like drought stress. However, no study conducted to date has presented suitable reference genes, with stable expression along the day, for relative gene expre...
Comparison of ribosomal RNA removal methods for transcriptome sequencing workflows in teleost fish
USDA-ARS's Scientific Manuscript database
RNA sequencing (RNA-Seq) is becoming the standard for transcriptome analysis. Removal of contaminating ribosomal RNA (rRNA) is a priority in the preparation of libraries suitable for sequencing. rRNAs are commonly removed from total RNA via either mRNA selection or rRNA depletion. These methods have...
USDA-ARS's Scientific Manuscript database
The whitefly (Bemisia tabaci) causes tremendous damage to cotton production worldwide. However, very limited information is available about how plants perceive and defend themselves from this destructive pest. In this study, the transcriptomics differences between two cotton cultivars that exhibit e...
USDA-ARS's Scientific Manuscript database
The woody resurrection plant Myrothamnus flabellifolia has remarkable tolerance to desiccation. Pyro-sequencing technology permitted us to analyze the transcriptome of M. flabellifolia during both dehydration and rehydration. We identified a total of 8287 and 8542 differentially transcribed genes du...
Amber J. Vanden Wymelenberg; Jill Gaskell; Michael Mozuch; Grzegorz Sabat; John Ralph; Oleksandr Skyba; Shawn D Mansfield; Robert A. Blanchette; Diego Martinez; Igor Grigoriev; Philip J Kersten; Daniel Cullen
2010-01-01
Cellulose degradation by brown rot fungi, such as Postia placenta, is poorly understood relative to the phylogenetically related white rot basidiomycete, Phanerochaete chrysosporium. To elucidate the number, structure, and regulation of genes involved in lignocellulosic cell wall attack, secretome and transcriptome analyses were performed on both wood decay fungi...
USDA-ARS's Scientific Manuscript database
While many studies have characterized the transcriptome of plants attacked by herbivorous insect pests, few have undertaken an examination of the genes affected by root pests. We have subjected maize seedlings to infestation by southern corn rootworm (SCR) Diabrotica undecimpunctata howardi and usin...
USDA-ARS's Scientific Manuscript database
Fruit ripening is a physiological and biochemical process genetically programmed to regulate fruit quality parameters like firmness, flavor, odor and color, as well as production of ethylene in climacteric fruit. In this study, a transcriptomic analysis of mango (Mangifera indica L.) mesocarp cv. "K...
USDA-ARS's Scientific Manuscript database
An essential step to understanding the genomic biology of any organism is to comprehensively survey its transcriptome. We present the Bovine Gene Atlas (BGA) a compendium of over 7.2 million unique 20 base Illumina DGE tags representing 100 tissue transcriptomes collected primarily from L1 Dominette...
Cavill, Rachel; Kamburov, Atanas; Ellis, James K; Athersuch, Toby J; Blagrove, Marcus S C; Herwig, Ralf; Ebbels, Timothy M D; Keun, Hector C
2011-03-01
Using transcriptomic and metabolomic measurements from the NCI60 cell line panel, together with a novel approach to integration of molecular profile data, we show that the biochemical pathways associated with tumour cell chemosensitivity to platinum-based drugs are highly coincident, i.e. they describe a consensus phenotype. Direct integration of metabolome and transcriptome data at the point of pathway analysis improved the detection of consensus pathways by 76%, and revealed associations between platinum sensitivity and several metabolic pathways that were not visible from transcriptome analysis alone. These pathways included the TCA cycle and pyruvate metabolism, lipoprotein uptake and nucleotide synthesis by both salvage and de novo pathways. Extending the approach across a wide panel of chemotherapeutics, we confirmed the specificity of the metabolic pathway associations to platinum sensitivity. We conclude that metabolic phenotyping could play a role in predicting response to platinum chemotherapy and that consensus-phenotype integration of molecular profiling data is a powerful and versatile tool for both biomarker discovery and for exploring the complex relationships between biological pathways and drug response.
CBrowse: a SAM/BAM-based contig browser for transcriptome assembly visualization and analysis.
Li, Pei; Ji, Guoli; Dong, Min; Schmidt, Emily; Lenox, Douglas; Chen, Liangliang; Liu, Qi; Liu, Lin; Zhang, Jie; Liang, Chun
2012-09-15
To address the pressing need to explore the rapidly growing volume of transcriptomics data generated for non-model organisms, we developed CBrowse, an AJAX-based web browser for visualizing and analyzing transcriptome assemblies and contigs. Designed in a standard three-tier architecture with a data pre-processing pipeline, CBrowse is essentially a Rich Internet Application that offers many seamlessly integrated web interfaces and allows users to navigate, sort, filter, search and visualize data smoothly. The pre-processing pipeline takes the contig sequence file in FASTA format and its relevant SAM/BAM file as the input; detects putative polymorphisms, simple sequence repeats and sequencing errors in contigs; and generates image, JSON and database-compatible CSV text files that are directly utilized by different web interfaces. CBrowse is a generic visualization and analysis tool that facilitates close examination of assembly quality, genetic polymorphisms, sequence repeats and/or sequencing errors in transcriptome sequencing projects. CBrowse is distributed under the GNU General Public License and is available at http://bioinfolab.muohio.edu/CBrowse/. Contact: liangc@muohio.edu or liangc.mu@gmail.com; glji@xmu.edu.cn. Supplementary data are available at Bioinformatics online.
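One of the pre-processing steps named above, detecting simple sequence repeats (SSRs) in contigs, can be sketched with a regular expression that looks for a short motif repeated in tandem. This is an illustrative sketch only, not CBrowse code: the function name and the minimum-repeat thresholds are assumptions chosen for the example.

```python
import re


def is_primitive(unit):
    """False when the motif is itself a repeat of a shorter motif (e.g. ATAT, AAA)."""
    for size in range(1, len(unit)):
        if len(unit) % size == 0 and unit[:size] * (len(unit) // size) == unit:
            return False
    return True


def find_ssrs(sequence, min_repeats=None):
    """Return (start, motif, repeat_count) for tandem repeats in a contig.

    min_repeats maps motif length to the minimum number of tandem copies;
    the defaults below are illustrative thresholds, not a published standard.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = sequence.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if not is_primitive(unit):
                continue  # reported under its shorter motif, or a homopolymer
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical contig fragment with a dinucleotide repeat.
ssrs = find_ssrs("GGC" + "AT" * 7 + "CCG")
```

The primitive-motif check keeps a long AT repeat from being counted a second time as an ATAT repeat. Start positions are 0-based offsets into the contig.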
Li, Yiping; Li, Yanhong; Bai, Zhenjiang; Pan, Jian; Wang, Jian; Fang, Fang
2017-12-13
Sepsis is a complex disease with a dysregulated inflammatory response and a high mortality rate. The goal of this study was to identify potential transcriptomic markers of developing pediatric sepsis by a co-expression module analysis of a transcriptomic dataset. Using the R software and Bioconductor packages, we performed a weighted gene co-expression network analysis to identify co-expression modules significantly associated with pediatric sepsis. Functional interpretation (gene ontology and pathway analysis) and enrichment analysis with known transcription factors and microRNAs of the identified candidate modules were then performed. In modules significantly associated with sepsis, intramodular analysis was further performed and "hub genes" were identified and validated by quantitative real-time PCR (qPCR). In total, 15 co-expression modules were detected, and four modules ("midnight blue", "cyan", "brown", and "tan") were most significantly associated with pediatric sepsis and suggested as potential sepsis-associated modules. Gene ontology analysis and pathway analysis revealed that these four modules were strongly associated with immune response. Three of the four sepsis-associated modules were also enriched with known transcription factors (false discovery rate-adjusted P < 0.05). Hub genes were identified in each of the four modules. Four of the identified hub genes (MYB proto-oncogene like 1, killer cell lectin like receptor G1, stomatin, and membrane spanning 4-domains A4A) were further validated by qPCR to be differentially expressed between septic children and controls. Four pediatric sepsis-associated co-expression modules were identified in this study. qPCR results suggest that hub genes in these modules are potential transcriptomic markers for pediatric sepsis diagnosis. These results provide novel insights into the pathogenesis of pediatric sepsis and promote the generation of diagnostic gene sets.
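The two core ideas in a weighted co-expression analysis of this kind can be sketched briefly: an adjacency built by raising the absolute pairwise correlation to a soft-threshold power, and a hub gene chosen as the module member with the highest summed connection strength to the rest of its module. The study itself used R and Bioconductor; the Python below is a simplified illustration with hypothetical gene names, an assumed power of 6, and no module detection step.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta."""
    genes = sorted(profiles)
    return {
        a: {b: abs(pearson(profiles[a], profiles[b])) ** beta
            for b in genes if b != a}
        for a in genes
    }


def hub_gene(adjacency, module):
    """Module member with the largest summed adjacency to other members."""
    connectivity = {
        gene: sum(adjacency[gene][other] for other in module if other != gene)
        for gene in module
    }
    return max(connectivity, key=connectivity.get)


# Hypothetical profiles across three samples.
profiles = {"g1": [1.0, 2.0, 3.0], "g2": [2.0, 4.0, 6.0], "g3": [3.0, 2.0, 1.0]}
adjacency = soft_threshold_adjacency(profiles)
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones close to 1, which is why the resulting network is described as weighted instead of thresholded.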
Torre, Sara; Tattini, Massimiliano; Brunetti, Cecilia; Guidi, Lucia; Gori, Antonella; Marzano, Cristina; Landi, Marco; Sebastiani, Federico
2016-01-01
Sweet basil (Ocimum basilicum), one of the most popular cultivated herbs worldwide, displays a number of varieties differing in several characteristics, such as the color of the leaves. The development of a reference transcriptome for sweet basil, and the analysis of differentially expressed genes in acyanic and cyanic cultivars exposed to natural sunlight irradiance, is of interest from both horticultural and biological points of view. There is still great uncertainty about the significance of anthocyanins in photoprotection, and how green and red morphs may perform when exposed to photo-inhibitory light, a condition plants face on a daily and seasonal basis. We sequenced the leaf transcriptome of the green-leaved Tigullio (TIG) and the purple-leaved Red Rubin (RR) exposed to full sunlight over a four-week experimental period. We assembled and annotated 111,007 transcripts. A total of 5,468 and 5,969 potential SSRs were identified in TIG and RR, respectively, out of which 66 were polymorphic in silico. Comparative analysis of the two transcriptomes showed 2,372 differentially expressed genes (DEGs) clustered in 222 enriched Gene Ontology terms. Green and red basil differed mostly in the transcript abundance of genes involved in secondary metabolism. While the biosynthesis of waxes was up-regulated in red basil, the biosynthesis of flavonols and carotenoids was up-regulated in green basil. Data from our study provide a comprehensive transcriptome survey, gene sequence resources and microsatellites that can be used for further investigations in sweet basil. The analysis of DEGs and their functional classification also offers new insights into the functional role of anthocyanins in photoprotection.
2013-01-01
Background Isatis indigotica is a widely used herb for the clinical treatment of colds, fever, and influenza in Traditional Chinese Medicine (TCM). Various structural classes of compounds have been identified as effective ingredients. However, little is known at the genetic level about these active metabolites. In the present study, we performed de novo transcriptome sequencing for the first time to produce a comprehensive dataset of I. indigotica. Results A database of 36,367 unigenes (average length = 1,115.67 bases) was generated by performing transcriptome sequencing. Based on the gene annotation of the transcriptome, 104 unigenes were identified covering most of the catalytic steps in the general biosynthetic pathways of indole, terpenoid, and phenylpropanoid. Subsequently, the organ-specific expression patterns of the genes involved in these pathways, and their responses to methyl jasmonate (MeJA) induction, were investigated. Metabolite profiling of the effective phenylpropanoids showed that the accumulation patterns of secondary metabolites were mostly correlated with the transcription of their biosynthetic genes. According to the analysis of the UDP-dependent glycosyltransferase (UGT) family, several flavonoids were indicated to exist in I. indigotica and were further identified by metabolic profiling using UPLC/Q-TOF. Moreover, applying transcriptome co-expression analysis, nine new, putative UGTs were suggested as flavonol glycosyltransferases and lignan glycosyltransferases. Conclusions This database provides a pool of candidate genes involved in the biosynthesis of effective metabolites in I. indigotica. Furthermore, the comprehensive analysis and characterization of the significant pathways are expected to give better insight into the diversity of chemical composition, synthetic characteristics, and the regulatory mechanisms that operate in this medicinal herb. PMID:24308360
Langley, Raymond J; Tipper, Jennifer L; Bruse, Shannon; Baron, Rebecca M; Tsalik, Ephraim L; Huntley, James; Rogers, Angela J; Jaramillo, Richard J; O'Donnell, Denise; Mega, William M; Keaton, Mignon; Kensicki, Elizabeth; Gazourian, Lee; Fredenburgh, Laura E; Massaro, Anthony F; Otero, Ronny M; Fowler, Vance G; Rivers, Emanuel P; Woods, Chris W; Kingsmore, Stephen F; Sopori, Mohan L; Perrella, Mark A; Choi, Augustine M K; Harrod, Kevin S
2014-08-15
Sepsis is a leading cause of morbidity and mortality. Currently, early diagnosis is difficult to make and the progression of the disease is difficult to assess. The integration of metabolomic and transcriptomic data in a primate model of sepsis may provide a novel molecular signature of clinical sepsis. To develop a biomarker panel to characterize sepsis in primates and ascertain its relevance to early diagnosis and progression of human sepsis. Intravenous inoculation of Macaca fascicularis with Escherichia coli produced mild to severe sepsis, lung injury, and death. Plasma samples were obtained before and after 1, 3, and 5 days of E. coli challenge and at the time of killing. At necropsy, blood, lung, kidney, and spleen samples were collected. An integrative analysis of the metabolomic and transcriptomic datasets was performed to identify a panel of sepsis biomarkers. The extent of E. coli invasion, respiratory distress, lethargy, and mortality was dependent on the bacterial dose. Metabolomic and transcriptomic changes characterized severe infections and death, and indicated impaired mitochondrial, peroxisomal, and liver functions. Analysis of the pulmonary transcriptome and plasma metabolome suggested impaired fatty acid catabolism regulated by peroxisome-proliferator activated receptor signaling. A representative four-metabolite model effectively diagnosed sepsis in primates (area under the curve, 0.966) and in two human sepsis cohorts (area under the curve, 0.78 and 0.82). A model of sepsis based on reciprocal metabolomic and transcriptomic data was developed in primates and validated in two human patient cohorts. It is anticipated that the identified parameters will facilitate early diagnosis and management of sepsis.
Danchin, Etienne G.J.; Perfus-Barbeoch, Laetitia; Rancurel, Corinne; Thorpe, Peter; Da Rocha, Martine; Bajew, Simon; Neilson, Roy; Sokolova (Guzeeva), Elena; Da Silva, Corinne; Guy, Julie; Labadie, Karine; Esmenjaud, Daniel; Helder, Johannes; Jones, John T.
2017-01-01
Nematodes have evolved the ability to parasitize plants on at least four independent occasions, with plant parasites present in Clades 1, 2, 10 and 12 of the phylum. In the case of Clades 10 and 12, horizontal gene transfer of plant cell wall degrading enzymes from bacteria and fungi has been implicated in the evolution of plant parasitism. We have used ribonucleic acid sequencing (RNAseq) to generate reference transcriptomes for two economically important nematode species, Xiphinema index and Longidorus elongatus, representative of two genera within the early-branching Clade 2 of the phylum Nematoda. We used a transcriptome-wide analysis to identify putative horizontal gene transfer events. This represents the first in-depth transcriptome analysis from any plant-parasitic nematode of this clade. For each species, we assembled ~30 million Illumina reads into a reference transcriptome. We identified 62 and 104 transcripts, from X. index and L. elongatus, respectively, that were putatively acquired via horizontal gene transfer. By cross-referencing horizontal gene transfer prediction with a phylum-wide analysis of Pfam domains, we identified Clade 2-specific events. Of these, a GH12 cellulase from X. index was analysed phylogenetically and biochemically, revealing a likely bacterial origin and canonical enzymatic function. Horizontal gene transfer was previously shown to be a phenomenon that has contributed to the evolution of plant parasitism among nematodes. Our findings underline the importance and the extensiveness of this phenomenon in the evolution of plant-parasitic life styles in this speciose and widespread animal phylum. PMID:29065523
Danchin, Etienne G J; Perfus-Barbeoch, Laetitia; Rancurel, Corinne; Thorpe, Peter; Da Rocha, Martine; Bajew, Simon; Neilson, Roy; Guzeeva, Elena Sokolova; Da Silva, Corinne; Guy, Julie; Labadie, Karine; Esmenjaud, Daniel; Helder, Johannes; Jones, John T; den Akker, Sebastian Eves-van
2017-10-23
Nematodes have evolved the ability to parasitize plants on at least four independent occasions, with plant parasites present in Clades 1, 2, 10 and 12 of the phylum. In the case of Clades 10 and 12, horizontal gene transfer of plant cell wall degrading enzymes from bacteria and fungi has been implicated in the evolution of plant parasitism. We have used ribonucleic acid sequencing (RNAseq) to generate reference transcriptomes for two economically important nematode species, Xiphinema index and Longidorus elongatus, representative of two genera within the early-branching Clade 2 of the phylum Nematoda. We used a transcriptome-wide analysis to identify putative horizontal gene transfer events. This represents the first in-depth transcriptome analysis from any plant-parasitic nematode of this clade. For each species, we assembled ~30 million Illumina reads into a reference transcriptome. We identified 62 and 104 transcripts, from X. index and L. elongatus, respectively, that were putatively acquired via horizontal gene transfer. By cross-referencing horizontal gene transfer prediction with a phylum-wide analysis of Pfam domains, we identified Clade 2-specific events. Of these, a GH12 cellulase from X. index was analysed phylogenetically and biochemically, revealing a likely bacterial origin and canonical enzymatic function. Horizontal gene transfer was previously shown to be a phenomenon that has contributed to the evolution of plant parasitism among nematodes. Our findings underline the importance and the extensiveness of this phenomenon in the evolution of plant-parasitic life styles in this speciose and widespread animal phylum.
Time Series Expression Analyses Using RNA-seq: A Statistical Approach
Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P.
2013-01-01
RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. PMID:23586021
Time series expression analyses using RNA-seq: a statistical approach.
Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P
2013-01-01
RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis.
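To illustrate the autoregressive time-lagged regression idea named above, the following is a minimal sketch that fits y[t] = c + phi * y[t-1] to one gene's expression series by ordinary least squares. It is not the authors' implementation; the function name and the toy series are hypothetical, and a real analysis would also need to handle count noise and replicates.

```python
def fit_ar1(series):
    """Fit y[t] = intercept + phi * y[t-1] by ordinary least squares."""
    values = [float(v) for v in series]
    if len(values) < 3:
        raise ValueError("need at least three time points")
    lagged, current = values[:-1], values[1:]
    n = len(lagged)
    mean_lagged = sum(lagged) / n
    mean_current = sum(current) / n
    # Sum of squares of the lagged values around their mean.
    sxx = sum((x - mean_lagged) ** 2 for x in lagged)
    if sxx == 0:
        raise ValueError("lagged values are constant; phi is not identifiable")
    # Cross-product between lagged and current values.
    sxy = sum((x - mean_lagged) * (y - mean_current)
              for x, y in zip(lagged, current))
    phi = sxy / sxx
    intercept = mean_current - phi * mean_lagged
    return intercept, phi


# Hypothetical log-expression values that follow y[t] = 1 + 0.5 * y[t-1] exactly.
toy_series = [0.0, 1.0, 1.5, 1.75, 1.875]
intercept, phi = fit_ar1(toy_series)
```

A phi close to zero suggests little dependence between consecutive time points, whereas a phi near one suggests strongly persistent expression.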
Abiotic Stress Tolerance of Charophyte Green Algae: New Challenges for Omics Techniques
Holzinger, Andreas; Pichrtová, Martina
2016-01-01
Charophyte green algae are a paraphyletic group of freshwater and terrestrial green algae, comprising the classes of Chlorokybophyceae, Coleochaetophyceae, Klebsormidiophyceae, Zygnematophyceae, Mesostigmatophyceae, and Charophyceae. Zygnematophyceae (Conjugating green algae) are considered to be closest algal relatives to land plants (Embryophyta). Therefore, they are ideal model organisms for studying stress tolerance mechanisms connected with transition to land, one of the most important events in plant evolution and the Earth’s history. In Zygnematophyceae, but also in Coleochaetophyceae, Chlorokybophyceae, and Klebsormidiophyceae terrestrial members are found which are frequently exposed to naturally occurring abiotic stress scenarios like desiccation, freezing and high photosynthetically active radiation (PAR) as well as ultraviolet (UV) irradiation. Here, we summarize current knowledge about various stress tolerance mechanisms including insight provided by pioneer transcriptomic and proteomic studies. While formation of dormant spores is a typical strategy of freshwater classes, true terrestrial groups are stress tolerant in vegetative state. Aggregation of cells, flexible cell walls, mucilage production and accumulation of osmotically active compounds are the most common desiccation tolerance strategies. In addition, high photophysiological plasticity and accumulation of UV-screening compounds are important protective mechanisms in conditions with high irradiation. Now a shift from classical chemical analysis to next-generation genome sequencing, gene reconstruction and annotation, genome-scale molecular analysis using omics technologies followed by computer-assisted analysis will give new insights in a systems biology approach. For example, changes in transcriptome and role of phytohormone signaling in Klebsormidium during desiccation were recently described. 
Application of these modern approaches will deeply enhance our understanding of stress reactions in an unbiased non-targeted view in an evolutionary context. PMID:27242877
Asha, Srinivasan; Sreekumar, Sweda; Soniya, E V
2016-01-01
Analysis of high-throughput small RNA deep sequencing data, in combination with black pepper transcriptome sequences, revealed microRNA-mediated gene regulation in black pepper (Piper nigrum L.). Black pepper is an important spice crop and its berries are used worldwide as a natural food additive that contributes unique flavour to foods. In the present study, to characterize microRNAs from black pepper, we generated a small RNA library from black pepper leaf and sequenced it by Illumina high-throughput sequencing technology. MicroRNAs belonging to a total of 303 conserved miRNA families were identified from the sRNAome data. Subsequent analysis of the recently sequenced black pepper transcriptome confirmed precursor sequences of 50 conserved miRNAs and four potential novel miRNA candidates. Stem-loop qRT-PCR experiments demonstrated differential expression of eight conserved miRNAs in black pepper. Computational analysis of targets of the miRNAs showed 223 potential black pepper unigene targets that encode diverse transcription factors and enzymes involved in plant development, disease resistance, metabolic and signalling pathways. RLM-RACE experiments further mapped miRNA-mediated cleavage at five of the mRNA targets. In addition, miRNA isoforms corresponding to 18 miRNA families were also identified from black pepper. This study presents the first large-scale identification of microRNAs from black pepper and provides the foundation for the future studies of miRNA-mediated gene regulation of stress responses and diverse metabolic processes in black pepper.
Transcriptome profile and unique genetic evolution of positively selected genes in yak lungs.
Lan, DaoLiang; Xiong, XianRong; Ji, WenHui; Li, Jian; Mipam, Tserang-Donko; Ai, Yi; Chai, ZhiXin
2018-04-01
The yak (Bos grunniens), which is a unique bovine breed that is distributed mainly in the Qinghai-Tibetan Plateau, is considered a good model for studying plateau adaptability in mammals. The lungs are important functional organs that enable animals to adapt to their external environment. However, the genetic mechanism underlying the adaptability of yak lungs to harsh plateau environments remains unknown. To explore the unique evolutionary process and genetic mechanism of yak adaptation to plateau environments, we performed transcriptome sequencing of yak and cattle (Bos taurus) lungs using RNA-Seq technology and a subsequent comparison analysis to identify the positively selected genes in the yak. After deep sequencing, a normal transcriptome profile of yak lung containing a total of 16,815 expressed genes was obtained, and the characteristics of the yak lung transcriptome were described by functional analysis. Furthermore, Ka/Ks comparison statistics showed that 39 strongly positively selected genes were identified in yak lungs. Further GO and KEGG analyses were conducted for the functional annotation of these genes. The results of this study provide valuable data for further explorations of the unique evolutionary process of high-altitude hypoxia adaptation in yaks in the Tibetan Plateau and the genetic mechanism at the molecular level.
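The Ka/Ks comparison mentioned above contrasts the non-synonymous substitution rate (Ka) with the synonymous rate (Ks) for each orthologous gene pair. The sketch below flags genes whose ratio exceeds a cut-off. It assumes Ka and Ks have already been estimated by an external tool and uses the conventional cut-off of 1; the abstract does not state the criterion the authors applied, and the gene names and rates are hypothetical.

```python
def positively_selected(rates, threshold=1.0):
    """Return {gene: Ka/Ks} for genes whose ratio exceeds the threshold.

    `rates` maps a gene name to a (Ka, Ks) tuple. Genes with Ks <= 0 are
    skipped because the ratio is undefined for them.
    """
    selected = {}
    for gene, (ka, ks) in rates.items():
        if ks <= 0:
            continue
        ratio = ka / ks
        if ratio > threshold:
            selected[gene] = ratio
    return selected


# Hypothetical substitution rates, for illustration only.
toy_rates = {
    "geneA": (0.25, 0.125),  # Ka/Ks = 2.0, candidate for positive selection
    "geneB": (0.01, 0.10),   # ratio below 1, consistent with purifying selection
    "geneC": (0.02, 0.0),    # Ks is zero, ratio undefined
}
candidates = positively_selected(toy_rates)
```

A ratio above 1 is usually read as evidence of positive selection, a ratio near 1 as neutral evolution, and a ratio below 1 as purifying selection.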
Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome
Kim, Gunjune
2017-01-01
Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is “leaves of three, let it be”, which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, as well as assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligo nucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species. PMID:29125533
Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome.
Weisberg, Alexandra J; Kim, Gunjune; Westwood, James H; Jelesko, John G
2017-11-10
Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is "leaves of three, let it be", which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, as well as assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligo nucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species.
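Simple sequence repeats of the kind counted above are short motifs repeated in tandem. The following is a minimal sketch of detecting perfect repeats with a regular expression. The abstract does not name the tool or thresholds used, so the motif length range (2 to 6 bases), the minimum of five repeats, the function name and the toy sequence are all assumptions for illustration.

```python
import re

# A motif of 2 to 6 bases followed by at least four more copies of itself,
# i.e. five or more tandem repeats in total (assumed thresholds).
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each perfect tandem repeat."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Hypothetical transcript fragment containing (AC) repeated six times.
toy_hits = find_ssrs("TTACACACACACACGG")
```

Primer design would then target the unique sequence flanking each reported start position and repeat length.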
Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing
Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li
2010-01-01
Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resolution and sensitivity afforded by RNA-Seq, we were able to identify a substantial number of novel transcripts, new exons, untranslated regions, alternative upstream initiation codons and upstream open reading frames, which provide remarkable insight into the A. oryzae transcriptome. We were also able to assess the alternative mRNA isoforms in A. oryzae and found a large number of genes undergoing alternative splicing. Many genes and pathways that might be involved in higher levels of protein production in solid-state culture than in liquid culture were identified by comparing gene expression levels between different cultures. Our analysis indicated that the transcriptome of A. oryzae is much more complex than previously anticipated, and these results may provide a blueprint for further study of the A. oryzae transcriptome. PMID:20392818
Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing.
Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li
2010-08-01
Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resolution and sensitivity afforded by RNA-Seq, we were able to identify a substantial number of novel transcripts, new exons, untranslated regions, alternative upstream initiation codons and upstream open reading frames, which provide remarkable insight into the A. oryzae transcriptome. We were also able to assess the alternative mRNA isoforms in A. oryzae and found a large number of genes undergoing alternative splicing. Many genes and pathways that might be involved in higher levels of protein production in solid-state culture than in liquid culture were identified by comparing gene expression levels between different cultures. Our analysis indicated that the transcriptome of A. oryzae is much more complex than previously anticipated, and these results may provide a blueprint for further study of the A. oryzae transcriptome.
Chen, Hongdan; Lai, Wenxiang; Fu, Qiang; Lou, Yonggen
2014-01-01
Background: The brown planthopper (BPH), Nilaparvata lugens (Stål), one of the most serious rice insect pests in Asia, can quickly overcome rice resistance by evolving new virulent populations. The insect fat body plays essential roles in the life cycles of insects and in plant-insect interactions. However, whether differences in fat body transcriptomes exist between insect populations with different virulence levels and whether the transcriptomic differences are related to insect virulence remain largely unknown. Methodology/Principal Findings: In this study, we performed transcriptome-wide analyses on the fat bodies of two BPH populations with different virulence levels in rice. The populations were derived from rice variety TN1 (TN1 population) and Mudgo (M population). In total, 33,776 and 32,332 unigenes from the fat bodies of TN1 and M populations, respectively, were generated using Illumina technology. Gene ontology annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology classifications indicated that genes related to metabolism and immunity were significantly active in the fat bodies. In addition, a total of 339 unigenes showed homology to genes of yeast-like symbionts (YLSs) from 12 genera and endosymbiotic bacteria Wolbachia. A comparative analysis of the two transcriptomes generated 7,860 differentially expressed genes. GO annotations and enrichment analysis of KEGG pathways indicated these differentially expressed transcripts might be involved in metabolism and immunity. Finally, 105 differentially expressed genes from YLSs and Wolbachia were identified, genes which might be associated with the formation of different virulent populations. Conclusions/Significance: This study was the first to compare the fat-body transcriptomes of two BPH populations having different virulence traits and to find genes that may be related to this difference. 
Our findings provide a molecular resource for future investigations of fat bodies and will be useful in examining the interactions between the fat body and virulence variation in the BPH. PMID:24533099
Yu, Haixin; Ji, Rui; Ye, Wenfeng; Chen, Hongdan; Lai, Wenxiang; Fu, Qiang; Lou, Yonggen
2014-01-01
The brown planthopper (BPH), Nilaparvata lugens (Stål), one of the most serious rice insect pests in Asia, can quickly overcome rice resistance by evolving new virulent populations. The insect fat body plays essential roles in the life cycles of insects and in plant-insect interactions. However, whether differences in fat body transcriptomes exist between insect populations with different virulence levels and whether the transcriptomic differences are related to insect virulence remain largely unknown. In this study, we performed transcriptome-wide analyses on the fat bodies of two BPH populations with different virulence levels in rice. The populations were derived from rice variety TN1 (TN1 population) and Mudgo (M population). In total, 33,776 and 32,332 unigenes from the fat bodies of TN1 and M populations, respectively, were generated using Illumina technology. Gene ontology annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology classifications indicated that genes related to metabolism and immunity were significantly active in the fat bodies. In addition, a total of 339 unigenes showed homology to genes of yeast-like symbionts (YLSs) from 12 genera and endosymbiotic bacteria Wolbachia. A comparative analysis of the two transcriptomes generated 7,860 differentially expressed genes. GO annotations and enrichment analysis of KEGG pathways indicated these differentially expressed transcripts might be involved in metabolism and immunity. Finally, 105 differentially expressed genes from YLSs and Wolbachia were identified, genes which might be associated with the formation of different virulent populations. This study was the first to compare the fat-body transcriptomes of two BPH populations having different virulence traits and to find genes that may be related to this difference. 
Our findings provide a molecular resource for future investigations of fat bodies and will be useful in examining the interactions between the fat body and virulence variation in the BPH.
2010-01-01
Background: Systematic research on fish immunogenetics is indispensable in understanding the origin and evolution of immune systems. This has long been a challenging task because of the limited number of deep sequencing technologies and genome backgrounds of non-model fish available. The newly developed Solexa/Illumina RNA-seq and Digital gene expression (DGE) are high-throughput sequencing approaches and are powerful tools for genomic studies at the transcriptome level. This study reports the transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus using RNA-seq and DGE in an attempt to gain insights into the immunogenetics of marine fish. Results: RNA-seq analysis generated 169,950 non-redundant consensus sequences, among which 48,987 functional transcripts with complete or various length encoding regions were identified. More than 52% of these transcripts are possibly involved in approximately 219 known metabolic or signalling pathways, while 2,673 transcripts were associated with immune-relevant genes. In addition, approximately 8% of the transcripts appeared to be fish-specific genes that have never been described before. DGE analysis revealed that the host transcriptome profile of Vibrio harveyi-challenged L. japonicus is considerably altered, as indicated by the significant up- or down-regulation of 1,224 strong infection-responsive transcripts. Results indicated an overall conservation of the components and transcriptome alterations underlying innate and adaptive immunity in fish and other vertebrate models. Analysis suggested the acquisition of numerous fish-specific immune system components during early vertebrate evolution. Conclusion: This study provided a global survey of host defence gene activities against bacterial challenge in a non-model marine fish. 
Results can contribute to the in-depth study of candidate genes in marine fish immunity, and help improve current understanding of host-pathogen interactions and evolutionary history of immunogenetics from fish to mammals. PMID:20707909
Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming
2013-01-01
Background: Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. Results: In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. Conclusions: This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress. 
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas. PMID:24349370
Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming
2013-01-01
Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress. 
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas.
Sequencing, Annotation and Analysis of the Syrian Hamster (Mesocricetus auratus) Transcriptome
Tchitchek, Nicolas; Safronetz, David; Rasmussen, Angela L.; Martens, Craig; Virtaneva, Kimmo; Porcella, Stephen F.; Feldmann, Heinz
2014-01-01
Background: The Syrian hamster (golden hamster, Mesocricetus auratus) is gaining importance as a new experimental animal model for multiple pathogens, including emerging zoonotic diseases such as Ebola. Nevertheless there are currently no publicly available transcriptome reference sequences or genome for this species. Results: A cDNA library derived from mRNA and snRNA isolated and pooled from the brains, lungs, spleens, kidneys, livers, and hearts of three adult female Syrian hamsters was sequenced. Sequence reads were assembled into 62,482 contigs and 111,796 reads remained unassembled (singletons). This combined contig/singleton dataset, designated as the Syrian hamster transcriptome, represents a total of 60,117,204 nucleotides. Our Mesocricetus auratus Syrian hamster transcriptome mapped to 11,648 mouse transcripts representing 9,562 distinct genes, and mapped to a similar number of transcripts and genes in the rat. We identified 214 quasi-complete transcripts based on mouse annotations. Canonical pathways involved in a broad spectrum of fundamental biological processes were significantly represented in the library. The Syrian hamster transcriptome was aligned to the current release of the Chinese hamster ovary (CHO) cell transcriptome and genome to improve the genomic annotation of this species. Finally, our Syrian hamster transcriptome was aligned against 14 other rodents, primate and laurasiatheria species to gain insights about the genetic relatedness and placement of this species. Conclusions: This Syrian hamster transcriptome dataset significantly improves our knowledge of the Syrian hamster's transcriptome, especially towards its future use in infectious disease research. Moreover, this library is an important resource for the wider scientific community to help improve genome annotation of the Syrian hamster and other closely related species. 
Furthermore, these data provide the basis for development of expression microarrays that can be used in functional genomics studies. PMID:25398096
Meta-analytic framework for liquid association.
Wang, Lin; Liu, Silvia; Ding, Ying; Yuan, Shin-Sheng; Ho, Yen-Yi; Tseng, George C
2017-07-15
Although coexpression analysis via pair-wise expression correlation is popularly used to elucidate gene-gene interactions at the whole-genome scale, many complicated multi-gene regulations require more advanced detection methods. Liquid association (LA) is a powerful tool to detect the dynamic correlation of two gene variables depending on the expression level of a third variable (LA scouting gene). LA detection from a single transcriptomic study, however, is often unstable and not generalizable due to cohort bias, biological variation and limited sample size. With the rapid development of microarray and NGS technology, LA analysis combining multiple gene expression studies can provide more accurate and stable results. In this article, we proposed two meta-analytic approaches for LA analysis (MetaLA and MetaMLA) to combine multiple transcriptomic studies. To compensate for the demanding computation, we also proposed a two-step fast screening algorithm for more efficient genome-wide screening: bootstrap filtering and sign filtering. We applied the methods to five Saccharomyces cerevisiae datasets related to environmental changes. The fast screening algorithm reduced running time by 98%. When compared with single study analysis, MetaLA and MetaMLA provided a stronger detection signal and more consistent and stable results. The top triplets are highly enriched in fundamental biological processes related to environmental changes. Our method can help biologists understand underlying regulatory mechanisms under different environmental exposure or disease states. A MetaLA R package, data and code for this article are available at http://tsenglab.biostat.pitt.edu/software.htm. Supplementary data are available at Bioinformatics online.
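For a single study, liquid association between two genes given a third scouting gene is commonly estimated as the mean of the three-way product of the standardized expression values. The sketch below shows that single-study estimate only. It is not the MetaLA or MetaMLA procedure, it omits the normal-score transformation usually applied to the scouting gene, and the function names and toy values are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant vector")
    return [(v - mu) / sd for v in values]


def liquid_association(x1, x2, x3):
    """Estimate LA(x1, x2 | x3) as the mean of the standardized triple product.

    A positive value suggests x1 and x2 become more positively correlated as
    the scouting gene x3 increases; a negative value suggests the opposite.
    """
    if not (len(x1) == len(x2) == len(x3)):
        raise ValueError("all three expression vectors must have equal length")
    a, b, c = zscore(x1), zscore(x2), zscore(x3)
    return mean([p * q * r for p, q, r in zip(a, b, c)])


# Hypothetical expression values: x1 and x2 agree when x3 is high
# and disagree when x3 is low.
scout = [-1.0, -1.0, 1.0, 1.0]
gene1 = [1.0, -1.0, 1.0, -1.0]
gene2 = [-1.0, 1.0, 1.0, -1.0]
toy_la = liquid_association(gene1, gene2, scout)
```

In this toy triplet the ordinary correlation between gene1 and gene2 is zero, which is the situation where pair-wise coexpression analysis misses the relationship and a liquid association score detects it.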
Analysis of the Macaca mulatta transcriptome and the sequence divergence between Macaca and human.
Magness, Charles L; Fellin, P Campion; Thomas, Matthew J; Korth, Marcus J; Agy, Michael B; Proll, Sean C; Fitzgibbon, Matthew; Scherer, Christina A; Miner, Douglas G; Katze, Michael G; Iadonato, Shawn P
2005-01-01
We report the initial sequencing and comparative analysis of the Macaca mulatta transcriptome. Cloned sequences from 11 tissues, nine animals, and three species (M. mulatta, M. fascicularis, and M. nemestrina) were sampled, resulting in the generation of 48,642 sequence reads. These data represent an initial sampling of the putative rhesus orthologs for 6,216 human genes. Mean nucleotide diversity within M. mulatta and sequence divergence among M. fascicularis, M. nemestrina, and M. mulatta are also reported.
An automated method for detecting alternatively spliced protein domains.
Coelho, Vitor; Sammeth, Michael
2018-06-01
Alternative splicing (AS) has been demonstrated to play a role in shaping eukaryotic gene diversity at the transcriptional level. However, the impact of AS on the proteome is still controversial. Studies that seek to explore the effect of AS at the proteomic level are hampered by technical difficulties in the cumbersome process of converting back and forth between genome, transcriptome and proteome space coordinates, and the naïve prediction of protein domains in the presence of AS suffers from many redundant sequence scans that emerge from constitutively spliced regions shared between alternative products of a gene. We developed the AstaFunk pipeline, which computes for every generic transcriptome all domains that are altered by AS events in a systematic and efficient manner. In a nutshell, our method employs Viterbi dynamic programming, which is guaranteed to find all score-optimal hits of the domains under consideration, while complementary optimisations at different levels avoid redundant and other irrelevant computations. We evaluate AstaFunk qualitatively and quantitatively using RNAseq in well-studied genes with AS, and at large scale using entire transcriptomes. Our study confirms complementary reports that the effect of most AS events on the proteome seems to be rather limited, but our results also pinpoint several cases where AS could have a major impact on the function of a protein domain. The JAVA implementation of AstaFunk is available as an open source project at http://astafunk.sammeth.net. Contact: micha@sammeth.net. Supplementary data are available at Bioinformatics online.
Chandrani, P; Kulkarni, V; Iyer, P; Upadhyay, P; Chaubal, R; Das, P; Mulherkar, R; Singh, R; Dutt, A
2015-06-09
Human papilloma virus (HPV) is the most common cause of virus-associated human cancers. Here, we describe the first graphic user interface (GUI)-based automated tool, 'HPVDetector', for non-computational biologists, exclusively for detection and annotation of the HPV genome based on next-generation sequencing data sets. We developed a custom-made reference genome that comprises human chromosomes along with the annotated genomes of 143 HPV types as pseudochromosomes. The tool runs in a dual mode as defined by the user: a 'quick mode' to identify the presence of HPV types and an 'integration mode' to determine the genomic location of the site of integration. The input data can be a paired-end whole-exome, whole-genome or whole-transcriptome data set. The HPVDetector is available in the public domain for download: http://www.actrec.gov.in/pi-webpages/AmitDutt/HPVdetector/HPVDetector.html. On the basis of our evaluation of 116 whole-exome, 23 whole-transcriptome and 2 whole-genome data sets, we were able to identify the presence of HPV in 20 exomes and 4 transcriptomes of cervical and head and neck cancer tumour samples. Using the inbuilt annotation module of HPVDetector, we found predominant integration of the viral gene E7, a known oncogene, at the known sites 17q21, 3q27, 7q35 and Xq28 and at novel sites of integration in the human genome. Furthermore, co-infection with high-risk HPVs such as 16 and 31 was found to be mutually exclusive compared with low-risk HPV71. HPVDetector is a simple yet precise and robust tool for detecting HPV from tumour samples using a variety of next-generation sequencing platforms including whole genome, whole exome and transcriptome. Two different modes (quick detection and integration mode) along with a GUI widen the usability of HPVDetector for biologists and clinicians with minimal computational knowledge.
Firmino, Alexandre Augusto Pereira; Fonseca, Fernando Campos de Assis; de Macedo, Leonardo Lima Pepino; Coelho, Roberta Ramos; Antonino de Souza, José Dijair; Togawa, Roberto Coiti; Silva-Junior, Orzenil Bonfim; Pappas, Georgios Joannis; da Silva, Maria Cristina Mattar; Engler, Gilbert; Grossi-de-Sa, Maria Fatima
2013-01-01
Cotton plants are subjected to the attack of several insect pests. In Brazil, the cotton boll weevil, Anthonomus grandis, is the most important cotton pest. The use of insecticidal proteins and gene silencing by RNA interference (RNAi) as techniques for insect control are promising strategies, which have been applied in the last few years. For this insect, little molecular information is available in databases. Using 454-pyrosequencing methodology, the transcriptome of all developmental stages of the insect pest, A. grandis, was analyzed. The A. grandis transcriptome analysis resulted in more than 500,000 reads and a data set of 20,841 high-quality contigs. After sequence assembly and annotation, around 10,600 contigs had at least one BLAST hit against the NCBI non-redundant protein database and 65.7% were similar to Tribolium castaneum sequences. A comparison of A. grandis, Drosophila melanogaster and Bombyx mori protein families' data showed higher similarity to dipteran than to lepidopteran sequences. Several contigs of genes encoding proteins involved in the RNAi mechanism were found. PAZ domain sequences extracted from the transcriptome showed high similarity and conservation for the most important functional and structural motifs when compared to PAZ domains from 5 species. Two SID-like contigs were phylogenetically analyzed and grouped with T. castaneum SID-like proteins. No RdRP gene was found. A contig matching chitin synthase 1 was mined from the transcriptome. dsRNA microinjection of a chitin synthase gene into A. grandis female adults resulted in normal oviposition of unviable eggs and malformed live larvae that were unable to develop on artificial diet. This is the first study that characterizes the transcriptome of the coleopteran, A. grandis. A new and representative transcriptome database for this insect pest is now available. All data support the state of the art of the RNAi mechanism in insects.
PMID:24386449
Moisá, Sonia J.; Shike, Daniel W.; Shoup, Lindsay; Rodriguez-Zas, Sandra L.; Loor, Juan J.
2015-01-01
In model organisms both the nutrition of the mother and the young offspring could induce long-lasting transcriptional changes in tissues. In livestock, such changes could have important roles in determining nutrient use and meat quality. The main objective was to evaluate if plane of maternal nutrition during late-gestation and weaning age alter the offspring’s Longissimus muscle (LM) transcriptome, animal performance, and metabolic hormones. Whole-transcriptome microarray analysis was performed on LM samples of early (EW) and normal weaned (NW) Angus × Simmental calves born to grazing cows receiving no supplement [low plane of nutrition (LPN)] or 2.3 kg high-grain mix/day [medium plane of nutrition (MPN)] during the last 105 days of gestation. Biopsies of LM were harvested at 78 (EW), 187 (NW) and 354 (before slaughter) days of age. Despite greater feed intake in MPN offspring, blood insulin was greater in LPN offspring. Carcass intramuscular fat content was greater in EW offspring. Bioinformatics analysis of the transcriptome highlighted a modest overall response to maternal plane of nutrition, resulting in only 35 differentially expressed genes (DEG). However, weaning age and a high-grain diet (EW) strongly impacted the transcriptome (DEG = 167), especially causing a lipogenic program activation. In addition, between 78 and 187 days of age, EW steers had an activation of the innate immune system due presumably to macrophage infiltration of intramuscular fat. Between 187 and 354 days of age (the “finishing” phase), NW steers had an activation of the lipogenic transcriptome machinery, while EW steers had a clear inhibition through the epigenetic control of histone acetylases. Results underscored the need to conduct further studies to understand better the functional outcome of transcriptome changes induced in the offspring by pre- and post-natal nutrition. Additional knowledge on molecular and functional outcomes would help produce more efficient beef cattle. 
PMID:26153887
Babineau, Marielle; Mahmood, Khalid; Mathiassen, Solvejg K; Kudsk, Per; Kristensen, Michael
2017-02-06
Loose silky bentgrass (Apera spica-venti) is an important weed in Europe with a recent increase in herbicide resistance cases. The lack of genetic information about this noxious weed limits its biological understanding such as growth, reproduction, genetic variation, molecular ecology and metabolic herbicide resistance. This study produced a reference transcriptome for A. spica-venti from different tissues (leaf, root, stem) and various growth stages (seed at phenological stages 05, 07, 08, 09). The de novo assembly was performed on individual and combined dataset followed by functional annotations. Individual transcripts and gene families involved in metabolic based herbicide resistance were identified. Eight separate transcriptome assemblies were performed and compared. The combined transcriptome assembly consists of 83,349 contigs with an N50 and average contig length of 762 and 658 bp, respectively. This dataset contains 74,724 transcripts consisting of total 54,846,111 bp. Among them 94% had a homologue to UniProtKB, 73% retrieved a GO mapping, and 50% were functionally annotated. Compared with other grass species, A. spica-venti has 26% proteins in common to Brachypodium distachyon, and 41% to Lolium spp. Glycosyltransferases had the highest number of transcripts in each tissue followed by the cytochrome P450s. The GSTF1 and CYP89A2 transcripts were recovered from the majority of tissues and aligned at a maximum of 66 and 30% to proven herbicide resistant allele from Alopecurus myosuroides and Lolium rigidum, respectively. De novo transcriptome assembly enabled the generation of the first reference transcriptome of A. spica-venti. This can serve as stepping stone for understanding the metabolic herbicide resistance as well as the general biology of this problematic weed. Furthermore, this large-scale sequence data is a valuable scientific resource for comparative transcriptome analysis for Poaceae grasses.
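The abstract above reports an N50 and an average contig length for the combined assembly. As a reminder of how these summary statistics are conventionally computed, here is a minimal Python sketch using the usual definition of N50 (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled bases). The function names and the toy lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


def mean_length(lengths):
    """Return the arithmetic mean contig length."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no contig lengths given")
    return sum(lengths) / len(lengths)


# Toy example with made-up contig lengths.
toy = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy)
toy_mean = mean_length(toy)
```

Because N50 weights long contigs more heavily than the mean does, an assembly can have an N50 above its average contig length, as in the figures reported above (762 bp versus 658 bp).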
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2011-01-01
Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
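The network construction described above can be illustrated with a small Python sketch. It is not the authors' pipeline: the abstract mentions a gene-to-gene weighted correlation coefficient and a module clustering step without giving details, so this sketch substitutes plain Pearson correlation, a hard threshold (the 0.8 default is an arbitrary assumption), and connected components as stand-in modules. The function name coexpression_modules and the toy gene labels are hypothetical.

```python
import numpy as np


def coexpression_modules(expr, genes, threshold=0.8):
    """Group genes into modules of positively co-expressed genes.

    expr: 2-D array, one row per gene and one column per sample.
    genes: gene identifiers, in the same order as the rows of expr.
    Returns a list of modules, each a sorted list of gene identifiers.
    """
    corr = np.corrcoef(np.asarray(expr, dtype=float))
    adjacency = corr >= threshold  # keep only strong positive correlation
    np.fill_diagonal(adjacency, False)

    seen, modules = set(), []
    for start in range(len(genes)):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            component.append(genes[i])
            for j in map(int, np.flatnonzero(adjacency[i])):
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        modules.append(sorted(component))
    return modules


toy_expr = [
    [1.0, 2.0, 3.0, 4.0],    # g1
    [2.0, 4.0, 6.0, 8.1],    # g2 tracks g1 closely
    [4.0, 3.0, 2.0, 1.0],    # g3 is anti-correlated with g1
    [1.0, -1.0, -1.0, 1.0],  # g4 is uncorrelated with g1
]
toy_modules = coexpression_modules(toy_expr, ["g1", "g2", "g3", "g4"])
```

Connected components are a deliberately crude choice; with real GeneChip data a dedicated clustering method would usually give more meaningful modules.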
Xu, Ning; Zhao, Hong-Yan; Yin, Yin; Shen, Shan-Shan; Shan, Lin-Lin; Chen, Chuan-Xi; Zhang, Yan-Xia; Gao, Jian-Fang; Ji, Xiang
2017-04-21
We conducted an omics analysis of the venom of Naja kaouthia from China. Proteomics analysis revealed six protein families [three-finger toxins (3-FTx), phospholipase A2 (PLA2), nerve growth factor, snake venom metalloproteinase (SVMP), cysteine-rich secretory protein and ohanin], and venom-gland transcriptomics analysis revealed 28 protein families from 79 unigenes. 3-FTx (56.5% in proteome/82.0% in transcriptome) and PLA2 (26.9%/13.6%) were identified as the most abundant families in the venom proteome and venom-gland transcriptome. Furthermore, N. kaouthia venom expressed strong lethality (i.p. LD50: 0.79 μg/g) and myotoxicity (CK: 5939 U/l) in mice, and showed notable PLA2 activity but weak SVMP, L-amino acid oxidase or 5' nucleotidase activity. Antivenomic assessment revealed that several venom components (nearly 17.5% of total venom) from N. kaouthia could not be thoroughly immunocaptured by commercial Naja atra antivenom. ELISA analysis revealed that there was no difference in the cross-reaction between N. kaouthia and N. atra venoms against the N. atra antivenom. The use of commercial N. atra antivenom in the treatment of snakebites caused by N. kaouthia is reasonable, but the design of novel antivenom with attention to enhancing the immune response to non-immunocaptured components should be encouraged. The venomics, antivenomics and venom-gland transcriptome of the monocled cobra (Naja kaouthia) from China have been elucidated. Quantitative and qualitative differences are evident when venom proteomic and venom-gland transcriptomic profiles are compared. Two protein families (3-FTx and PLA2) are found to be the predominant components in N. kaouthia venom, and are considered the major players in the functional role of the venom. Other protein families with relatively low abundance appear to be of minor functional significance. Antivenomics and ELISA evaluation reveal that the N. kaouthia venom can be effectively immunorecognized by commercial N. atra antivenom, but a small number of venom components still could not be thoroughly immunocaptured. The findings indicate that the precise composition of snake venom should be explored with an integrated omics approach, and that elucidating the venom composition is helpful in understanding composition-function relationships and will facilitate the clinical application of antivenoms. Copyright © 2017 Elsevier B.V. All rights reserved.
How to normalize metatranscriptomic count data for differential expression analysis.
Klingenberg, Heiner; Meinicke, Peter
2017-01-01
Differential expression analysis on the basis of RNA-Seq count data has become a standard tool in transcriptomics. Several studies have shown that prior normalization of the data is crucial for a reliable detection of transcriptional differences. Until now it has not been clear whether and how the transcriptomic approach can be used for differential expression analysis in metatranscriptomics. We propose a model for differential expression in metatranscriptomics that explicitly accounts for variations in the taxonomic composition of transcripts across different samples. As a main consequence the correct normalization of metatranscriptomic count data under this model requires the taxonomic separation of the data into organism-specific bins. Then the taxon-specific scaling of organism profiles yields a valid normalization and allows us to recombine the scaled profiles into a metatranscriptomic count matrix. This matrix can then be analyzed with statistical tools for transcriptomic count data. For taxon-specific scaling and recombination of scaled counts we provide a simple R script. When applying transcriptomic tools for differential expression analysis directly to metatranscriptomic data with an organism-independent (global) scaling of counts the resulting differences may be difficult to interpret. The differences may correspond to changing functional profiles of the contributing organisms but may also result from a variation of taxonomic abundances. Taxon-specific scaling eliminates this variation and therefore the resulting differences actually reflect a different behavior of organisms under changing conditions. In simulation studies we show that the divergence between results from global and taxon-specific scaling can be drastic. In particular, the variation of organism abundances can imply a considerable increase of significant differences with global scaling. 
Also, on real metatranscriptomic data, the predictions from taxon-specific and global scaling can differ widely. Our studies indicate that in real data applications performed with global scaling it might be impossible to distinguish between differential expression in terms of transcriptomic changes and differential composition in terms of changing taxonomic proportions. As in transcriptomics, a proper normalization of count data is also essential for differential expression analysis in metatranscriptomics. Our model implies a taxon-specific scaling of counts for normalization of the data. The application of taxon-specific scaling consequently removes taxonomic composition variations from functional profiles and therefore provides a clear interpretation of the observed functional differences.
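The taxon-specific scaling described in this abstract can be sketched in a few lines of Python. This is not the authors' R script. It assumes that the counts have already been separated into organism-specific bins, and it uses simple total-count scaling within each taxon (each sample's column is rescaled to that taxon's mean library size) as a stand-in for whatever scaling factor the original script applies. The function name and the toy matrices are hypothetical.

```python
import numpy as np


def taxon_specific_scaling(counts_by_taxon):
    """Scale each taxon's count matrix separately, then recombine.

    counts_by_taxon: dict mapping taxon name to a 2-D array of counts
    (rows = functional features, columns = samples; identical shapes).
    Returns the recombined feature-by-sample matrix of scaled counts.
    """
    combined = None
    for taxon, counts in counts_by_taxon.items():
        counts = np.asarray(counts, dtype=float)
        library_sizes = counts.sum(axis=0)  # one total per sample
        if np.any(library_sizes == 0):
            raise ValueError(f"taxon {taxon!r} has an empty sample")
        # ASSUMPTION: total-count scaling to the taxon's mean library size.
        scaled = counts * (library_sizes.mean() / library_sizes)
        combined = scaled if combined is None else combined + scaled
    return combined


# Toy data: taxon A doubles in abundance in sample 2 with an unchanged
# functional profile, while taxon B is constant.
toy_counts = {
    "taxon_A": [[10, 20], [30, 60]],
    "taxon_B": [[5, 5], [5, 5]],
}
toy_scaled = taxon_specific_scaling(toy_counts)
```

In the toy data only the abundance of taxon A changes between samples, so after taxon-specific scaling the two sample columns coincide, whereas a single global scaling of the summed counts would leave an apparent functional difference.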
On the way toward systems biology of Aspergillus fumigatus infection.
Albrecht, Daniela; Kniemeyer, Olaf; Mech, Franziska; Gunzer, Matthias; Brakhage, Axel; Guthke, Reinhard
2011-06-01
Pathogenicity of Aspergillus fumigatus is multifactorial. Thus, global studies are essential for the understanding of the infection process. Therefore, a data warehouse was established where genome sequence, transcriptome and proteome data are stored. These data are analyzed for the elucidation of virulence determinants. The data analysis workflow starts with pre-processing, including imputation of missing values and normalization. The last step is the identification of differentially expressed genes/proteins as interesting candidates for further analysis, in particular for functional categorization and correlation studies. Sequence data and other prior knowledge extracted from databases are integrated to support the inference of gene regulatory networks associated with pathogenicity. This knowledge-assisted data analysis aims at establishing mathematical models with predictive strength to assist further experimental work. Recently, first steps were taken to extend the integrative data analysis and computational modeling by evaluating spatio-temporal data (movies) that monitor interactions of A. fumigatus morphotypes (e.g. conidia) with host immune cells. Copyright © 2011 Elsevier GmbH. All rights reserved.
USDA-ARS's Scientific Manuscript database
The yeast, Metschnikowia fructicola, is an antagonist with biological control activity against postharvest diseases of several fruits. We performed a transcriptome analysis, using RNA-Seq technology, to examine the response of M. fructicola with citrus fruit and with the postharvest pathogen, Penic...
J. D. Tang; L. A. Parker; A. D. Perkins; T. S. Sonstegard; S. G. Schroeder; D. D. Nicholas; S. V. Diehl
2013-01-01
High-throughput transcriptomics was used to identify Fibroporia radiculosa genes that were differentially regulated during colonization of wood treated with a copper-based preservative. The transcriptome was profiled at two time points while the fungus was growing on wood treated with micronized copper quat (MCQ). A total of 917 transcripts were...
Toh, Su San; Treves, David S; Barati, Michelle T; Perlin, Michael H
2016-10-01
Microbotryum lychnidis-dioicae is a member of a species complex infecting host plants in the Caryophyllaceae. It is used as a model system in many areas of research, but attempts to make this organism tractable for reverse genetic approaches have not been fruitful. Here, we exploited the recently obtained genome sequence and transcriptome analysis to inform our design of constructs for use in Agrobacterium-mediated transformation techniques currently available for other fungi. Reproducible transformation was demonstrated at the genomic, transcriptional and functional levels. Moreover, these initial proof-of-principle experiments provide evidence that supports the findings from initial global transcriptome analysis regarding expression from the respective promoters under different growth conditions of the fungus. The technique thus provides for the first time the ability to stably introduce transgenes and over-express target M. lychnidis-dioicae genes.
Guo, Li; Breakspear, Andrew; Zhao, Guoyi; Gao, Lixin; Kistler, H Corby; Xu, Jin-Rong; Ma, Li-Jun
2016-02-01
The cyclic adenosine monophosphate-protein kinase A (cAMP-PKA) pathway is a central signalling cascade that transmits extracellular stimuli and governs cell responses through the second messenger cAMP. The importance of cAMP signalling in fungal biology has been well documented and the key conserved components, adenylate cyclase (AC) and the catalytic subunit of PKA (CPKA), have been functionally characterized. However, other genes involved in this signalling pathway and their regulation are not well understood in filamentous fungi. Here, we performed a comparative transcriptomics analysis of AC and CPKA mutants in two closely related fungi: Fusarium graminearum (Fg) and F. verticillioides (Fv). Combining available Fg transcriptomics and phenomics data, we reconstructed the Fg cAMP signalling pathway. We developed a computational program that combines sequence conservation and patterns of orthologous gene expression to facilitate global transcriptomics comparisons between different organisms. We observed highly correlated expression patterns for most orthologues (80%) between Fg and Fv. We also identified a subset of 482 (6%) diverged orthologues, whose expression under all conditions was at least 50% higher in one genome than in the other. This enabled us to dissect the conserved and unique portions of the cAMP-PKA pathway. Although the conserved portions controlled essential functions, such as metabolism, the cell cycle, chromatin remodelling and the oxidative stress response, the diverged portions had species-specific roles, such as the production and detoxification of secondary metabolites unique to each species. The evolution of the cAMP-PKA signalling pathway seems to have contributed directly to fungal divergence and niche adaptation. © 2015 The Authors. Molecular Plant Pathology published by British Society for Plant Pathology and John Wiley & Sons Ltd.
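The two orthologue criteria mentioned in this abstract (correlated expression patterns, and expression at least 50% higher in one genome than in the other under all conditions) can be illustrated with a short Python sketch. This is not the authors' program, which also uses sequence conservation. The function name classify_orthologues, the input layout and the toy values are assumptions made for illustration.

```python
import numpy as np


def classify_orthologues(expr_a, expr_b, fold=1.5):
    """Compare expression of orthologous gene pairs across conditions.

    expr_a, expr_b: 2-D arrays with one row per orthologue pair and one
    column per matched condition (species A and species B).
    Returns a list of (pearson_r, diverged) tuples, where diverged is
    True when one species is at least `fold` times higher than the
    other under every condition.
    """
    a = np.asarray(expr_a, dtype=float)
    b = np.asarray(expr_b, dtype=float)
    results = []
    for row_a, row_b in zip(a, b):
        r = float(np.corrcoef(row_a, row_b)[0, 1])
        diverged = bool(
            np.all(row_a >= fold * row_b) or np.all(row_b >= fold * row_a)
        )
        results.append((r, diverged))
    return results


# Toy data: pair 0 is similar in both species, pair 1 is consistently
# higher in species A.
toy_a = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
toy_b = [[1.1, 2.1, 2.9], [2.0, 5.0, 4.0]]
toy_result = classify_orthologues(toy_a, toy_b)
```

The fold argument of 1.5 corresponds to the "at least 50% higher" criterion quoted in the abstract; the expression values would need to be on a comparable, non-logarithmic scale for that ratio to be meaningful.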
Ochsner, Scott A; Tsimelzon, Anna; Dong, Jianrong; Coarfa, Cristian; McKenna, Neil J
2016-08-01
The pregnane X receptor (PXR) (PXR/NR1I2) and constitutive androstane receptor (CAR) (CAR/NR1I3) members of the nuclear receptor (NR) superfamily of ligand-regulated transcription factors are well-characterized mediators of xenobiotic and endocrine-disrupting chemical signaling. The Nuclear Receptor Signaling Atlas maintains a growing library of transcriptomic datasets involving perturbations of NR signaling pathways, many of which involve perturbations relevant to PXR and CAR xenobiotic signaling. Here, we generated a reference transcriptome based on the frequency of differential expression of genes across 159 experiments compiled from 22 datasets involving perturbations of CAR and PXR signaling pathways. In addition to the anticipated overrepresentation in the reference transcriptome of genes encoding components of the xenobiotic stress response, the ranking of genes involved in carbohydrate metabolism and gonadotropin action sheds mechanistic light on the suspected role of xenobiotics in metabolic syndrome and reproductive disorders. Gene Set Enrichment Analysis showed that although acetaminophen, chlorpromazine, and phenobarbital impacted many similar gene sets, differences in direction of regulation were evident in a variety of processes. Strikingly, gene sets representing genes linked to Parkinson's, Huntington's, and Alzheimer's diseases were enriched in all 3 transcriptomes. The reference xenobiotic transcriptome will be supplemented with additional future datasets to provide the community with a continually updated reference transcriptomic dataset for CAR- and PXR-mediated xenobiotic signaling. Our study demonstrates how aggregating and annotating transcriptomic datasets, and making them available for routine data mining, facilitates research into the mechanisms by which xenobiotics and endocrine-disrupting chemicals subvert conventional NR signaling modalities. PMID:27409825
Lee, Hyun Jae; Georgiadou, Athina; Otto, Thomas D; Levin, Michael; Coin, Lachlan J; Conway, David J; Cunnington, Aubrey J
2018-06-01
Transcriptomics, the analysis of genome-wide RNA expression, is a common approach to investigate host and pathogen processes in infectious diseases. Technical and bioinformatic advances have permitted increasingly thorough analyses of the association of RNA expression with fundamental biology, immunity, pathogenesis, diagnosis, and prognosis. Transcriptomic approaches can now be used to realize a previously unattainable goal, the simultaneous study of RNA expression in host and pathogen, in order to better understand their interactions. This exciting prospect is not without challenges, especially as focus moves from interactions in vitro under tightly controlled conditions to tissue- and systems-level interactions in animal models and natural and experimental infections in humans. Here we review the contribution of transcriptomic studies to the understanding of malaria, a parasitic disease which has exerted a major influence on human evolution and continues to cause a huge global burden of disease. We consider malaria a paradigm for the transcriptomic assessment of systemic host-pathogen interactions in humans, because much of the direct host-pathogen interaction occurs within the blood, a readily sampled compartment of the body. We illustrate lessons learned from transcriptomic studies of malaria and how these lessons may guide studies of host-pathogen interactions in other infectious diseases. We propose that the potential of transcriptomic studies to improve the understanding of malaria as a disease remains partly untapped because of limitations in study design rather than as a consequence of technological constraints. Further advances will require the integration of transcriptomic data with analytical approaches from other scientific disciplines, including epidemiology and mathematical modeling. Copyright © 2018 Lee et al. PMID:29695497
Urbarova, Ilona; Karlsen, Bård Ove; Okkenhaug, Siri; Seternes, Ole Morten; Johansen, Steinar D.; Emblem, Åse
2012-01-01
Marine bioprospecting is the search for new marine bioactive compounds and large-scale screening in extracts represents the traditional approach. Here, we report an alternative complementary protocol, called digital marine bioprospecting, based on deep sequencing of transcriptomes. We sequenced the transcriptomes from the adult polyp stage of two cold-water sea anemones, Bolocera tuediae and Hormathia digitata. We generated approximately 1.1 million quality-filtered sequencing reads by 454 pyrosequencing, which were assembled into approximately 120,000 contigs and 220,000 single reads. Based on annotation and gene ontology analysis we profiled the expressed mRNA transcripts according to known biological processes. As a proof-of-concept we identified polypeptide toxins with a potential blocking activity on sodium and potassium voltage-gated channels from digital transcriptome libraries. PMID:23170083
Jeon, Jin; Kim, Jae Kwang; Kim, HyeRan; Kim, Yeon Jeong; Park, Yun Ji; Kim, Sun Ju; Kim, Changsoo; Park, Sang Un
2018-02-15
Kale (Brassica oleracea var. acephala) is a rich source of numerous health-benefiting compounds, including vitamins, glucosinolates, phenolic compounds, and carotenoids. However, the genetic resources for exploiting the phyto-nutritional traits of kales are limited. To acquire precise information on secondary metabolites in kales, we performed a comprehensive analysis of the transcriptome and metabolome of green and red kale seedlings. Kale transcriptome datasets revealed 37,149 annotated genes and several secondary metabolite biosynthetic genes. HPLC analysis revealed 14 glucosinolates, 20 anthocyanins, 3 phenylpropanoids, and 6 carotenoids in the kale seedlings that were examined. Red kale contained more glucosinolates, anthocyanins, and phenylpropanoids than green kale, whereas the carotenoid contents were much higher in green kale than in red kale. Ultimately, our data will be a valuable resource for future research on kale bio-engineering and will provide basic information to define gene-to-metabolite networks in kale. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dhanasekaran, Saravana M.; Balbin, O. Alejandro; Chen, Guoan; Nadal, Ernest; Kalyana-Sundaram, Shanker; Pan, Jincheng; Veeneman, Brendan; Cao, Xuhong; Malik, Rohit; Vats, Pankaj; Wang, Rui; Huang, Stephanie; Zhong, Jinjie; Jing, Xiaojun; Iyer, Matthew; Wu, Yi-Mi; Harms, Paul W.; Lin, Jules; Reddy, Rishindra; Brennan, Christine; Palanisamy, Nallasivam; Chang, Andrew C.; Truini, Anna; Truini, Mauro; Robinson, Dan R.; Beer, David G.; Chinnaiyan, Arul M.
2014-01-01
Lung cancer is emerging as a paradigm for disease molecular subtyping, facilitating targeted therapy based on driving somatic alterations. Here, we perform transcriptome analysis of 153 samples representing lung adenocarcinomas, squamous cell carcinomas, large cell lung cancer, adenoid cystic carcinomas and cell lines. By integrating our data with The Cancer Genome Atlas and published sources, we analyze 753 lung cancer samples for gene fusions and other transcriptomic alterations. We show that a higher number of gene fusions is an independent prognostic factor for poor survival in lung cancer. Our analysis confirms the recently reported CD74-NRG1 fusion and suggests that NRG1, NF1 and Hippo pathway fusions may play important roles in tumors without known driver mutations. In addition, we observe exon skipping events in c-MET, which are attributable to splice site mutations. These classes of genetic aberrations may play a significant role in the genesis of lung cancers lacking known driver mutations. PMID:25531467
Liao, Qiwen; Li, Shengnan; Siu, Shirley Weng In; Yang, Binrui; Huang, Chen; Chan, Judy Yuet-Wa; Morlighem, Jean-Étienne R L; Wong, Clarence Tsun Ting; Rádis-Baptista, Gandhi; Lee, Simon Ming-Yuen
2018-02-02
Palythoa caribaeorum (class Anthozoa) is a zoanthid that, together with jellyfish, hydra, and sea anemones, which are venomous and predatory, belongs to the phylum Cnidaria. The distinguishing feature of these marine animals is the presence of cnidocytes in the body tissues, which are responsible for producing and injecting toxins used mainly for prey capture and defense. In contrast to those of other anthozoans, the toxin cocktails of zoanthids have been scarcely studied and are poorly known. Here, on the basis of the analysis of the P. caribaeorum transcriptome, numerous predicted venom-featured polypeptides were identified, including allergens, neurotoxins, membrane-active peptides, and Kunitz-like peptides (PcKuz). The three predicted PcKuz isotoxins (1-3) were selected for functional studies. Through computational processing comprising structural phylogenetic analysis, molecular docking, and dynamics simulation, PcKuz3 was shown to be a potential voltage-gated potassium-channel inhibitor. PcKuz3 fitted well as a new functional Kunitz-type toxin with strong antilocomotor activity, as assessed in vivo in zebrafish larvae, and a weak inhibitory effect toward proteases, as evaluated in vitro. Notably, at low concentration PcKuz3 can suppress the 6-OHDA-induced neurotoxicity on the locomotive behavior of zebrafish, which indicates that PcKuz3 may have a neuroprotective effect. Taken together, PcKuz3 figures as a novel neurotoxin structure that differs from known homologous peptides expressed in sea anemones. Moreover, the novel PcKuz3 provides an insightful hint for biodrug development for the prospective treatment of neurodegenerative diseases.
Hwang, Dong-Gyu; Park, June Hyun; Lim, Jae Yun; Kim, Donghyun; Choi, Yourim; Kim, Soyoung; Reeves, Gregory; Yeom, Seon-In; Lee, Jeong-Soo; Park, Minkyu; Kim, Seungill; Choi, Ik-Young; Choi, Doil; Shin, Chanseok
2013-01-01
MicroRNAs (miRNAs) are a class of non-coding RNAs approximately 21 nt in length which play important roles in regulating gene expression in plants. Although many miRNA studies have focused on a few model plants, miRNAs and their target genes remain largely unknown in hot pepper (Capsicum annuum), one of the most important crops cultivated worldwide. Here, we employed high-throughput sequencing technology to identify miRNAs in pepper extensively from 10 different libraries, including leaf, stem, root, flower, and six developmental stage fruits. Based on a bioinformatics pipeline, we successfully identified 29 and 35 families of conserved and novel miRNAs, respectively. Northern blot analysis was used to validate further the expression of representative miRNAs and to analyze their tissue-specific or developmental stage-specific expression patterns. Moreover, we computationally predicted miRNA targets, many of which were experimentally confirmed using 5' rapid amplification of cDNA ends analysis. One of the validated novel targets of miR-396 was a domain rearranged methyltransferase, the major de novo methylation enzyme, involved in RNA-directed DNA methylation in plants. This work provides the first reliable draft of the pepper miRNA transcriptome. It offers an expanded picture of pepper miRNAs in relation to other plants, providing a basis for understanding the functional roles of miRNAs in pepper.
Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.
Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong
2014-05-01
We characterized the mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of the mango transcriptome comprised >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. Blast searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as the closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Groups terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out the active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of the mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions, respectively. Out of 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed the cp genome of C. sinensis as the closest neighbor of mango. We found 51 short repeats in the mango cp genome, which are thought to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.
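As a quick consistency check on the figures quoted in the mango abstract, the short sketch below recomputes the reported percentages and the combined length of the two single copy regions. The variable names are illustrative. The denominators are assumptions: the abstract does not say what the 37% is relative to, although the annotated unigenes give a matching value.

```python
# Figures taken directly from the abstract.
total_unigenes = 30_509
annotated_unigenes = 24_593
citrus_matches = 9_141
cp_genome_bp = 151_173
inverted_repeat_bp = 27_093  # length of each of the two inverted repeats

# Share of unigenes that received an annotation (reported as 80%).
annotated_pct = 100 * annotated_unigenes / total_unigenes

# Share of annotated unigenes matching Citrus sinensis (reported as 37%);
# using annotated unigenes as the denominator is an assumption.
citrus_pct = 100 * citrus_matches / annotated_unigenes

# Bases left for the small and large single copy regions combined,
# after subtracting both inverted repeats from the genome size.
single_copy_bp = cp_genome_bp - 2 * inverted_repeat_bp

print(f"annotated: {annotated_pct:.1f}%")
print(f"Citrus matches: {citrus_pct:.1f}%")
print(f"single copy regions combined: {single_copy_bp} bp")
```

The abstract gives no individual lengths for the small and large single copy regions, so only their sum can be derived here.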
Hyun, Tae Kyung; Lee, Sarah; Kumar, Dhinesh; Rim, Yeonggil; Kumar, Ritesh; Lee, Sang Yeol; Lee, Choong Hwan; Kim, Jae-Yean
2014-10-01
Using Illumina sequencing technology, we have generated large-scale transcriptome sequencing data containing abundant information on genes involved in the metabolic pathways in R. idaeus cv. Nova fruits. Rubus idaeus (red raspberry) is an economically important crop that contains numerous nutrients, micronutrients and phytochemicals with essential health benefits to humans. The molecular mechanism underlying the ripening process and phytochemical biosynthesis in red raspberry is attributed to changes in gene expression, but very limited transcriptomic and genomic information is available in public databases. To address this issue, we generated more than 51 million sequencing reads from R. idaeus cv. Nova fruit using Illumina RNA-Seq technology. After de novo assembly, we obtained 42,604 unigenes with an average length of 812 bp. At the protein level, the Nova fruit transcriptome showed 77 and 68 % sequence similarities with Rubus coreanus and Fragaria vesca, respectively, indicating the evolutionary relationship between them. In addition, 69 % of assembled unigenes were annotated using public databases including the NCBI non-redundant, Clusters of Orthologous Groups and Gene Ontology databases, suggesting that our transcriptome dataset provides a valuable resource for investigating metabolic processes in red raspberry. To analyze the relationship between several novel transcripts and the amounts of metabolites such as γ-aminobutyric acid and anthocyanins, real-time PCR and target metabolite analysis were performed on two different ripening stages of Nova. This is the first attempt to use the Illumina sequencing platform for RNA sequencing and de novo assembly of Nova fruit without a reference genome. Our data provide the most comprehensive transcriptome resource available for Rubus fruits, and will be useful for understanding the ripening process and for breeding R. idaeus cultivars with improved fruit quality.
Luo, Hui; Xiao, Shijun; Ye, Hua; Zhang, Zhengshi; Lv, Changhuan; Zheng, Shuming; Wang, Zhiyong; Wang, Xiaoqing
2016-01-01
Schizothorax prenanti (S. prenanti) is mainly distributed in the upstream regions of the Yangtze River and its tributaries in China. This species is indigenous and commercially important. However, in recent years, wild populations and aquacultures have faced the serious challenges of germplasm variation loss and an increased susceptibility to a range of pathogens. Currently, the genetics and immune mechanisms of S. prenanti are unknown, partly due to a lack of genome and transcriptome information. Here, we sought to identify genes related to immune functions and to identify molecular markers to study the function of these genes and for trait mapping. To this end, the transcriptome from spleen tissues of S. prenanti was sequenced and analyzed. Using paired-end reads from the Illumina Hiseq2500 platform, 48,517 transcripts were isolated from the spleen transcriptome. These transcripts could be clustered into 37,785 unigenes with an N50 length of 2,539 bp. The majority of the unigenes (35,653, 94.4%) were successfully annotated using the non-redundant nucleotide (nt), non-redundant protein (nr), Swiss-Prot, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. KEGG pathway assignment identified more than 500 immune-related genes. Furthermore, 7,545 putative simple sequence repeats (SSRs), 857,535 single nucleotide polymorphisms (SNPs), and 53,481 insertions/deletions (InDels) were detected from the transcriptome. This is the first reported high-throughput transcriptome analysis of S. prenanti, and it provides valuable genetic resources for the investigation of immune mechanisms, conservation of germplasm, and molecular marker-assisted breeding of S. prenanti.
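Several abstracts in this list report an N50 length for their assemblies. For readers unfamiliar with the statistic, the following minimal sketch computes it from a list of sequence lengths under the usual definition: the length L at which sequences of length L or longer contain at least half of all assembled bases. The function name `n50` and the toy lengths are illustrative and are not taken from any of the cited studies.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-empty input of positive lengths


# Toy unigene lengths in bp (total 2100, so half is 1050).
toy_lengths = [100, 200, 300, 400, 500, 600]
print(n50(toy_lengths))
```

Because N50 is weighted by length, a few very long unigenes can raise it substantially, which is why it is normally reported alongside the mean length and the number of sequences.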
Dou, Wei; Shen, Guang-Mao; Niu, Jin-Zhi; Ding, Tian-Bo; Wei, Dan-Dan; Wang, Jin-Jun
2013-01-01
Background: Recent studies indicate that infestations of psocids pose a new risk for global food security. Among the psocid species, Liposcelis bostrychophila Badonnel has gained recognition in importance because of its parthenogenetic reproduction, rapid adaptation, and increased worldwide distribution. To date, the molecular data available for L. bostrychophila are largely limited to genes identified through homology. Also, no transcriptome data relevant to psocid infection are available. Methodology and Principal Findings: In this study, we generated a de novo assembly of the L. bostrychophila transcriptome using short-read sequencing technology (Illumina). In a single run, we obtained more than 51 million sequencing reads that were assembled into 60,012 unigenes (mean size = 711 bp) by Trinity. The transcriptome sequences from different developmental stages of L. bostrychophila, including egg, nymph and adult, were annotated with the non-redundant (Nr) protein database, gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). The analysis revealed three major enzyme families involved in insecticide metabolism as differentially expressed in the L. bostrychophila transcriptome. A total of 49 P450-, 31 GST- and 21 CES-specific genes representing the three enzyme families were identified. In addition, 16 transcripts were found to contain target site sequences of resistance genes. Furthermore, we profiled gene expression patterns upon insecticide (malathion and deltamethrin) exposure using the tag-based digital gene expression (DGE) method. Conclusion: The L. bostrychophila transcriptome and DGE data provide gene expression data that would further our understanding of molecular mechanisms in psocids. In particular, the findings of this investigation will facilitate the identification of genes involved in insecticide resistance and the design of new compounds for control of psocids. PMID:24278202
Sun, Li Xue; Teng, Jian; Zhao, Yan; Li, Ning; Wang, Hui
2018-01-01
Background: Nowadays, the molecular mechanisms governing TSD (temperature-dependent sex determination) or GSD + TE (genotypic sex determination + temperature effects) remain a mystery in fish. Methods: We developed three all-female families of Nile tilapia (Oreochromis niloticus), and the family with the highest male ratio after high-temperature treatment was used for transcriptome analysis. Results: First, gonadal histology analysis indicated that the histological morphology of control females (CF) was not significantly different from that of high-temperature-treated females (TF) at various development stages. However, the high-temperature treatment caused a lag of spermatogenesis in high-temperature-induced neomales (IM). Next, we sequenced the transcriptome of CF, TF, and IM Nile tilapia. 79, 11,117, and 11,000 differentially expressed genes (DEGs) were detected in the CF–TF, CF–IM, and TF–IM comparisons, respectively, and 44 DEGs showed identical expression changes in the CF–TF and CF–IM comparisons. Principal component analysis (PCA) indicated that three individuals in CF and three individuals in TF formed a cluster, and three individuals in IM formed a distinct cluster, which confirmed that the gonad transcriptome profile of TF was similar to that of CF and different from that of IM. Finally, six sex-related genes were validated by qRT-PCR. Conclusions: This study identifies a number of genes that may be involved in GSD + TE, which will be useful for investigating the molecular mechanisms of TSD or GSD + TE in fish. PMID:29495590
Sun, Li Xue; Teng, Jian; Zhao, Yan; Li, Ning; Wang, Hui; Ji, Xiang Shan
2018-02-28
Nowadays, the molecular mechanisms governing TSD (temperature-dependent sex determination) or GSD + TE (genotypic sex determination + temperature effects) remain a mystery in fish. We developed three all-female families of Nile tilapia (Oreochromis niloticus), and the family with the highest male ratio after high-temperature treatment was used for transcriptome analysis. First, gonadal histology analysis indicated that the histological morphology of control females (CF) was not significantly different from that of high-temperature-treated females (TF) at various development stages. However, the high-temperature treatment caused a lag of spermatogenesis in high-temperature-induced neomales (IM). Next, we sequenced the transcriptome of CF, TF, and IM Nile tilapia. 79, 11,117, and 11,000 differentially expressed genes (DEGs) were detected in the CF-TF, CF-IM, and TF-IM comparisons, respectively, and 44 DEGs showed identical expression changes in the CF-TF and CF-IM comparisons. Principal component analysis (PCA) indicated that three individuals in CF and three individuals in TF formed a cluster, and three individuals in IM formed a distinct cluster, which confirmed that the gonad transcriptome profile of TF was similar to that of CF and different from that of IM. Finally, six sex-related genes were validated by qRT-PCR. This study identifies a number of genes that may be involved in GSD + TE, which will be useful for investigating the molecular mechanisms of TSD or GSD + TE in fish.
Gaur, Mahendra; Das, Aradhana; Sahoo, Rajesh Kumar; Mohanty, Sujata; Joshi, Raj Kumar; Subudhi, Enketeswara
2016-09-01
Ginger (Zingiber officinale Rosc.), a well-known member of the family Zingiberaceae, has a number of medicinal properties that are attributed to the secondary metabolites, essential oil and oleoresin, contained in its rhizome. Its drug-yielding potential is known to depend on the agro-climatic conditions prevailing at the place of cultivation. The present study deals with comparative transcriptome analysis of two samples of the elite ginger variety Suprabha collected from two different agro-climatic zones of Odisha. Transcriptome assembly for both samples was done using next generation sequencing methodology. The raw data, 10.8 and 11.8 GB in size, were obtained from analysis of two rhizomes, S1Z4 and S2Z5, collected from Bhubaneswar and Koraput, and are available in NCBI under accession numbers SAMN03761169 and SAMN03761176, respectively. We identified 60,452 and 54,748 transcripts from ginger rhizomes of S1Z4 and S2Z5, respectively, using the Trinity tool. The transcript length ranged from 300 bp up to 15,213 bp and 8988 bp, with N50 values of 1415 bp and 1334 bp, for S1Z4 and S2Z5, respectively. To the best of our knowledge, this is the first comparative transcriptome analysis of the elite ginger cultivar Suprabha from two different agro-climatic conditions of Odisha, India, which will help to understand the effect of agro-climatic conditions on differential expression of secondary metabolites.
Sa, Renna; Zhong, Ruqing; Xing, Huan; Zhang, Hongfu
2016-01-01
Atmospheric ammonia is a common problem in the poultry industry. High concentrations of aerial ammonia cause great harm to broilers' health and production. For the consideration of human health, the exposure limit for ammonia in poultry houses is set at 25 ppm. Previous reports have shown that 25 ppm is still detrimental to livestock, especially to the gastrointestinal and respiratory tracts, but the effect of ammonia exposure on the breast muscle tissue of broilers is still unknown. In the present study, 25 ppm ammonia in poultry houses was found to lower slaughter performance and breast yield. Then, high-throughput RNA sequencing was utilized to identify differentially expressed genes in breast muscle of broiler chickens exposed to high (25 ppm) or low (3 ppm) levels of atmospheric ammonia. The transcriptome analysis showed that 163 genes (fold change ≥ 2 or ≤ 0.5; P-value < 0.05) were differentially expressed between Ammonia25 (treatment group) and Ammonia3 (control group), including 96 down-regulated and 67 up-regulated genes. qRT-PCR analysis validated the transcriptomic results of RNA sequencing. Gene Ontology (GO) functional annotation analysis revealed potential genes, processes and pathways with putative involvement in growth and development inhibition of breast muscle in broilers caused by aerial ammonia exposure. This study facilitates understanding of the genetic architecture of the chicken breast muscle transcriptome, and has identified candidate genes for breast muscle response to atmospheric ammonia exposure. PMID:27611572
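The differential expression criterion quoted in the ammonia abstract (fold change ≥ 2 or ≤ 0.5, P-value < 0.05) can be written as a small classification rule. The sketch below is only an illustration of that rule: the `GeneResult` type, the `classify` function, and the gene names and values are hypothetical, and the fold change is assumed to be the Ammonia25/Ammonia3 expression ratio.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str           # hypothetical gene identifier
    fold_change: float  # assumed ratio: treatment / control expression
    p_value: float


def classify(result: GeneResult, up: float = 2.0, down: float = 0.5,
             alpha: float = 0.05) -> str:
    """Label a gene 'up', 'down', or 'ns' (not significant)."""
    if result.p_value >= alpha:
        return "ns"
    if result.fold_change >= up:
        return "up"
    if result.fold_change <= down:
        return "down"
    return "ns"


# Toy results; none of these values come from the study.
toy_results = [
    GeneResult("gene_a", 2.4, 0.010),  # passes both thresholds, up
    GeneResult("gene_b", 0.4, 0.030),  # passes both thresholds, down
    GeneResult("gene_c", 3.0, 0.200),  # large change but P-value too high
    GeneResult("gene_d", 1.3, 0.001),  # significant but change too small
]
calls = {r.gene: classify(r) for r in toy_results}
print(calls)
```

Requiring both an effect-size cutoff and a significance cutoff excludes genes with large but noisy changes as well as genes with reproducible but small ones.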
Gene expression analysis of induced pluripotent stem cells from aneuploid chromosomal syndromes
2013-01-01
Background: Human aneuploidy is the leading cause of early pregnancy loss, mental retardation, and multiple congenital anomalies. Due to the high mortality associated with aneuploidy, the pathophysiological mechanisms of aneuploidy syndromes remain largely unknown. Previous studies focused mostly on whether dosage compensation occurs, and the next generation transcriptomics sequencing technology RNA-seq is expected to eventually uncover the mechanisms of gene expression regulation and the related pathological phenotypes in human aneuploidy. Results: Using next generation transcriptomics sequencing technology RNA-seq, we profiled the transcriptomes of four human aneuploid induced pluripotent stem cell (iPSC) lines generated from monosomy X (Turner syndrome), trisomy 8 (Warkany syndrome 2), trisomy 13 (Patau syndrome), and partial trisomy 11:22 (Emanuel syndrome), as well as two umbilical cord matrix iPSC lines as euploid controls, to examine how phenotypic abnormalities develop with an aberrant karyotype. A total of 466 M (50-bp) reads were obtained from the six iPSC lines, and over 13,000 mRNAs were identified by gene annotation. Global analysis of gene expression profiles and functional analysis of differentially expressed (DE) genes were implemented. Over 5000 DE genes were identified between the aneuploid and euploid iPSCs in each comparison, while 9 enriched KEGG pathways overlapped among the four aneuploid samples. Conclusions: Our results demonstrate that the extra or missing chromosome has extensive effects on the whole transcriptome. Functional analysis of differentially expressed genes reveals that the genes most affected in aneuploid individuals are related to central nervous system development and tumorigenesis. PMID:24564826
Histological and Transcriptomic Analysis during Bulbil Formation in Lilium lancifolium
Yang, Panpan; Xu, Leifeng; Xu, Hua; Tang, Yuchao; He, Guoren; Cao, Yuwei; Feng, Yayan; Yuan, Suxia; Ming, Jun
2017-01-01
Aerial bulbils are an important propagative organ, playing an important role in population expansion. However, the detailed gene regulatory patterns and molecular mechanism underlying bulbil formation remain unclear. Triploid Lilium lancifolium, which develops many aerial bulbils on the leaf axils of middle-upper stem, is a useful species for investigating bulbil formation. To investigate the mechanism of bulbil formation in triploid L. lancifolium, we performed histological and transcriptomic analyses using samples of leaf axils located in the upper and lower stem of triploid L. lancifolium during bulbil formation. Histological results indicated that the bulbils of triploid L. lancifolium are derived from axillary meristems that initiate de novo from cells on the adaxial side of the petiole base. Transcriptomic analysis generated ~650 million high-quality reads and 11,871 differentially expressed genes (DEGs). Functional analysis showed that the DEGs were significantly enriched in starch and sucrose metabolism and plant hormone signal transduction. Starch synthesis and accumulation likely promoted the initiation of upper bulbils in triploid L. lancifolium. Hormone-associated pathways exhibited distinct patterns of change in each sample. Auxin likely promoted the initiation of bulbils and then inhibited further bulbil formation. High biosynthesis and low degradation of cytokinin might have led to bulbil formation in the upper leaf axil. The present study achieved a global transcriptomic analysis focused on gene expression changes and pathways' enrichment during upper bulbil formation in triploid L. lancifolium, laying a solid foundation for future molecular studies on bulbil formation. PMID:28912794
Agrawal, A; Khan, MJ; Graugnard, DE; Vailati-Riboni, M; Rodriguez-Zas, SL; Osorio, JS; Loor, JJ
2017-01-01
In the dairy industry, cow health and farmer profits depend on the balance between diet (ie, nutrient composition, daily intake) and metabolism. This is especially true during the transition period, where dramatic physiological changes foster vulnerability to immunosuppression, negative energy balance, and clinical and subclinical disorders. Using an Agilent microarray platform, this study examined changes in the transcriptome of bovine polymorphonuclear leukocytes (PMNLs) due to prepartal dietary intake. Holstein cows were fed a high-straw, control-energy diet (CON; NEL = 1.34 Mcal/kg) or overfed a moderate-energy diet (OVE; NEL = 1.62 Mcal/kg) during the dry period. Blood for PMNL isolation and metabolite analysis was collected at −14 and +7 days relative to parturition. At an analysis of variance false discovery rate <0.05, energy intake (OVE vs CON) influenced 1806 genes. Dynamic Impact Approach bioinformatics analysis classified treatment effects on Kyoto Encyclopedia of Genes and Genomes pathways, including activated oxidative phosphorylation and biosynthesis of unsaturated fatty acids and inhibited RNA polymerase, proteasome, and toll-like receptor signaling pathway. This analysis indicates that processes critical for energy metabolism and cellular and immune function were affected with mixed results. However, overall interpretation of the transcriptome data agreed in part with literature documenting a potentially detrimental, chronic activation of PMNL in response to overfeeding. The widespread, transcriptome-level changes captured here confirm the importance of dietary energy adjustments around calving on the immune system. PMID:28579762
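The abstract above selects genes at a false discovery rate below 0.05 but does not name the correction procedure. As an illustration only, the sketch below implements the widely used Benjamini-Hochberg adjustment; treating it as the method used in the study is an assumption, and the function name and toy P-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy P-values for four hypothetical genes.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in toy_q]
print(toy_q, significant)
```

With thousands of genes tested at once, controlling the false discovery rate in this way is far less conservative than a Bonferroni correction while still bounding the expected share of false positives among the selected genes.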
Valenzuela-Muñoz, Valentina; Sturm, Armin; Gallardo-Escárate, Cristian
2015-04-09
The ATP-binding cassette (ABC) protein family comprises membrane proteins involved in the transport of various biomolecules across the cellular membrane. These proteins have been identified in all taxa and perform important physiological functions, including insecticide detoxification in arthropods. For that reason, the ectoparasite Caligus rogercresseyi represents a model species for understanding the molecular underpinnings of insecticide drug resistance. Illumina sequencing was performed using sea lice exposed to 2 and 3 ppb of deltamethrin and azamethiphos. Contigs obtained from de novo assembly were annotated by Blastx. RNA-Seq analysis was performed and validated by qPCR analysis. From the transcriptome database of C. rogercresseyi, 57 putative ABC protein sequences were identified and phylogenetically classified into the eight subfamilies described for ABC transporters in arthropods. Transcriptomic profiles for the ABC protein subfamilies were evaluated throughout C. rogercresseyi development. Moreover, RNA-Seq analysis was performed for adult male and female salmon lice exposed to the delousing drugs azamethiphos and deltamethrin. High transcript levels of the ABCB and ABCC subfamilies were observed. Furthermore, SNP mining was carried out for the ABC protein sequences, revealing pivotal genomic information. The present study gives a comprehensive transcriptome analysis of ABC proteins from C. rogercresseyi, providing relevant information about transporter roles during ontogeny and in relation to delousing drug responses in salmon lice. This genomic information represents a valuable tool for pest management in the Chilean salmon aquaculture industry.
Meena, Seema; Kumar, Sarma R.; Venkata Rao, D. K.; Dwivedi, Varun; Shilpashree, H. B.; Rastogi, Shubhra; Shasany, Ajit K.; Nagegowda, Dinesh A.
2016-01-01
Aromatic grasses of the genus Cymbopogon (Poaceae family) represent a unique group of plants that produce monoterpene-rich essential oils of diverse composition, which have great value in the flavor, fragrance, cosmetic, and aromatherapy industries. Despite the commercial importance of these natural aromatic oils, their biosynthesis at the molecular level remains unexplored. As a first step toward understanding essential oil biosynthesis, we performed de novo transcriptome assembly and analysis of C. flexuosus (lemongrass) by employing Illumina sequencing. Mining of transcriptome data and subsequent phylogenetic analysis led to the identification of terpene synthases, pyrophosphatases, alcohol dehydrogenases, aldo-keto reductases, carotenoid cleavage dioxygenases, alcohol acetyltransferases, and aldehyde dehydrogenases, which are potentially involved in essential oil biosynthesis. Comparative essential oil profiling and mRNA expression analysis in three Cymbopogon species (C. flexuosus, aldehyde type; C. martinii, alcohol type; and C. winterianus, intermediate type) with varying essential oil composition indicated the involvement of the identified candidate genes in the formation of alcohols, aldehydes, and acetates. Molecular modeling and docking further supported the role of the identified protein sequences in aroma formation in Cymbopogon. In addition, simple sequence repeats were found in the transcriptome, many of them linked to terpene pathway genes, including genes potentially involved in aroma biosynthesis. This work provides the first insights into the essential oil biosynthesis of aromatic grasses, and the identified candidate genes and markers can be a great resource for biotechnological and molecular breeding approaches to modulate essential oil composition. PMID:27516768
Codina-Solà, Marta; Rodríguez-Santiago, Benjamín; Homs, Aïda; Santoyo, Javier; Rigau, Maria; Aznar-Laín, Gemma; Del Campo, Miguel; Gener, Blanca; Gabau, Elisabeth; Botella, María Pilar; Gutiérrez-Arumí, Armand; Antiñolo, Guillermo; Pérez-Jurado, Luis Alberto; Cuscó, Ivon
2015-01-01
Autism spectrum disorders (ASD) are a group of neurodevelopmental disorders with high heritability. Recent findings support a highly heterogeneous and complex genetic etiology including rare de novo and inherited mutations or chromosomal rearrangements as well as double or multiple hits. We performed whole-exome sequencing (WES) and blood cell transcriptome by RNAseq in a subset of male patients with idiopathic ASD (n = 36) in order to identify causative genes, transcriptomic alterations, and susceptibility variants. We detected likely monogenic causes in seven cases: five de novo (SCN2A, MED13L, KCNV1, CUL3, and PTEN) and two inherited X-linked variants (MAOA and CDKL5). Transcriptomic analyses allowed the identification of intronic causative mutations missed by the usual filtering of WES and revealed functional consequences of some rare mutations. These included aberrant transcripts (PTEN, POLR3C), deregulated expression in 1.7% of mutated genes (that is, SEMA6B, MECP2, ANK3, CREBBP), allele-specific expression (FUS, MTOR, TAF1C), and non-sense-mediated decay (RIT1, ALG9). The analysis of rare inherited variants showed enrichment in relevant pathways such as the PI3K-Akt signaling and the axon guidance. Integrative analysis of WES and blood RNAseq data has proven to be an efficient strategy to identify likely monogenic forms of ASD (19% in our cohort), as well as additional rare inherited mutations that can contribute to ASD risk in a multifactorial manner. Blood transcriptomic data, besides validating 88% of expressed variants, allowed the identification of missed intronic mutations and revealed functional correlations of genetic variants, including changes in splicing, expression levels, and allelic expression.
He, Lin; Jiang, Hui; Cao, Dandan; Liu, Lihua; Hu, Songnian; Wang, Qun
2013-01-01
The accessory sex gland (ASG) is an important component of the male reproductive system, which functions to enhance the fertility of spermatozoa during male reproduction. Certain proteins secreted by the ASG are known to bind to the spermatozoa membrane and affect its function. The ASG gene expression profile in Chinese mitten crab (Eriocheir sinensis) has not been extensively studied, and limited genetic research has been conducted on this species. The advent of high-throughput sequencing technologies enables the generation of genomic resources within a short period of time and at minimal cost. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for the ASG of E. sinensis using Illumina sequencing technology. This analysis yielded a total of 33,221,284 sequencing reads, including 2.6 Gb of total nucleotides. Reads were assembled into 85,913 contigs (average 218 bp), or 58,567 scaffold sequences (average 292 bp), that identified 37,955 unigenes (average 385 bp). We assembled all unigenes and compared them with the published testis transcriptome from E. sinensis. In order to identify which genes may be involved in ASG function, as it pertains to modification of spermatozoa, we compared the ASG and testis transcriptome of E. sinensis. Our analysis identified specific genes with both higher and lower tissue expression levels in the two tissues, and the functions of these genes were analyzed to elucidate their potential roles during maturation of spermatozoa. Availability of detailed transcriptome data from ASG and testis in E. sinensis can assist our understanding of the molecular mechanisms involved with spermatozoa conservation, transport, maturation and capacitation and potentially acrosome activation. PMID:23342039
Hussain, Tajammul; Plunkett, Blue; Ejaz, Mahwish; Espley, Richard V.; Kayser, Oliver
2018-01-01
The liverwort Radula marginata belongs to the bryophyte division of land plants and is a prospective alternate source of cannabinoid-like compounds. However, mechanistic insights into the molecular pathways directing the synthesis of these cannabinoid-like compounds have been hindered due to the lack of genetic information. This prompted us to do deep sequencing, de novo assembly and annotation of R. marginata transcriptome, which resulted in the identification and validation of the genes for cannabinoid biosynthetic pathway. In total, we have identified 11,421 putative genes encoding 1,554 enzymes from 145 biosynthetic pathways. Interestingly, we have identified all the upstream genes of the central precursor of cannabinoid biosynthesis, cannabigerolic acid (CBGA), including its two first intermediates, stilbene acid (SA) and geranyl diphosphate (GPP). Expression of all these genes was validated using quantitative real-time PCR. We have characterized the protein structure of stilbene synthase (STS), which is considered as a homolog of olivetolic acid in R. marginata. Moreover, the metabolomics approach enabled us to identify CBGA-analogous compounds using electrospray ionization mass spectrometry (ESI-MS/MS) and gas chromatography mass spectrometry (GC-MS). Transcriptomic analysis revealed 1085 transcription factors (TF) from 39 families. Comparative analysis showed that six TF families have been uniquely predicted in R. marginata. In addition, the bioinformatics analysis predicted a large number of simple sequence repeats (SSRs) and non-coding RNAs (ncRNAs). Our results collectively provide mechanistic insights into the putative precursor genes for the biosynthesis of cannabinoid-like compounds and a novel transcriptomic resource for R. marginata. The large-scale transcriptomic resource generated in this study would further serve as a reference transcriptome to explore the Radulaceae family.
Transcriptomic responses to wounding: meta-analysis of gene expression microarray data.
Sass, Piotr Andrzej; Dąbrowski, Michał; Charzyńska, Agata; Sachadyn, Paweł
2017-11-07
A vast amount of microarray data on the transcriptomic response to injury has been collected so far. We designed the analysis to identify the genes displaying significant changes in expression after wounding in different organisms and tissues. This meta-analysis is the first study to compare gene expression profiles in response to wounding across tissues as different as heart, liver, skin, bones, and spinal cord, and across species including rat, mouse and human. We collected available microarray transcriptomic profiles obtained from different tissue injury experiments and selected the genes showing a minimum twofold change in expression in response to wounding in the prevailing number of experiments for each of the five wound healing stages we distinguished: haemostasis & early inflammation, inflammation, early repair, late repair and remodelling. During the initial phases after wounding, haemostasis & early inflammation and inflammation, the transcriptomic responses showed little consistency between different tissues and experiments. For the later phases, wound repair and remodelling, we identified a number of genes displaying similar transcriptional responses in all examined tissues. As revealed by ontological analyses, activation of certain pathways was rather specific to selected phases of wound healing, such as responses to vitamin D, which were pronounced during inflammation. Conversely, we observed induction of genes encoding inflammatory agents and extracellular matrix proteins in all wound healing phases. Further, we selected several genes differentially upregulated throughout different stages of wound response, including established factors of wound healing in addition to those previously unreported in this context, such as PTPRC and AQP4. We found that transcriptomic responses to wounding showed similar traits in a diverse selection of tissues including skin, muscles, internal organs and nervous system. Notably, we distinguished transcriptional induction of inflammatory genes not only in the initial response to wounding, but also later, during wound repair and tissue remodelling.
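The selection rule described in this abstract (at least a twofold change in the prevailing number of experiments) can be sketched as a simple vote count. This is a minimal illustration, not the authors' pipeline: reading "prevailing number" as a strict majority is an assumption, and the function name `consistent_genes` and its input layout are hypothetical.

```python
import math


def consistent_genes(log2fc_by_gene, min_fold=2.0):
    """Label genes changed at least min_fold in a strict majority of experiments.

    log2fc_by_gene maps a gene name to a list of log2 fold changes,
    one value per experiment (hypothetical input layout).
    """
    threshold = math.log2(min_fold)  # twofold corresponds to 1.0 on the log2 scale
    labels = {}
    for gene, values in log2fc_by_gene.items():
        n = len(values)
        up = sum(1 for v in values if v >= threshold)
        down = sum(1 for v in values if v <= -threshold)
        # Assumption: "prevailing" is read as more than half of the experiments.
        if up > n / 2:
            labels[gene] = "up"
        elif down > n / 2:
            labels[gene] = "down"
    return labels
```

In practice the rule would be applied separately to the experiments assigned to each of the five wound healing stages, so that a gene can be consistent in one stage and not in another.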
Acclimation of Antarctic Chlamydomonas to the sea-ice environment: a transcriptomic analysis.
Liu, Chenlin; Wang, Xiuliang; Wang, Xingna; Sun, Chengjun
2016-07-01
The Antarctic green alga Chlamydomonas sp. ICE-L was isolated from sea ice. As a psychrophilic microalga, it can tolerate the environmental stresses of the sea-ice brine, such as freezing temperatures and high salinity. We performed a transcriptome analysis to identify genes that respond to freezing stress and to explore strategies for acclimation to the extreme environment. Here, we show that many genes in the ICE-L transcriptome that encode PUFA synthesis enzymes, molecular chaperone proteins, and cell membrane transport proteins have high similarity to genes from Antarctic bacteria. These ICE-L genes are presumed to have been acquired through horizontal gene transfer from its symbiotic microbes in the sea-ice brine. The presence of these genes in both sea-ice microalgae and bacteria indicates that the biological processes they are involved in possibly contribute to the success of ICE-L in sea ice. In addition, the biological pathways were compared between ICE-L and its closely related sister species, Chlamydomonas reinhardtii and Volvox carteri. In the ICE-L transcriptome, many sequences homologous to plant or bacterial proteins in the post-transcriptional, post-translational modification, and signal-transduction KEGG pathways are absent from the nonpsychrophilic green algae. These complex structural components might imply enhanced stress adaptation capacity. Finally, differential gene expression analysis at the transcriptome level of ICE-L indicated that genes associated with post-translational modification, lipid metabolism, and nitrogen metabolism respond to the freezing treatment. In conclusion, the transcriptome of Chlamydomonas sp. ICE-L is very useful for exploring the mutualistic interaction between microalgae and bacteria in sea ice, and for discovering the specific genes and metabolic pathways that respond to freezing acclimation in psychrophilic microalgae.
Madio, Bruno; Undheim, Eivind A B; King, Glenn F
2017-08-23
More than a century of research on sea anemone venoms has shown that they contain a diversity of biologically active proteins and peptides. However, recent omics studies have revealed that much of the venom proteome remains unexplored. We used, for the first time, a combination of proteomic and transcriptomic techniques to obtain a holistic overview of the venom arsenal of the well-studied sea anemone Stichodactyla haddoni. A purely search-based approach to identify putative toxins in a transcriptome from tentacles regenerating after venom extraction identified 508 unique toxin-like transcripts grouped into 63 families. However, proteomic analysis of venom revealed that 52 of these toxin families are likely false positives. In contrast, the combination of transcriptomic and proteomic data enabled positive identification of 23 families of putative toxins, 12 of which have no homology to known proteins or peptides. Our data highlight the importance of using proteomics of milked venom to correctly identify venom proteins/peptides, both known and novel, while minimizing false positive identifications from non-toxin homologues identified in transcriptomes of venom-producing tissues. This work lays the foundation for uncovering the role of individual toxins in sea anemone venom and how they contribute to the envenomation of prey, predators, and competitors. Proteomic analysis of milked venom combined with analysis of a tentacle transcriptome revealed the full extent of the venom arsenal of the sea anemone Stichodactyla haddoni. This combined approach led to the discovery of 12 entirely new families of disulfide-rich peptides and proteins in a genus of anemones that has been studied for over a century.
Analysis of the Salivary Gland Transcriptome of Frankliniella occidentalis
Stafford-Banks, Candice A.; Rotenberg, Dorith; Johnson, Brian R.; Whitfield, Anna E.; Ullman, Diane E.
2014-01-01
Saliva is known to play a crucial role in insect feeding behavior and virus transmission. Currently, little is known about the salivary glands and saliva of thrips, despite the fact that Frankliniella occidentalis (Pergande) (the western flower thrips) is a serious pest due to its destructive feeding, wide host range, and transmission of tospoviruses. As a first step towards characterizing thrips salivary gland functions, we sequenced the transcriptome of the primary salivary glands of F. occidentalis using short read sequencing (Illumina) technology. A de novo-assembled transcriptome revealed 31,392 high quality contigs with an average size of 605 bp. A total of 12,166 contigs had significant BLASTx or tBLASTx hits (E≤1.0E−6) to known proteins, whereas a high percentage (61.24%) of contigs had no apparent protein or nucleotide hits. Comparison of the F. occidentalis salivary gland transcriptome (sialotranscriptome) against a published F. occidentalis full body transcriptome assembled from Roche-454 reads revealed several contigs with putative annotations associated with salivary gland functions. KEGG pathway analysis of the sialotranscriptome revealed that the majority (18 out of the top 20 predicted KEGG pathways) of the salivary gland contig sequences match proteins involved in metabolism. We identified several genes likely to be involved in detoxification and inhibition of plant defense responses including aldehyde dehydrogenase, metalloprotease, glucose oxidase, glucose dehydrogenase, and regucalcin. We also identified several genes that may play a role in the extra-oral digestion of plant structural tissues including β-glucosidase and pectin lyase; and the extra-oral digestion of sugars, including α-amylase, maltase, sucrase, and α-glucosidase. 
This is the first analysis of a sialotranscriptome for any Thysanopteran species and it provides a foundational tool to further our understanding of how thrips interact with their plant hosts and the viruses they transmit. PMID:24736614
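The annotation step in this abstract (contigs with a hit at E ≤ 1.0E−6) can be sketched as a filter over BLAST results. The sketch assumes the hits were exported in the standard 12-column tabular layout (BLAST+ `-outfmt 6`, where the E-value is the 11th column), which the abstract does not state; the function names are hypothetical.

```python
def contigs_with_hits(blast_tab_lines, max_evalue=1e-6):
    """Collect query contig IDs that have at least one hit at or below max_evalue.

    Assumes tab-separated BLAST tabular output: column 1 is the query ID
    and column 11 is the E-value.
    """
    hits = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query_id = fields[0]
        evalue = float(fields[10])
        if evalue <= max_evalue:
            hits.add(query_id)
    return hits


def percent_without_hits(total_contigs, n_with_hits):
    """Percentage of contigs that have no hit passing the cutoff."""
    return 100.0 * (total_contigs - n_with_hits) / total_contigs
```

With the figures reported above, `percent_without_hits(31392, 12166)` evaluates to about 61.24, which matches the quoted share of contigs without apparent hits.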
Transcriptomic immune response of Tenebrio molitor pupae to parasitization by Scleroderma guani.
Zhu, Jia-Ying; Yang, Pu; Zhang, Zhong; Wu, Guo-Xing; Yang, Bin
2013-01-01
The host-parasitoid interaction is one of the most fascinating relationships among insects and is currently receiving increasing interest. Understanding the mechanisms evolved by parasitoids to evade or suppress the host immune system is important for dissecting this interaction, but these mechanisms remain poorly known. To gain insight into the immune response of Tenebrio molitor to parasitization by Scleroderma guani, the transcriptome of T. molitor pupae was sequenced with a focus on immune-related genes, and non-parasitized and parasitized T. molitor pupae were analyzed by digital gene expression (DGE) analysis, with special emphasis on parasitoid-induced immune-related genes, using Illumina sequencing. In a single run, 264,698 raw reads were obtained. De novo assembly generated 71,514 unigenes with a mean length of 424 bp. Of those unigenes, 37,373 (52.26%) showed similarity to known proteins in the NCBI nr database. Through in-depth analysis of the transcriptome data, 430 unigenes related to immunity were identified. DGE analysis revealed that parasitization by S. guani had considerable impacts on the transcriptome profile of T. molitor pupae, as indicated by the significant up- or down-regulation of 3,431 parasitism-responsive transcripts. The expression of a total of 74 unigenes involved in the immune response of T. molitor was significantly altered after parasitization. The obtained T. molitor transcriptome, in addition to establishing a fundamental resource for further research on functional genomics, has allowed the discovery of a large group of immune genes that might provide a meaningful framework to better understand the immune response in this species and other beetles. The DGE profiling data provide comprehensive T. molitor immune gene expression information at the transcriptional level following parasitization, and shed valuable light on the molecular understanding of the host-parasitoid interaction.
Torre, Sara; Tattini, Massimiliano; Brunetti, Cecilia; Guidi, Lucia; Gori, Antonella; Marzano, Cristina; Landi, Marco; Sebastiani, Federico
2016-01-01
Sweet basil (Ocimum basilicum), one of the most popular cultivated herbs worldwide, displays a number of varieties differing in several characteristics, such as the color of the leaves. The development of a reference transcriptome for sweet basil, and the analysis of differentially expressed genes in acyanic and cyanic cultivars exposed to natural sunlight irradiance, are of interest from both horticultural and biological points of view. There is still great uncertainty about the significance of anthocyanins in photoprotection, and about how green and red morphs may perform when exposed to photo-inhibitory light, a condition plants face on a daily and seasonal basis. We sequenced the leaf transcriptome of the green-leaved Tigullio (TIG) and the purple-leaved Red Rubin (RR) exposed to full sunlight over a four-week experimental period. We assembled and annotated 111,007 transcripts. A total of 5,468 and 5,969 potential SSRs were identified in TIG and RR, respectively, out of which 66 were polymorphic in silico. Comparative analysis of the two transcriptomes showed 2,372 differentially expressed genes (DEGs) clustered in 222 enriched Gene Ontology terms. Green and red basil differed mostly in the transcript abundance of genes involved in secondary metabolism. While the biosynthesis of waxes was up-regulated in red basil, the biosynthesis of flavonols and carotenoids was up-regulated in green basil. Data from our study provide a comprehensive transcriptome survey, gene sequence resources and microsatellites that can be used for further investigations in sweet basil. The analysis of DEGs and their functional classification also offers new insights into the functional role of anthocyanins in photoprotection. PMID:27483170
Zhang, Le-Le; Zhang, Zi-Ning; Wu, Xian; Jiang, Yong-Jun; Fu, Ya-Jing; Shang, Hong
2017-09-12
A small proportion of HIV-infected patients remain clinically and/or immunologically stable for years, including elite controllers (ECs) who have undetectable viremia (<50 copies/ml) and long-term nonprogressors (LTNPs) who maintain normal CD4+ T cell counts for prolonged periods (>10 years). However, the mechanism of nonprogression needs to be further resolved. In this study, a transcriptome meta-analysis was performed on nonprogressor and progressor microarray data to identify differential transcriptome pathways and potential biomarkers. Using the INMEX (integrative meta-analysis of expression data) program, we performed the meta-analysis to identify consistently differentially expressed genes (DEGs) in nonprogressors and further performed functional interpretation (gene ontology analysis and pathway analysis) of the DEGs identified in the meta-analysis. Five microarray datasets (81 cases and 98 controls in total), including whole blood, CD4+ and CD8+ T cells, were collected for meta-analysis. We determined that nonprogressors have reduced expression of important interferon-stimulated genes (ISGs), CD38, and lymphocyte activation gene 3 (LAG-3) in whole blood, CD4+ and CD8+ T cells. Gene ontology (GO) analysis showed a significant enrichment of DEGs that function in the type I interferon signaling pathway. Upregulated pathways, including the PI3K-Akt signaling pathway in whole blood, cytokine-cytokine receptor interaction in CD4+ T cells and the MAPK signaling pathway in CD8+ T cells, were identified in nonprogressors compared with progressors. In each metabolic functional category, downregulated DEGs outnumbered upregulated DEGs, and almost all DEGs in the oxidative phosphorylation (OXPHOS) and tricarboxylic acid (TCA) cycle categories were downregulated in the three types of samples. Our transcriptomic meta-analysis provides a comprehensive evaluation of the gene expression profiles in major blood types of nonprogressors, providing new insights into HIV pathogenesis and into the development of strategies to delay HIV disease progression.
Sarkar, Soumyadev; Chakravorty, Somnath; Mukherjee, Avishek; Bhattacharya, Debanjana; Bhattacharya, Semantee; Gachhui, Ratan
2018-03-01
Nitrogen is a key nutrient for all cell forms. Most organisms respond to nitrogen scarcity by slowing down their growth rate. In contrast, our previous studies have shown that Papiliotrema laurentii strain RY1 grows robustly under nitrogen starvation. To understand the global regulation that leads to such an extraordinary response, we undertook a de novo approach for transcriptome analysis of the yeast. Close to 33 million high-quality sequence reads for nitrogen-limited and nitrogen-enriched conditions were generated using Illumina NextSeq500. Trinity analysis and clustered transcript annotation of the reads produced 17,611 unigenes, of which 14,157 could be annotated. Gene Ontology term analysis generated 44.92% cellular component terms, 39.81% molecular function terms and 15.24% biological process terms. The most overrepresented pathways in general were translation, carbohydrate metabolism, amino acid metabolism, general metabolism, and folding, sorting and degradation, followed by transport and catabolism, nucleotide metabolism, replication and repair, transcription and lipid metabolism. A total of 4256 simple sequence repeats were identified. Differential gene expression analysis detected 996 P-significant transcripts and revealed transmembrane transport, lipid homeostasis, fatty acid catabolism and translation as the enriched terms, which could be essential for Papiliotrema laurentii strain RY1 to adapt during nitrogen deprivation. Transcriptome data were validated by quantitative real-time PCR analysis of twelve transcripts. To the best of our knowledge, this is the first report of the Papiliotrema laurentii strain RY1 transcriptome, which should play a pivotal role in understanding the biochemistry of the yeast under acute nitrogen stress, and this study should encourage extensive investigations into this Papiliotrema system.
Irla, Marta; Neshat, Armin; Brautaset, Trygve; Rückert, Christian; Kalinowski, Jörn; Wendisch, Volker F
2015-02-14
Bacillus methanolicus MGA3 is a thermophilic, facultative ribulose monophosphate (RuMP) cycle methylotroph. Together with its ability to produce high yields of amino acids, the relevance of this microorganism as a promising candidate for biotechnological applications is evident. The B. methanolicus MGA3 genome consists of a 3,337,035 nucleotides (nt) circular chromosome, the 19,174 nt plasmid pBM19 and the 68,999 nt plasmid pBM69. 3,218 protein-coding regions were annotated on the chromosome, 22 on pBM19 and 82 on pBM69. In the present study, the RNA-seq approach was used to comprehensively investigate the transcriptome of B. methanolicus MGA3 in order to improve the genome annotation, identify novel transcripts, analyze conserved sequence motifs involved in gene expression and reveal operon structures. For this aim, two different cDNA library preparation methods were applied: one which allows characterization of the whole transcriptome and another which includes enrichment of primary transcript 5'-ends. Analysis of the primary transcriptome data enabled the detection of 2,167 putative transcription start sites (TSSs) which were categorized into 1,642 TSSs located in the upstream region (5'-UTR) of known protein-coding genes and 525 TSSs of novel antisense, intragenic, or intergenic transcripts. Firstly, 14 wrongly annotated translation start sites (TLSs) were corrected based on primary transcriptome data. Further investigation of the identified 5'-UTRs resulted in the detailed characterization of their length distribution and the detection of 75 hitherto unknown cis-regulatory RNA elements. Moreover, the exact TSSs positions were utilized to define conserved sequence motifs for translation start sites, ribosome binding sites and promoters in B. methanolicus MGA3. Based on the whole transcriptome data set, novel transcripts, operon structures and mRNA abundances were determined. 
The analysis of the operon structures revealed that almost half of the genes are transcribed monocistronically (940), whereas 1,164 genes are organized in 381 operons. Several of the genes related to methylotrophy had highly abundant transcripts. The extensive insights into the transcriptional landscape of B. methanolicus MGA3, gained in this study, represent a valuable foundation for further comparative quantitative transcriptome analyses and possibly also for the development of molecular biology tools which at present are very limited for this organism.
Identification of innate lymphoid cells in single-cell RNA-Seq data.
Suffiotti, Madeleine; Carmona, Santiago J; Jandus, Camilla; Gfeller, David
2017-07-01
Innate lymphoid cells (ILCs) consist of natural killer (NK) cells and non-cytotoxic ILCs that are broadly classified into ILC1, ILC2, and ILC3 subtypes. These cells recently emerged as important early effectors of innate immunity for their roles in tissue homeostasis and inflammation. Over the last few years, ILCs have been extensively studied in mouse and human at the functional and molecular level, including gene expression profiling. However, sorting ILCs with flow cytometry for gene expression analysis is a delicate and time-consuming process. Here we propose and validate a novel framework for studying ILCs at the transcriptomic level using single-cell RNA-Seq data. Our approach combines unsupervised clustering and a new cell type classifier trained on mouse ILC gene expression data. We show that this approach can accurately identify different ILCs, especially ILC2 cells, in human lymphocyte single-cell RNA-Seq data. Our new model relies only on genes conserved across vertebrates, thereby making it in principle applicable in any vertebrate species. Considering the rapid increase in throughput of single-cell RNA-Seq technology, our work provides a computational framework for studying ILC2 cells in single-cell transcriptomic data and may help exploring their conservation in distant vertebrate species.
Zywicki, Marek; Bakowska-Zywicka, Kamilla; Polacek, Norbert
2012-05-01
The exploration of the non-protein-coding RNA (ncRNA) transcriptome is currently focused on profiling of microRNA expression and detection of novel ncRNA transcription units. However, recent studies suggest that RNA processing can be a multi-layer process leading to the generation of ncRNAs of diverse functions from a single primary transcript. To date, no methodology has been presented to distinguish stable functional RNA species from rapidly degraded side products of nucleases. Thus the correct assessment of widespread RNA processing events is one of the major obstacles in transcriptome research. Here, we present a novel automated computational pipeline, named APART, providing a complete workflow for the reliable detection of RNA processing products from next-generation-sequencing data. The major features include efficient handling of non-unique reads, detection of novel stable ncRNA transcripts and processing products, and annotation of known transcripts based on multiple sources of information. To disclose the potential of APART, we have analyzed a cDNA library derived from small ribosome-associated RNAs in Saccharomyces cerevisiae. By employing the APART pipeline, we were able to detect and confirm by independent experimental methods multiple novel stable RNA molecules differentially processed from well-known ncRNAs, like rRNAs, tRNAs or snoRNAs, in a stress-dependent manner.
Polak, Marta E; Ung, Chuin Ying; Masapust, Joanna; Freeman, Tom C; Ardern-Jones, Michael R
2017-04-06
Langerhans cells (LCs) are able to orchestrate adaptive immune responses in the skin by interpreting the microenvironmental context in which they encounter foreign substances, but the regulatory basis for this has not been established. Utilising systems immunology approaches combining in silico modelling of a reconstructed gene regulatory network (GRN) with in vitro validation of the predictions, we sought to determine the mechanisms of regulation of immune responses in human primary LCs. The key role of Interferon regulatory factors (IRFs) as controllers of the human Langerhans cell response to epidermal cytokines was revealed by whole transcriptome analysis. Applying Boolean logic we assembled a Petri net-based model of the IRF-GRN which provides molecular pathway predictions for the induction of different transcriptional programmes in LCs. In silico simulations performed after model parameterisation with transcription factor expression values predicted that human LC activation of antigen-specific CD8 T cells would be differentially regulated by epidermal cytokine induction of specific IRF-controlled pathways. This was confirmed by in vitro measurement of IFN-γ production by activated T cells. As a proof of concept, this approach shows that stochastic modelling of specific immune networks renders transcriptome data valuable for the prediction of functional outcomes of immune responses.
A generic Transcriptomics Reporting Framework (TRF) for 'omics data processing and analysis.
Gant, Timothy W; Sauer, Ursula G; Zhang, Shu-Dong; Chorley, Brian N; Hackermüller, Jörg; Perdichizzi, Stefania; Tollefsen, Knut E; van Ravenzwaay, Ben; Yauk, Carole; Tong, Weida; Poole, Alan
2017-12-01
A generic Transcriptomics Reporting Framework (TRF) is presented that lists parameters that should be reported in 'omics studies used in a regulatory context. The TRF encompasses the processes of transcriptome profiling, from data generation to a processed list of differentially expressed genes (DEGs) ready for interpretation. Included within the TRF is a reference baseline analysis (RBA) that encompasses raw data selection, data normalisation, recognition of outliers, and statistical analysis. The TRF itself does not dictate the methodology for data processing, but deals with what should be reported. Its principles are also applicable to sequencing data and other 'omics. In contrast, the RBA specifies a simple data processing and analysis methodology that is designed to provide a comparison point for other approaches and is exemplified here by a case study. By providing transparency on the steps applied during 'omics data processing and analysis, the TRF will increase confidence in the processing of 'omics data and in its regulatory use. Applicability of the TRF is ensured by its simplicity and generality. The TRF can be applied to all types of regulatory 'omics studies, and it can be executed using different commonly available software tools. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
Pick, Thea R.; Bräutigam, Andrea; Schlüter, Urte; Denton, Alisandra K.; Colmsee, Christian; Scholz, Uwe; Fahnenstich, Holger; Pieruschka, Roland; Rascher, Uwe; Sonnewald, Uwe; Weber, Andreas P.M.
2011-01-01
We systematically analyzed a developmental gradient of the third maize (Zea mays) leaf from the point of emergence into the light to the tip in 10 continuous leaf slices to study organ development and physiological and biochemical functions. Transcriptome analysis, oxygen sensitivity of photosynthesis, and photosynthetic rate measurements showed that the maize leaf undergoes a sink-to-source transition without an intermediate phase of C3 photosynthesis or operation of a photorespiratory carbon pump. Metabolome and transcriptome analysis, chlorophyll and protein measurements, as well as dry weight determination, showed continuous gradients for all analyzed items. The absence of binary on–off switches and regulons pointed to a morphogradient along the leaf as the determining factor of developmental stage. Analysis of transcription factors for differential expression along the leaf gradient defined a list of putative regulators orchestrating the sink-to-source transition and establishment of C4 photosynthesis. Finally, transcriptome and metabolome analysis, as well as enzyme activity measurements, and absolute quantification of selected metabolites revised the current model of maize C4 photosynthesis. All data sets are included within the publication to serve as a resource for maize leaf systems biology. PMID:22186372
Valencia, Arnubio; Wang, Haichuan; Soto, Alberto; Aristizabal, Manuel; Arboleda, Jorge W.; Eyun, Seong-il; Noriega, Daniel D.; Siegfried, Blair
2016-01-01
The banana weevil Cosmopolites sordidus is an important and serious insect pest in most banana and plantain-growing areas of the world. In spite of the economic importance of this insect pest very little genomic and transcriptomic information exists for this species. In the present study, we characterized the midgut transcriptome of C. sordidus using massive 454-pyrosequencing. We generated over 590,000 sequencing reads that assembled into 30,840 contigs with more than 400 bp, representing a significant expansion of existing sequences available for this insect pest. Among them, 16,427 contigs contained one or more GO terms. In addition, 15,263 contigs were assigned an EC number. In-depth transcriptome analysis identified genes potentially involved in insecticide resistance, peritrophic membrane biosynthesis, immunity-related function and defense against pathogens, and Bacillus thuringiensis toxins binding proteins as well as multiple enzymes involved with protein digestion. This transcriptome will provide a valuable resource for understanding larval physiology and for identifying novel target sites and management approaches for this important insect pest. PMID:26949943
USDA-ARS's Scientific Manuscript database
Rose is one of the most important cut flowers among ornamental plants. Rose flower longevity is largely dependent on the timing of petal shedding occurrence. To understand the molecular mechanism underlying petal abscission in rose, we performed transcriptome profiling of the petal abscission zone d...
USDA-ARS's Scientific Manuscript database
This study reports the generation of large-scale genomic resources for pigeonpea, a so-called ‘orphan crop species’ of the semi-arid tropic regions. Roche FLX/454 sequencing of a normalized cDNA pool prepared from 31 tissues produced 494,353 short transcript reads (STRs). Cluster analysi...
Santos, Patricia; Plaszczyca, Marian; Pawlowski, Katharina
2013-01-01
Actinorhizal root nodule symbioses are very diverse, and the symbiosis of Datisca glomerata has previously been shown to have many unusual aspects. In order to gain molecular information on the infection mechanism, nodule development and nodule metabolism, we compared the transcriptomes of D. glomerata roots and nodules. Root and nodule libraries representing the 3′-ends of cDNAs were subjected to high-throughput parallel 454 sequencing. To identify the corresponding genes and to improve the assembly, Illumina sequencing of the nodule transcriptome was performed as well. The evaluation revealed 406 differentially regulated genes, 295 of which (72.7%) could be assigned a function based on homology. Analysis of the nodule transcriptome showed that genes encoding components of the common symbiosis signaling pathway were present in nodules of D. glomerata, which in combination with the previously established function of SymRK in D. glomerata nodulation suggests that this pathway is also active in actinorhizal Cucurbitales. Furthermore, comparison of the D. glomerata nodule transcriptome with nodule transcriptomes from actinorhizal Fagales revealed a new subgroup of nodule-specific defensins that might play a role specific to actinorhizal symbioses. The D. glomerata members of this defensin subgroup contain an acidic C-terminal domain that was never found in plant defensins before. PMID:24009681
RNA-Seq Technology and Its Application in Fish Transcriptomics
Ba, Yi; Zhuang, Qianfeng
2014-01-01
High-throughput sequencing technologies, also known as next-generation sequencing (NGS) technologies, have revolutionized the way that genomic research is advancing. In addition to the static genome, these state-of-the-art technologies have been recently exploited to analyze the dynamic transcriptome, and the resulting technology is termed RNA sequencing (RNA-seq). RNA-seq is free from many limitations of other transcriptomic approaches, such as microarray and tag-based sequencing methods. Although RNA-seq has only been available for a short time, studies using this method have completely changed our perspective of the breadth and depth of eukaryotic transcriptomes. In terms of the transcriptomics of teleost fishes, both model and non-model species have benefited from the RNA-seq approach and have undergone tremendous advances in the past several years. RNA-seq has helped not only in mapping and annotating fish transcriptomes but also in our understanding of many biological processes in fish, such as development, adaptive evolution, host immune response, and stress response. In this review, we first provide an overview of each step of RNA-seq from library construction to the bioinformatic analysis of the data. We then summarize and discuss the recent biological insights obtained from the RNA-seq studies in a variety of fish species. PMID:24380445
Grace, Peter M.; Hurley, Daniel; Barratt, Daniel T.; Tsykin, Anna; Watkins, Linda R.; Rolan, Paul E.; Hutchinson, Mark R.
2017-01-01
A quantitative, peripherally accessible biomarker for neuropathic pain has great potential to improve clinical outcomes. Based on the premise that peripheral and central immunity contribute to neuropathic pain mechanisms, we hypothesized that biomarkers could be identified from the whole blood of adult male rats, by integrating graded chronic constriction injury (CCI), ipsilateral lumbar dorsal quadrant (iLDQ) and whole blood transcriptomes, and pathway analysis with pain behavior. Correlational bioinformatics identified a range of putative biomarker genes for allodynia intensity, many encoding for proteins with a recognized role in immune/nociceptive mechanisms. A selection of these genes was validated in a separate replication study. Pathway analysis of the iLDQ transcriptome identified Fcγ and Fcε signaling pathways, among others. This study is the first to employ the whole blood transcriptome to identify pain biomarker panels. The novel correlational bioinformatics, developed here, selected such putative biomarkers based on a correlation with pain behavior and formation of signaling pathways with iLDQ genes. Future studies may demonstrate the predictive ability of these biomarker genes across other models and additional variables. PMID:22697386
Liu, Miaomiao; Zhu, Jinhang; Wu, Shengbing; Wang, Chenkai; Guo, Xingyi; Wu, Jiawen; Zhou, Meiqi
2018-04-11
Artemisia argyi Lev. et Vant. (A. argyi) is widely utilized for moxibustion in Chinese medicine, and the mechanism underlying terpenoid biosynthesis in its leaves is suggested to play an important role in its medicinal use. However, the A. argyi transcriptome has not been sequenced. Herein, we performed RNA sequencing for A. argyi leaf, root and stem tissues to identify as many as possible of the transcribed genes. In total, 99,807 unigenes were assembled by analysing the expression profiles generated from the three tissue types, and 67,446 of those unigenes were annotated in public databases. We further performed differential gene expression analysis to compare leaf tissue with the other two tissue types and identified numerous genes that were specifically expressed or up-regulated in leaf tissue. Specifically, we identified multiple genes encoding significant enzymes or transcription factors related to terpenoid synthesis. This study serves as a valuable resource for transcriptome information, as many transcribed genes related to terpenoid biosynthesis were identified in the A. argyi transcriptome, providing a functional genomic basis for additional studies on molecular mechanisms underlying the medicinal use of A. argyi.
Huang, Xiaoyun; Zang, Xiaonan; Wu, Fei; Jin, Yuming; Wang, Haitao; Liu, Chang; Ding, Yating; He, Bangxiang; Xiao, Dongfang; Song, Xinwei; Liu, Zhu
2017-01-01
Gracilariopsis lemaneiformis (aka Gracilaria lemaneiformis) is a red macroalga rich in phycoerythrin, which can capture light efficiently and transfer it to photosystem II. However, little is known about the synthesis of optically active phycoerythrin in G. lemaneiformis at the molecular level. With the advent of high-throughput sequencing technology, analysis of genetic information for G. lemaneiformis by transcriptome sequencing is an effective means to get a deeper insight into the molecular mechanism of phycoerythrin synthesis. Illumina technology was employed to sequence the transcriptome of two strains of G. lemaneiformis: the wild type and a green-pigmented mutant. We obtained a total of 86,915 assembled unigenes as a reference gene set, and 42,884 unigenes were annotated in at least one public database. Taking the above transcriptome sequencing as a reference gene set, 4,041 differentially expressed genes were screened to analyze and compare the gene expression profiles of the wild type and green mutant. By GO and KEGG pathway analysis, we concluded that three factors, including a reduction in the expression level of apo-phycoerythrin, an increase of chlorophyll light-harvesting complex synthesis, and reduction of phycoerythrobilin by competitive inhibition, caused the reduction of optically active phycoerythrin in the green-pigmented mutant.
Brahma, Rajeev Kungur; McCleary, Ryan J R; Kini, R Manjunatha; Doley, Robin
2015-01-01
Snake venoms are cocktails of protein toxins that play important roles in capture and digestion of prey. Significant qualitative and quantitative variation in snake venom composition has been observed among and within species. Understanding these variations in protein components is instrumental in interpreting clinical symptoms during human envenomation and in searching for novel venom proteins with potential therapeutic applications. In the last decade, transcriptomic analyses of venom glands have helped in understanding the composition of various snake venoms in great detail. Here we review transcriptomic analysis as a powerful tool for understanding venom profile, variation and evolution. Copyright © 2014 Elsevier Ltd. All rights reserved.
Meng, Xian-liang; Liu, Ping; Jia, Fu-long; Li, Jian; Gao, Bao-Quan
2015-01-01
The swimming crab Portunus trituberculatus is a commercially important crab species in East Asian countries. Gonadal development is a physiological process of great significance to the reproduction as well as commercial seed production for P. trituberculatus. However, little is currently known about the molecular mechanisms governing the developmental processes of gonads in this species. To open avenues of molecular research on P. trituberculatus gonadal development, Illumina paired-end sequencing technology was employed to develop deep-coverage transcriptome sequencing data for its gonads. Illumina sequencing generated 58,429,148 and 70,474,978 high-quality reads from the ovary and testis cDNA library, respectively. All these reads were assembled into 54,960 unigenes with an average sequence length of 879 bp, of which 12,340 unigenes (22.45% of the total) matched sequences in GenBank non-redundant database. Based on our transcriptome analysis as well as published literature, a number of candidate genes potentially involved in the regulation of gonadal development of P. trituberculatus were identified, such as FAOMeT, mPRγ, PGMRC1, PGDS, PGER4, 3β-HSD and 17β-HSDs. Differential expression analysis generated 5,919 differentially expressed genes between ovary and testis, among which many genes related to gametogenesis and several genes previously reported to be critical in differentiation and development of gonads were found, including Foxl2, Wnt4, Fst, Fem-1 and Sox9. Furthermore, 28,534 SSRs and 111,646 high-quality SNPs were identified in this transcriptome dataset. This work represents the first transcriptome analysis of P. trituberculatus gonads using next-generation sequencing technology and provides a valuable dataset for understanding molecular mechanisms controlling development of gonads and facilitating future investigation of reproductive biology in this species. The molecular markers obtained in this study will provide a fundamental basis for population genetics and functional genomics in P. trituberculatus and other closely related species. PMID:26042806
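The abstract above reports SSR counts without stating how the repeats were detected. As a rough illustration only, SSR mining can be sketched as a search for perfect tandem repeats of short motifs. The function name `find_ssrs`, the motif-length range, and the minimum repeat count below are assumptions made for this sketch; they are not the criteria or software used in the study.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=6, min_repeats=4):
    """Return sorted (start, unit, repeat_count) tuples for perfect tandem repeats.

    All thresholds are illustrative assumptions, not values from the study.
    """
    sequence = sequence.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter motif,
            # e.g. "ATAT" is really the dinucleotide "AT".
            if any(
                len(unit) % d == 0 and unit == unit[:d] * (len(unit) // d)
                for d in range(1, k)
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequence (invented): five copies of "ACG" flanked by "TT".
print(find_ssrs("TTACGACGACGACGACGTT"))
```

Dedicated SSR tools typically apply different minimum repeat counts per motif length and can report imperfect or compound repeats, which this sketch does not attempt.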
Narnoliya, Lokesh K; Kaushal, Girija; Singh, Sudhir P; Sangwan, Rajender S
2017-01-13
Rose-scented geranium (Pelargonium sp.) is a perennial herb that produces a high-value essential oil of fragrant significance due to the characteristic compositional blend of rose-oxide and acyclic monoterpenoids in foliage. Recently, the plant has also been shown to produce tartaric acid in leaf tissues. Rose-scented geranium represents a top-tier cash crop in terms of economic returns and significance of the plant and plant products. However, there has hardly been any study on its metabolism and functional genomics, nor is any genomic expression dataset available in the public domain. Therefore, to begin building a molecular understanding of the specialized metabolic pathways of the plant, de novo sequencing of the rose-scented geranium leaf transcriptome, transcript assembly, annotation, and expression profiling, as well as their validation, were carried out. De novo transcriptome analysis resulted in a total of 78,943 unique contigs (average length: 623 bp, and N50 length: 752 bp) from 15.44 million high-quality raw reads. In silico functional annotation led to the identification of several putative genes representing terpene, ascorbic acid and tartaric acid biosynthetic pathways, hormone metabolism, and transcription factors. Additionally, a total of 6,040 simple sequence repeat (SSR) motifs were identified in 6.8% of the expressed transcripts. The highest frequency of SSRs was of tri-nucleotides (50%). Further, the transcriptome assembly was validated for randomly selected putative genes by a standard PCR-based approach. In silico expression profiles of assembled contigs were validated by real-time PCR analysis of selected transcripts. As the first report on transcriptome analysis of rose-scented geranium, the data sets, leads and directions reflected in this investigation will serve as a foundation for pursuing and understanding molecular aspects of its biology, specialized metabolic pathways, metabolic engineering, genetic diversity, and molecular breeding.
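The abstract above quotes an N50 length for the assembly. For readers unfamiliar with the metric, the following is a minimal sketch of the standard N50 definition: the length of the contig at which the cumulative sum of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the example lengths are invented for illustration and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length


# Invented toy assembly: total length 30, so half is 15; 10 + 8 = 18 reaches it.
print(n50([10, 8, 6, 4, 2]))  # prints 8
```

Because N50 weights long contigs more heavily, it can exceed the average contig length, which is consistent with the two figures the abstract reports.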
Influence of socioeconomic status on the whole blood transcriptome in African Americans.
Gaye, Amadou; Gibbons, Gary H; Barry, Charles; Quarells, Rakale; Davis, Sharon K
2017-01-01
The correlation between low socioeconomic status (SES) and poor health outcome or higher risk of disease has been consistently reported by many epidemiological studies across various race/ancestry groups. However, the biological mechanisms linking low SES to disease and/or disease risk factors are not well understood and remain relatively under-studied. The analysis of the blood transcriptome is a promising window for elucidating how social and environmental factors influence the molecular networks governing health and disease. To further define the mechanistic pathways between social determinants and health, this study examined the impact of SES on the blood transcriptome in a sample of African-Americans. An integrative approach leveraging three complementary methods (Weighted Gene Co-expression Network Analysis, Random Forest and Differential Expression) was adopted to identify the most predictive and robust transcriptome pathways associated with SES. We analyzed the expression of 15079 genes (RNA-seq) from whole blood across 36 samples. The results revealed a cluster of 141 co-expressed genes over-expressed in the low SES group. Three pro-inflammatory pathways (IL-8 Signaling, NF-κB Signaling and Dendritic Cell Maturation) are activated in this module and over-expressed in low SES. Random Forest analysis revealed 55 of the 141 genes that, collectively, predict SES with an area under the curve of 0.85. One third of the 141 genes are significantly over-expressed in the low SES group. Lower SES has consistently been linked to many social and environmental conditions acting as stressors and known to be correlated with vulnerability to chronic illnesses (e.g. asthma, diabetes) associated with a chronic inflammatory state. Our unbiased analysis of the blood transcriptome in African-Americans revealed evidence of a robust molecular signature of increased inflammation associated with low SES. The results provide a plausible link between the social factors and chronic inflammation.
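The abstract above summarises classifier performance as an area under the curve (AUC). As a hedged illustration of what that number measures, the sketch below computes the ROC AUC directly from its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc`, the scores, and the labels are invented for this example; the study's own Random Forest pipeline is not reproduced here.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels must be 1 (positive class) or 0 (negative class).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Invented classifier scores for four samples (1 = one class, 0 = the other).
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # prints 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.85 indicates that the selected genes rank most sample pairs correctly.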
Alkan, Noam; Friedlander, Gilgi; Ment, Dana; Prusky, Dov; Fluhr, Robert
2015-01-01
The fungus Colletotrichum gloeosporioides breaches the fruit cuticle but remains quiescent until fruit ripening signals a switch to necrotrophy, culminating in devastating anthracnose disease. There is a need to understand the distinct fungal arms strategy and the simultaneous fruit response. Transcriptome analysis of fungal-fruit interactions was carried out concurrently in the appressoria, quiescent and necrotrophic stages. Conidia germinating on unripe fruit cuticle showed stage-specific transcription that was accompanied by massive fruit defense responses. The subsequent quiescent stage showed the development of dendritic-like structures and swollen hyphae within the fruit epidermis. The quiescent fungal transcriptome was characterized by activation of chromatin remodeling genes and unsuspected environmental alkalization. Fruit response was portrayed by continued highly integrated massive up-regulation of defense genes. During cuticle infection of green or ripe fruit, fungi recapitulate the same developmental stages but with differing quiescent time spans. The necrotrophic stage showed a dramatic shift in fungal metabolism and up-regulation of pathogenicity factors. Fruit response to necrotrophy showed activation of the salicylic acid pathway, climaxing in cell death. Transcriptome analysis of C. gloeosporioides infection of fruit reveals its distinct stage-specific lifestyle and the concurrent changing fruit response, deepening our perception of the unfolding fungal-fruit arms and defenses race. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
Liu, Lei; Fu, Yuanyuan; Zhu, Fang; Mu, Changkao; Li, Ronghua; Song, Weiwei; Shi, Ce; Ye, Yangfang; Wang, Chunlin
2018-06-05
The swimming crab (Portunus trituberculatus) is among the most economically important seawater crustacean species in Asia. Despite its commercial importance and well-studied status, genomic and transcriptomic data are scarce for this crab species. In the present study, limb bud tissue was collected at different developmental stages post amputation for transcriptomic analysis. Illumina RNA-sequencing was applied to characterise the limb regeneration transcriptome and identify the most characteristic genes. A total of 289,018 transcripts were obtained by clustering and assembly of clean reads, producing 150,869 unigenes with an average length of 956 bp. Subsequent analysis revealed WNT signalling as the key pathway involved in limb regeneration, with WNT4 a key mediator. Overall, limb regeneration appears to be regulated by multiple signalling pathways, with numerous cell differentiation, muscle growth, moult, metabolism, and immune-related genes upregulated, including WNT4, LAMA, FIP2, FSTL5, TNC, HUS1, SWI5, NCGL, SLC22, PLA2, Tdc2, SMOX, GDH, and SMPD4. This is the first experimental study done on regenerating claws of P. trituberculatus. These findings expand existing sequence resources for crab species, and will likely accelerate research into regeneration and development in crustaceans, particularly functional studies on genes involved in limb regeneration. Copyright © 2018 Elsevier B.V. All rights reserved.
Niu, Jun; Wang, Jia; An, Jiyong; Liu, Lili; Lin, Zixin; Wang, Rui; Wang, Libing; Ma, Chao; Shi, Lingling; Lin, Shanzhi
2016-01-01
Recently, our transcriptomic analysis has identified some functional genes responsible for oil biosynthesis in developing SASK, yet miRNA-mediated regulation for SASK development and oil accumulation is poorly understood. Here, 3 representative periods of 10, 30 and 60 DAF were selected for sRNA sequencing based on the dynamic patterns of growth tendency and oil content of developing SASK. By miRNA transcriptomic analysis, we characterized 296 known and 44 novel miRNAs in developing SASK, among which 36 known and 6 novel miRNAs respond specifically to developing SASK. Importantly, we performed an integrated analysis of mRNA and miRNA transcriptome as well as qRT-PCR detection to identify some key miRNAs and their targets (miR156-SPL, miR160-ARF18, miR164-NAC1, miR171h-SCL6, miR172-AP2, miR395-AUX22B, miR530-P2C37, miR393h-TIR1/AFB2 and psi-miRn5-SnRK2A) potentially involved in developing response and hormone signaling of SASK. Our results provide new insights into the important regulatory function of cross-talk between development response and hormone signaling for SASK oil accumulation. PMID:27762296
Transcriptome Dynamics during Maize Endosperm Development
Feng, Jiaojiao; Xu, Shutu; Wang, Lei; Li, Feifei; Li, Yibo; Zhang, Renhe; Zhang, Xinghua; Xue, Jiquan; Guo, Dongwei
2016-01-01
The endosperm is a major organ of the seed that plays vital roles in determining seed weight and quality. However, genome-wide transcriptome patterns throughout maize endosperm development have not been comprehensively investigated to date. Accordingly, we performed a high-throughput RNA sequencing (RNA-seq) analysis of the maize endosperm transcriptome at 5, 10, 15 and 20 days after pollination (DAP). We found that more than 11,000 protein-coding genes underwent alternative splicing (AS) events during the four developmental stages studied. These genes were mainly involved in intracellular protein transport, signal transmission, cellular carbohydrate metabolism, cellular lipid metabolism, lipid biosynthesis, protein modification, histone modification, cellular amino acid metabolism, and DNA repair. Additionally, 7,633 genes, including 473 transcription factors (TFs), were differentially expressed among the four developmental stages. The differentially expressed TFs were from 50 families, including the bZIP, WRKY, GeBP and ARF families. Further analysis of the stage-specific TFs showed that binding, nucleus and ligand-dependent nuclear receptor activities might be important at 5 DAP, that immune responses, signalling, binding and lumen development are involved at 10 DAP, that protein metabolic processes and the cytoplasm might be important at 15 DAP, and that the responses to various stimuli are different at 20 DAP compared with the other developmental stages. This RNA-seq analysis provides novel, comprehensive insights into the transcriptome dynamics during early endosperm development in maize. PMID:27695101
Zhang, Jin; Wang, Bing; Dong, Shuanglin; Cao, Depan; Dong, Junfeng; Walker, William B.; Liu, Yang; Wang, Guirong
2015-01-01
To better understand the olfactory mechanisms in the two lepidopteran pest model species, the Helicoverpa armigera and H. assulta, we conducted transcriptome analysis of the adult antennae using Illumina sequencing technology and compared the chemosensory genes between these two related species. Combined with the chemosensory genes we had identified previously in H. armigera by 454 sequencing, we identified 133 putative chemosensory unigenes in H. armigera including 60 odorant receptors (ORs), 19 ionotropic receptors (IRs), 34 odorant binding proteins (OBPs), 18 chemosensory proteins (CSPs), and 2 sensory neuron membrane proteins (SNMPs). Consistent with these results, 131 putative chemosensory genes including 64 ORs, 19 IRs, 29 OBPs, 17 CSPs, and 2 SNMPs were identified through male and female antennal transcriptome analysis in H. assulta. Reverse Transcription-PCR (RT-PCR) was conducted in H. assulta to examine the accuracy of the assembly and annotation of the transcriptome and the expression profile of these unigenes in different tissues. Most of the ORs, IRs and OBPs were enriched in adult antennae, while almost all the CSPs were expressed in antennae as well as legs. We compared the differences of the chemosensory genes between these two species in detail. Our work will surely provide valuable information for further functional studies of pheromones and host volatile recognition genes in these two related species. PMID:25659090
Cox-nnet: An artificial neural network method for prognosis prediction of high-throughput omics data
Ching, Travers; Zhu, Xun
2018-01-01
Artificial neural networks (ANN) are computing architectures with many interconnections of simple neural-inspired computing elements, and have been applied to biomedical fields such as imaging analysis and diagnosis. We have developed a new ANN framework called Cox-nnet to predict patient prognosis from high throughput transcriptomics data. In 10 TCGA RNA-Seq data sets, Cox-nnet achieves the same or better predictive accuracy compared to other methods, including Cox-proportional hazards regression (with LASSO, ridge, and minimax concave penalty), Random Forests Survival and CoxBoost. Cox-nnet also reveals richer biological information, at both the pathway and gene levels. The outputs from the hidden layer node provide an alternative approach for survival-sensitive dimension reduction. In summary, we have developed a new method for accurate and efficient prognosis prediction on high throughput data, with functional biological insights. The source code is freely available at https://github.com/lanagarmire/cox-nnet. PMID:29634719
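As a rough illustration of the objective that Cox-type survival models, including neural-network variants, typically minimize, the following Python sketch computes the negative log partial likelihood from per-patient risk scores. The function name, argument names and the simple (Breslow-style) treatment of tied event times are assumptions made for this example; it is not taken from the Cox-nnet source code.

```python
import math


def neg_log_partial_likelihood(risk_scores, times, events):
    """Negative Cox log partial likelihood.

    risk_scores: model outputs (log hazard ratios), one per patient
    times:       follow-up times, one per patient
    events:      1 if the event was observed, 0 if the patient was censored
    Hypothetical helper for illustration; ties get no special correction.
    """
    total = 0.0
    for theta_i, t_i, e_i in zip(risk_scores, times, events):
        if not e_i:
            # Censored patients contribute only through other patients' risk sets.
            continue
        # Risk set: everyone still under observation at time t_i.
        risk_set_sum = sum(
            math.exp(theta_j)
            for theta_j, t_j in zip(risk_scores, times)
            if t_j >= t_i
        )
        total += theta_i - math.log(risk_set_sum)
    return -total
```

In a network-based model the risk scores would come from the output layer, and this quantity (usually with a regularization term) would be minimized by gradient descent.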
RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application.
D'Antonio, Mattia; D'Onorio De Meo, Paolo; Pallocca, Matteo; Picardi, Ernesto; D'Erchia, Anna Maria; Calogero, Raffaele A; Castrignanò, Tiziana; Pesole, Graziano
2015-01-01
The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulatory pathways makes RNA-Seq one of the most complex fields of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Through a user-friendly web interface, the RAP workflow can be suitably customized by the user and is automatically executed on our cloud computing environment. This strategy gives access to bioinformatics tools and computational resources without requiring specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user's needs.
Computational Tools for Stem Cell Biology
Bian, Qin; Cahan, Patrick
2016-01-01
For over half a century, the field of developmental biology has leveraged computation to explore mechanisms of developmental processes. More recently, computational approaches have been critical in the translation of high throughput data into knowledge of both developmental and stem cell biology. In the last several years, a new sub-discipline of computational stem cell biology has emerged that synthesizes the modeling of systems-level aspects of stem cells with high-throughput molecular data. In this review, we provide an overview of this new field and pay particular attention to the impact that single-cell transcriptomics is expected to have on our understanding of development and our ability to engineer cell fate. PMID:27318512
Computational Tools for Stem Cell Biology.
Bian, Qin; Cahan, Patrick
2016-12-01
For over half a century, the field of developmental biology has leveraged computation to explore mechanisms of developmental processes. More recently, computational approaches have been critical in the translation of high throughput data into knowledge of both developmental and stem cell biology. In the past several years, a new subdiscipline of computational stem cell biology has emerged that synthesizes the modeling of systems-level aspects of stem cells with high-throughput molecular data. In this review, we provide an overview of this new field and pay particular attention to the impact that single cell transcriptomics is expected to have on our understanding of development and our ability to engineer cell fate. Copyright © 2016 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haggard, Derik E.; Noyes, Pamela D.; Waters, Katrina M.
There is a need to develop novel, high-throughput screening and prioritization methods to identify chemicals with adverse estrogen, androgen, and thyroid activity in order to protect human health and the environment; such methods are of interest to the Endocrine Disruptor Screening Program. The current aim is to explore the utility of zebrafish as a testing paradigm to classify endocrine activity using phenotypically anchored transcriptome profiling. Transcriptome analysis was conducted on embryos exposed to 25 estrogen-, androgen-, or thyroid-active chemicals at a concentration that elicited adverse malformations or mortality at 120 hours post-fertilization in 80% of the animals exposed. Analysis of the top 1000 significant differentially expressed transcripts across all treatments identified a unique transcriptional and phenotypic profile for thyroid hormone receptor agonists, which can be used as a biomarker screen for potential thyroid hormone agonists.
Ni, Jun; Dong, Lixiang; Jiang, Zhifang; Yang, Xiuli; Chen, Ziying; Wu, Yuhuan; Xu, Maojun
2018-01-01
Ginkgo leaves are raw materials for flavonoid extraction. Thus, the timing of their harvest is important to optimize the extraction efficiency, which benefits the pharmaceutical industry. In this research, we compared the transcriptomes of Ginkgo leaves harvested at midday and midnight. The differentially expressed genes with the highest probabilities in each step of flavonoid biosynthesis were down-regulated at midnight. Furthermore, real-time PCR corroborated the transcriptome results, indicating the decrease in flavonoid biosynthesis at midnight. The flavonoid profiles of Ginkgo leaves harvested at midday and midnight were compared, and the total flavonoid content decreased at midnight. A detailed analysis of individual flavonoids showed that most of their contents were decreased by various degrees. Our results indicated that circadian rhythms affected the flavonoid contents in Ginkgo leaves, which provides valuable information for optimizing their harvesting times to benefit the pharmaceutical industry.
Ma, Yibao; Zhao, Yong; Zhao, Ruiming; Zhang, Weiping; He, Yawen; Wu, Yingliang; Cao, Zhijian; Guo, Lin; Li, Wenxin
2010-07-01
Scorpion venoms contain a vast untapped reservoir of natural products, which have the potential for medicinal value in drug discovery. In this study, toxin components from the scorpion Heterometrus petersii venom were evaluated by transcriptome and proteome analysis. Ten known families of venom peptides and proteins were identified, which include: two families of potassium channel toxins, four families of antimicrobial and cytolytic peptides, and one family from each of the calcium channel toxins, La1-like peptides, phospholipase A2, and the serine proteases. In addition, we also identified 12 atypical families, which include the acid phosphatases, diuretic peptides, and ten orphan families. From the data presented here, the extreme diversity and convergence of toxic components in scorpion venom was uncovered. Our work demonstrates the power of combining transcriptomic and proteomic approaches in the study of animal venoms.
Transcriptome analysis and related databases of Lactococcus lactis.
Kuipers, Oscar P; de Jong, Anne; Baerends, Richard J S; van Hijum, Sacha A F T; Zomer, Aldert L; Karsens, Harma A; den Hengst, Chris D; Kramer, Naomi E; Buist, Girbe; Kok, Jan
2002-08-01
Several complete genome sequences of Lactococcus lactis and their annotations will become available in the near future, next to the already published genome sequence of L. lactis ssp. lactis IL 1403. This will allow intraspecies comparative genomics studies as well as functional genomics studies aimed at a better understanding of physiological processes and regulatory networks operating in lactococci. This paper describes the initial set-up of a DNA-microarray facility in our group, to enable transcriptome analysis of various Gram-positive bacteria, including a ssp. lactis and a ssp. cremoris strain of Lactococcus lactis. Moreover a global description will be given of the hardware and software requirements for such a set-up, highlighting the crucial integration of relevant bioinformatics tools and methods. This includes the development of MolGenIS, an information system for transcriptome data storage and retrieval, and LactococCye, a metabolic pathway/genome database of Lactococcus lactis.
Transcriptome profile of Trichoderma harzianum IOC-3844 induced by sugarcane bagasse.
Horta, Maria Augusta Crivelente; Vicentini, Renato; Delabona, Priscila da Silva; Laborda, Prianda; Crucello, Aline; Freitas, Sindélia; Kuroshu, Reginaldo Massanobu; Polikarpov, Igor; Pradella, José Geraldo da Cruz; Souza, Anete Pereira
2014-01-01
Profiling the transcriptome that underlies biomass degradation by the fungus Trichoderma harzianum allows the identification of gene sequences with potential application in enzymatic hydrolysis processing. In the present study, the transcriptome of T. harzianum IOC-3844 was analyzed using RNA-seq technology. The sequencing generated 14.7 Gbp for downstream analyses. De novo assembly resulted in 32,396 contigs, which were submitted for identification and classified according to their identities. This analysis allowed us to define a principal set of T. harzianum genes that are involved in the degradation of cellulose and hemicellulose and the accessory genes that are involved in the depolymerization of biomass. An additional analysis of expression levels identified a set of carbohydrate-active enzymes that are upregulated under different conditions. The present study provides valuable information for future studies on biomass degradation and contributes to a better understanding of the role of the genes that are involved in this process.
USDA-ARS's Scientific Manuscript database
There are many plant pathogen-specific diagnostic assays, based on PCR and immune-detection. However, the ability to test for large numbers of pathogens simultaneously is lacking. Next generation sequencing (NGS) allows one to detect all organisms within a given sample, but has computational limitat...
Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng
2016-01-01
Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides.
Sonnack, Laura; Klawonn, Thorsten; Kriehuber, Ralf; Hollert, Henner; Schäfers, Christoph; Fenske, Martina
2018-03-01
Metal toxicity is a global environmental challenge. Fish are particularly prone to metal exposure, which can be lethal or cause sublethal physiological impairments. The objective of this study was to investigate how adverse effects of chronic exposure to non-toxic levels of essential and non-essential metals in early life stage zebrafish may be explained by changes in the transcriptome. We therefore studied the effects of three different metals at low concentrations in zebrafish embryos by transcriptomics analysis. The study design compared exposure effects caused by different metals at different developmental stages (pre-hatch and post-hatch). Wild-type embryos were exposed to solutions of low concentrations of copper (CuSO4), cadmium (CdCl2) and cobalt (CoSO4) until 96 h post-fertilization (hpf) and microarray experiments were carried out to determine transcriptome profiles at 48 and 96 hpf. We found that the toxic metal cadmium affected the expression of more genes at 96 hpf than at 48 hpf. The opposite effect was observed for the essential metals cobalt and copper, which also showed enrichment of different GO terms. Genes involved in neuromast and motor neuron development were significantly enriched, agreeing with our previous results showing motor neuron and neuromast damage in the embryos. Our data provide evidence that the response of the transcriptome of fish embryos to metal exposure differs for essential and non-essential metals. Copyright © 2017 Elsevier Inc. All rights reserved.
Shah, Faheem Afzal; Wang, Qiaojian; Wang, Zhaocheng; Wu, Lifang
2018-01-01
Pecan is an economically important nut crop tree due to its unique texture and flavor properties. The pecan seed is rich in unsaturated fatty acids and protein. However, little is known about the molecular mechanisms of fatty acid biosynthesis in the developing seeds. In this study, transcriptome sequencing of the developing seeds was performed using Illumina sequencing technology. Pecan seed embryos at different developmental stages were collected and sequenced. The transcriptomes of pecan seeds at two key developmental stages (PA, the initial stage, and PS, the fast oil accumulation stage) were also compared. A total of 82,155 unigenes, with an average length of 1,198 bp, were generated from seven independent libraries. After functional annotation, we detected approximately 55,854 CDS, among which 2,807 were transcription factor (TF)-coding unigenes. Further, 13,325 unigenes showed a 2-fold or greater expression difference between the two groups of libraries (two developmental stages). After transcriptome analysis, we identified abundant unigenes that could be involved in fatty acid biosynthesis, degradation and some other aspects of seed development in pecan. This study presents a comprehensive dataset of transcriptomic changes during the seed development of pecan. It provides insight into the molecular mechanisms responsible for fatty acid biosynthesis during seed development. The identification of functional genes will also be useful for the molecular breeding of pecan. PMID:29694395
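The 2-fold expression criterion mentioned in the abstract can be pictured with a small Python sketch. The abstract does not describe the normalization or statistical test that was used, so the function name, the dictionary inputs and the pseudocount below are assumptions chosen purely for illustration.

```python
import math


def two_fold_changed(expr_a, expr_b, pseudocount=1.0):
    """Return {unigene: log2 fold change} for unigenes changing at least 2-fold.

    expr_a, expr_b: dicts mapping unigene IDs to normalized expression values
                    in the two stages being compared (hypothetical inputs).
    pseudocount:    assumed small constant that avoids division by zero.
    """
    selected = {}
    for gene in expr_a.keys() & expr_b.keys():
        log2fc = math.log2((expr_b[gene] + pseudocount) / (expr_a[gene] + pseudocount))
        # |log2 fold change| >= 1 is the same as a 2-fold or greater difference.
        if abs(log2fc) >= 1.0:
            selected[gene] = log2fc
    return selected
```

Positive values indicate higher expression in the second stage and negative values indicate lower expression; a real analysis would normally combine such a threshold with a significance test.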
Xu, Zheng; Ni, Jun; Shah, Faheem Afzal; Wang, Qiaojian; Wang, Zhaocheng; Wu, Lifang; Fu, Songling
2018-01-01
Pecan is an economically important nut crop tree due to its unique texture and flavor properties. The pecan seed is rich in unsaturated fatty acids and protein. However, little is known about the molecular mechanisms of fatty acid biosynthesis in the developing seeds. In this study, transcriptome sequencing of the developing seeds was performed using Illumina sequencing technology. Pecan seed embryos at different developmental stages were collected and sequenced. The transcriptomes of pecan seeds at two key developmental stages (PA, the initial stage, and PS, the fast oil accumulation stage) were also compared. A total of 82,155 unigenes, with an average length of 1,198 bp, were generated from seven independent libraries. After functional annotation, we detected approximately 55,854 CDS, among which 2,807 were transcription factor (TF)-coding unigenes. Further, 13,325 unigenes showed a 2-fold or greater expression difference between the two groups of libraries (two developmental stages). After transcriptome analysis, we identified abundant unigenes that could be involved in fatty acid biosynthesis, degradation and some other aspects of seed development in pecan. This study presents a comprehensive dataset of transcriptomic changes during the seed development of pecan. It provides insight into the molecular mechanisms responsible for fatty acid biosynthesis during seed development. The identification of functional genes will also be useful for the molecular breeding of pecan.
Blood transcriptomics and metabolomics for personalized medicine.
Li, Shuzhao; Todor, Andrei; Luo, Ruiyan
2016-01-01
Molecular analysis of blood samples is pivotal to clinical diagnosis and has been intensively investigated since the rise of systems biology. Recent developments have opened new opportunities to utilize transcriptomics and metabolomics for personalized and precision medicine. Efforts from human immunology have infused into this area exquisite characterizations of subpopulations of blood cells. It is now possible to infer from blood transcriptomics, with fine accuracy, the contribution of immune activation and of cell subpopulations. In parallel, high-resolution mass spectrometry has brought revolutionary analytical capability, detecting > 10,000 metabolites, together with environmental exposure, dietary intake, microbial activity, and pharmaceutical drugs. Thus, the re-examination of blood chemicals by metabolomics is in order. Transcriptomics and metabolomics can be integrated to provide a more comprehensive understanding of the human biological states. We will review these new data and methods and discuss how they can contribute to personalized medicine.
Niu, Donghong; Wang, Fei; Xie, Shumei; Sun, Fanyue; Wang, Ze; Peng, Maoxiao; Li, Jiale
2016-04-01
The razor clam Sinonovacula constricta is an important commercial species. The lack of developmental transcriptomic data has become a bottleneck for further research on the mechanisms underlying settlement and metamorphosis in early development. In this study, de novo transcriptome sequencing was performed for S. constricta at different early developmental stages by using Illumina HiSeq 2000 paired-end (PE) sequencing technology. A total of 112,209,077 PE clean reads were generated. De novo assembly generated 249,795 contigs with an average length of 585 bp. Gene annotation resulted in the identification of 22,870 unigene hits against the NCBI database. Eight unique sequences related to metamorphosis were identified and analyzed using real-time PCR. The razor clam reference transcriptome would provide useful information on early developmental and metamorphosis mechanisms and could be used in the genetic breeding of shellfish.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.
2012-04-25
Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify locations of interest within the genome and evaluate potential mis-annotations.
Targeted exploration and analysis of large cross-platform human transcriptomic compendia
Zhu, Qian; Wong, Aaron K; Krishnan, Arjun; Aure, Miriam R; Tadych, Alicja; Zhang, Ran; Corney, David C; Greene, Casey S; Bongo, Lars A; Kristensen, Vessela N; Charikar, Moses; Li, Kai; Troyanskaya, Olga G.
2016-01-01
We present SEEK (http://seek.princeton.edu), a query-based search engine across very large transcriptomic data collections, including thousands of human data sets from almost 50 microarray and next-generation sequencing platforms. SEEK uses a novel query-level cross-validation-based algorithm to automatically prioritize data sets relevant to the query and a robust search approach to identify query-coregulated genes, pathways, and processes. SEEK provides cross-platform handling, multi-gene query search, iterative metadata-based search refinement, and extensive visualization-based analysis options. PMID:25581801
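The abstract names a query-level cross-validation step for prioritizing data sets but gives no details. The Python sketch below shows one simplified way such a weighting could work: each query gene is held out in turn, all other genes are ranked by their mean Pearson correlation with the remaining query genes, and the data set is scored by how highly the held-out gene ranks. The function names, the dictionary-based data layout and the reciprocal-rank score are assumptions for illustration and should not be read as SEEK's actual implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def dataset_weight(dataset, query):
    """Score one data set by leave-one-out retrieval of the query genes.

    dataset: dict mapping gene name -> list of expression values (hypothetical layout)
    query:   list of query gene names
    Returns the mean reciprocal rank of each held-out query gene (0.0 if fewer
    than two query genes are present in the data set).
    """
    present = [g for g in query if g in dataset]
    if len(present) < 2:
        return 0.0
    scores = []
    for held_out in present:
        rest = [g for g in present if g != held_out]
        # Rank every gene outside the training part of the query,
        # highest mean correlation first.
        candidates = [g for g in dataset if g not in rest]
        ranked = sorted(
            candidates,
            key=lambda g: -sum(pearson(dataset[g], dataset[r]) for r in rest) / len(rest),
        )
        scores.append(1.0 / (ranked.index(held_out) + 1))
    return sum(scores) / len(scores)
```

Data sets in which the query genes are coherently co-expressed receive weights near 1, so they would dominate a weighted combination of per-data-set gene rankings, while data sets in which the query genes are unrelated receive low weights.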
RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application
2015-01-01
Background: The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulatory pathways makes RNA-Seq one of the most complex fields of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). Moreover, the huge volume of data generated by NGS platforms introduces unprecedented computational and technological challenges to efficiently analyze and store sequence data and results. Methods: In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Results: Through a user-friendly web interface, the RAP workflow can be suitably customized by the user and is automatically executed on our cloud computing environment. This strategy gives access to bioinformatics tools and computational resources without requiring specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user's needs. PMID:26046471
Strain-Dependent Transcriptome Signatures for Robustness in Lactococcus lactis
Dijkstra, Annereinou R.; Alkema, Wynand; Starrenburg, Marjo J. C.; van Hijum, Sacha A. F. T.; Bron, Peter A.
2016-01-01
Recently, we demonstrated that fermentation conditions have a strong impact on subsequent survival of Lactococcus lactis strain MG1363 during heat and oxidative stress, two important parameters during spray drying. Moreover, employment of a transcriptome-phenotype matching approach revealed groups of genes associated with robustness towards heat and/or oxidative stress. To investigate if other strains have similar or distinct transcriptome signatures for robustness, we applied an identical transcriptome-robustness phenotype matching approach on the L. lactis strains IL1403, KF147 and SK11, which have previously been demonstrated to display highly diverse robustness phenotypes. These strains were subjected to an identical fermentation regime as was performed earlier for strain MG1363 and consisted of twelve conditions, varying in the level of salt and/or oxygen, as well as fermentation temperature and pH. In the exponential phase of growth, cells were harvested for transcriptome analysis and assessment of heat and oxidative stress survival phenotypes. The variation in fermentation conditions resulted in differences in heat and oxidative stress survival of up to five 10-log units. Effects of the fermentation conditions on stress survival of the L. lactis strains were typically strain-dependent, although the fermentation conditions had mainly similar effects on the growth characteristics of the different strains. By association of the transcriptomes and robustness phenotypes highly strain-specific transcriptome signatures for robustness towards heat and oxidative stress were identified, indicating that multiple mechanisms exist to increase robustness and, as a consequence, robustness of each strain requires individual optimization. 
However, a relatively small overlap in the transcriptome responses of the strains was also identified and this generic transcriptome signature included genes previously associated with stress (ctsR and lplL) and novel genes, including nanE and genes encoding transport proteins. The transcript levels of these genes can function as indicators of robustness and could aid in selection of fermentation parameters, potentially resulting in more optimal robustness during spray drying. PMID:27973578
2011-01-01
Background Amaranthus hypochondriacus, a grain amaranth, is a C4 plant noted by its ability to tolerate stressful conditions and produce highly nutritious seeds. These possess an optimal amino acid balance and constitute a rich source of health-promoting peptides. Although several recent studies, mostly involving subtractive hybridization strategies, have contributed to increase the relatively low number of grain amaranth expressed sequence tags (ESTs), transcriptomic information of this species remains limited, particularly regarding tissue-specific and biotic stress-related genes. Thus, a large scale transcriptome analysis was performed to generate stem- and (a)biotic stress-responsive gene expression profiles in grain amaranth. Results A total of 2,700,168 raw reads were obtained from six 454 pyrosequencing runs, which were assembled into 21,207 high quality sequences (20,408 isotigs + 799 contigs). The average sequence length was 1,064 bp and 930 bp for isotigs and contigs, respectively. Only 5,113 singletons were recovered after quality control. Contigs/isotigs were further incorporated into 15,667 isogroups. All unique sequences were queried against the nr, TAIR, UniRef100, UniRef50 and Amaranthaceae EST databases for annotation. Functional GO annotation was performed with all contigs/isotigs that produced significant hits with the TAIR database. Only 8,260 sequences were found to be homologous when the transcriptomes of A. tuberculatus and A. hypochondriacus were compared, most of which were associated with basic house-keeping processes. Digital expression analysis identified 1,971 differentially expressed genes in response to at least one of four stress treatments tested. These included several multiple-stress-inducible genes that could represent potential candidates for use in the engineering of stress-resistant plants. 
The transcriptomic data generated from pigmented stems shared similarity with findings reported in developing stems of Arabidopsis and black cottonwood (Populus trichocarpa). Conclusions This study represents the first large-scale transcriptomic analysis of A. hypochondriacus, considered to be a highly nutritious and stress-tolerant crop. Numerous genes were found to be induced in response to (a)biotic stress, many of which could further the understanding of the mechanisms that contribute to multiple stress-resistance in plants, a trait that has potential biotechnological applications in agriculture. PMID:21752295
KONAGAbase: a genomic and transcriptomic database for the diamondback moth, Plutella xylostella.
Jouraku, Akiya; Yamamoto, Kimiko; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Narukawa, Junko; Miyamoto, Kazuhisa; Kurita, Kanako; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Noda, Hiroaki
2013-07-09
The diamondback moth (DBM), Plutella xylostella, is one of the most harmful insect pests for crucifer crops worldwide. DBM has rapidly evolved high resistance to most conventional insecticides such as pyrethroids, organophosphates, fipronil, spinosad, Bacillus thuringiensis, and diamides. Therefore, it is important to develop genomic and transcriptomic DBM resources for analysis of genes related to insecticide resistance, both to clarify the mechanism of resistance of DBM and to facilitate the development of insecticides with a novel mode of action for more effective and environmentally less harmful insecticide rotation. To contribute to this goal, we developed KONAGAbase, a genomic and transcriptomic database for DBM (KONAGA is the Japanese word for DBM). KONAGAbase provides (1) transcriptomic sequences of 37,340 ESTs/mRNAs and 147,370 RNA-seq contigs which were clustered and assembled into 84,570 unigenes (30,695 contigs, 50,548 pseudo singletons, and 3,327 singletons); and (2) genomic sequences of 88,530 WGS contigs with 246,244 degenerate contigs and 106,455 singletons from which 6,310 de novo identified repeat sequences and 34,890 predicted gene-coding sequences were extracted. The unigenes and predicted gene-coding sequences were clustered and 32,800 representative sequences were extracted as a comprehensive putative gene set. These sequences were annotated with BLAST descriptions, Gene Ontology (GO) terms, and Pfam descriptions, respectively. KONAGAbase contains rich graphical user interface (GUI)-based web interfaces for easy and efficient searching, browsing, and downloading sequences and annotation data. Five useful search interfaces consisting of BLAST search, keyword search, BLAST result-based search, GO tree-based search, and genome browser are provided. KONAGAbase is publicly available from our website (http://dbm.dna.affrc.go.jp/px/) through standard web browsers. 
KONAGAbase provides DBM comprehensive transcriptomic and draft genomic sequences with useful annotation information with easy-to-use web interfaces, which helps researchers to efficiently search for target sequences such as insect resistance-related genes. KONAGAbase will be continuously updated and additional genomic/transcriptomic resources and analysis tools will be provided for further efficient analysis of the mechanism of insecticide resistance and the development of effective insecticides with a novel mode of action for DBM.
RNA-seq analysis of broiler liver transcriptome reveals novel responses to high ambient temperature.
Coble, Derrick J; Fleming, Damarius; Persia, Michael E; Ashwell, Chris M; Rothschild, Max F; Schmidt, Carl J; Lamont, Susan J
2014-12-10
In broilers, high ambient temperature can result in reduced feed consumption, digestive inefficiency, impaired metabolism, and even death. The broiler sector of the U.S. poultry industry incurs approximately $52 million in heat-related losses annually. The objective of this study is to characterize the effects of cyclic high ambient temperature on the transcriptome of a metabolically active organ, the liver. This study provides novel insight into the effects of high ambient temperature on metabolism in broilers, because it is the first reported RNA-seq study to characterize the effect of heat on the transcriptome of a metabolic-related tissue. This information provides a platform for future investigations to further elucidate physiologic responses to high ambient temperature and seek methods to ameliorate the negative impacts of heat. Transcriptome sequencing of the livers of 8 broiler males using Illumina HiSeq 2000 technology resulted in 138 million, 100-base pair single end reads, yielding a total of 13.8 gigabases of sequence. Forty genes were differentially expressed at a significance level of P-value < 0.05 and a fold-change ≥ 2 in response to a week of cyclic high ambient temperature with 27 down-regulated and 13 up-regulated genes. Two gene networks were created from the function-based Ingenuity Pathway Analysis (IPA) of the differentially expressed genes: "Cell Signaling" and "Endocrine System Development and Function". The gene expression differences in the liver transcriptome of the heat-exposed broilers reflected physiological responses to decrease internal temperature, reduce hyperthermia-induced apoptosis, and promote tissue repair. Additionally, the differential gene expression revealed a physiological response to regulate the perturbed cellular calcium levels that can result from high ambient temperature exposure. 
Exposure to cyclic high ambient temperature results in changes at the metabolic, physiologic, and cellular level that can be characterized through RNA-seq analysis of the liver transcriptome of broilers. The findings highlight specific physiologic mechanisms by which broilers reduce the effects of exposure to high ambient temperature. This information provides a foundation for future investigations into the gene networks involved in the broiler stress response and for development of strategies to ameliorate the negative impacts of heat on animal production and welfare.
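The selection rule quoted in the abstract above (P-value < 0.05 and fold-change ≥ 2) can be sketched as a small filter. This is only an illustration under stated assumptions: the function name, the gene labels and their values are hypothetical, and the fold-change threshold is assumed to apply symmetrically to up- and down-regulated genes (ratio ≥ 2 or ≤ 0.5). The study's actual statistical pipeline is not described here.

```python
import math

def is_differentially_expressed(fold_change, p_value, min_fold=2.0, alpha=0.05):
    """Return True if a gene passes both thresholds.

    fold_change is the treated/control expression ratio (must be > 0).
    The fold threshold is applied on the log2 scale so that a ratio of
    0.5 (two-fold down) is treated like a ratio of 2 (two-fold up).
    """
    return p_value < alpha and abs(math.log2(fold_change)) >= math.log2(min_fold)

# Hypothetical genes: (fold change, p-value). Not data from the study.
genes = {
    "geneA": (2.5, 0.01),   # up-regulated, significant
    "geneB": (0.4, 0.03),   # down-regulated, significant
    "geneC": (1.3, 0.001),  # significant but small change
    "geneD": (3.0, 0.20),   # large change but not significant
}

selected = {g for g, (fc, p) in genes.items() if is_differentially_expressed(fc, p)}
up = sorted(g for g in selected if genes[g][0] > 1)
down = sorted(g for g in selected if genes[g][0] < 1)

# Sequencing yield reported in the abstract: reads x read length.
total_bases = 138_000_000 * 100  # 13.8 gigabases
```

The same arithmetic confirms the reported totals: 27 down-regulated plus 13 up-regulated genes give the forty differentially expressed genes, and 138 million reads of 100 bases give 13.8 gigabases.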
Tao, Si-Qi; Cao, Bin; Tian, Cheng-Ming; Liang, Ying-Mei
2017-08-23
Rust fungi constitute the largest group of plant fungal pathogens. However, a paucity of data, including genomic sequences, transcriptome sequences, and associated molecular markers, hinders the development of inhibitory compounds and prevents their analysis from an evolutionary perspective. Gymnosporangium yamadae and G. asiaticum are two closely related rust fungi and ecologically and economically important pathogens; they cause apple rust and pear rust, respectively, both of which have proved devastating to orchards. In this study, we investigated the transcriptomes of these two Gymnosporangium species during the telial stage of their lifecycles. The aim of this study was to understand the evolutionary patterns of these two related fungi and to identify genes that developed by selection. The transcriptomes of G. yamadae and G. asiaticum were generated from a mixture of RNA from three biological replicates of each species. We obtained 49,318 and 54,742 transcripts, with N50 values of 1957 and 1664, for G. yamadae and G. asiaticum, respectively. We also identified a repertoire of candidate effectors and other gene families associated with pathogenicity. A total of 4947 pairs of putative orthologues between the two species were identified. Estimation of the non-synonymous/synonymous substitution rate ratios for these orthologues identified 116 pairs with Ka/Ks values greater than 1 that are under positive selection and 170 pairs with Ka/Ks values of 1 that are under neutral selection, whereas the remaining 4661 pairs are subject to purifying selection. We estimate that the divergence time between the two species is approximately 5.2 Mya. This study constitutes a de novo assembly and comparative analysis between the transcriptomes of the two rust species G. yamadae and G. asiaticum. The results identified several orthologous genes, and many expressed genes were identified by annotation. 
Our analysis of Ka/Ks ratios identified orthologous genes subjected to positive or purifying selection. An evolutionary analysis of these two species provided a relatively precise divergence time. Overall, the information obtained in this study increases the genetic resources available for research on the genetic diversity of the Gymnosporangium genus.
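The Ka/Ks classification used in the abstract above can be written as a short sketch. The function name and the `tol` parameter are assumptions made for illustration; the abstract does not say how ratios "of 1" were delimited, nor which software estimated Ka and Ks.

```python
def classify_selection(ka, ks, tol=0.0):
    """Classify an orthologue pair by its Ka/Ks ratio.

    ka:  non-synonymous substitution rate
    ks:  synonymous substitution rate (must be > 0)
    tol: hypothetical tolerance around 1 for calling a pair neutral
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"

# Counts reported in the abstract; the three classes partition all pairs.
reported = {"positive": 116, "neutral": 170, "purifying": 4661}
total_pairs = sum(reported.values())  # 4947
```

The three reported classes sum to the 4947 orthologue pairs, so about 2.3% of pairs (116 of 4947) fall in the positive-selection class.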
Comparison of software packages for detecting differential expression in RNA-seq studies
Seyednasrollah, Fatemeh; Laiho, Asta
2015-01-01
RNA-sequencing (RNA-seq) has rapidly become a popular tool to characterize transcriptomes. A fundamental research problem in many RNA-seq studies is the identification of reliable molecular markers that show differential expression between distinct sample groups. Together with the growing popularity of RNA-seq, a number of data analysis methods and pipelines have already been developed for this task. Currently, however, there is no clear consensus about the best practices yet, which makes the choice of an appropriate method a daunting task especially for a basic user without a strong statistical or computational background. To assist the choice, we perform here a systematic comparison of eight widely used software packages and pipelines for detecting differential expression between sample groups in a practical research setting and provide general guidelines for choosing a robust pipeline. In general, our results demonstrate how the data analysis tool utilized can markedly affect the outcome of the data analysis, highlighting the importance of this choice. PMID:24300110
Comparison of software packages for detecting differential expression in RNA-seq studies.
Seyednasrollah, Fatemeh; Laiho, Asta; Elo, Laura L
2015-01-01
RNA-sequencing (RNA-seq) has rapidly become a popular tool to characterize transcriptomes. A fundamental research problem in many RNA-seq studies is the identification of reliable molecular markers that show differential expression between distinct sample groups. Together with the growing popularity of RNA-seq, a number of data analysis methods and pipelines have already been developed for this task. Currently, however, there is no clear consensus about the best practices yet, which makes the choice of an appropriate method a daunting task especially for a basic user without a strong statistical or computational background. To assist the choice, we perform here a systematic comparison of eight widely used software packages and pipelines for detecting differential expression between sample groups in a practical research setting and provide general guidelines for choosing a robust pipeline. In general, our results demonstrate how the data analysis tool utilized can markedly affect the outcome of the data analysis, highlighting the importance of this choice. © The Author 2013. Published by Oxford University Press.
Brooks, Matthew J.; Rajasimha, Harsha K.; Roger, Jerome E.
2011-01-01
Purpose Next-generation sequencing (NGS) has revolutionized systems-based analysis of cellular pathways. The goals of this study are to compare NGS-derived retinal transcriptome profiling (RNA-seq) to microarray and quantitative reverse transcription polymerase chain reaction (qRT–PCR) methods and to evaluate protocols for optimal high-throughput data analysis. Methods Retinal mRNA profiles of 21-day-old wild-type (WT) and neural retina leucine zipper knockout (Nrl−/−) mice were generated by deep sequencing, in triplicate, using Illumina GAIIx. The sequence reads that passed quality filters were analyzed at the transcript isoform level with two methods: Burrows–Wheeler Aligner (BWA) followed by ANOVA (ANOVA) and TopHat followed by Cufflinks. qRT–PCR validation was performed using TaqMan and SYBR Green assays. Results Using an optimized data analysis workflow, we mapped about 30 million sequence reads per sample to the mouse genome (build mm9) and identified 16,014 transcripts in the retinas of WT and Nrl−/− mice with BWA workflow and 34,115 transcripts with TopHat workflow. RNA-seq data confirmed stable expression of 25 known housekeeping genes, and 12 of these were validated with qRT–PCR. RNA-seq data had a linear relationship with qRT–PCR for more than four orders of magnitude and a goodness of fit (R2) of 0.8798. Approximately 10% of the transcripts showed differential expression between the WT and Nrl−/− retina, with a fold change ≥1.5 and p value <0.05. Altered expression of 25 genes was confirmed with qRT–PCR, demonstrating the high degree of sensitivity of the RNA-seq method. Hierarchical clustering of differentially expressed genes uncovered several as yet uncharacterized genes that may contribute to retinal function. Data analysis with BWA and TopHat workflows revealed a significant overlap yet provided complementary insights in transcriptome profiling. 
Conclusions Our study represents the first detailed analysis of retinal transcriptomes, with biologic replicates, generated by RNA-seq technology. The optimized data analysis workflows reported here should provide a framework for comparative investigations of expression profiles. Our results show that NGS offers a comprehensive and more accurate quantitative and qualitative evaluation of mRNA content within a cell or tissue. We conclude that RNA-seq based transcriptome characterization would expedite genetic network analyses and permit the dissection of complex biologic functions. PMID:22162623
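The goodness of fit (R2) reported above for RNA-seq versus qRT–PCR can be illustrated with a minimal sketch. It assumes a simple linear regression on log10-transformed expression values, for which R2 equals the squared Pearson correlation; the function name and the example values are hypothetical and are not taken from the study.

```python
import math

def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x.

    For one predictor with an intercept, R^2 equals the squared
    Pearson correlation between x and y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)

# Hypothetical expression values spanning four orders of magnitude.
rnaseq = [1.0, 10.0, 100.0, 1000.0, 10000.0]
qpcr = [2.0, 20.0, 200.0, 2000.0, 20000.0]  # exactly proportional

# Compare on the log10 scale, as is usual for wide dynamic ranges.
log_rnaseq = [math.log10(v) for v in rnaseq]
log_qpcr = [math.log10(v) for v in qpcr]
fit = r_squared(log_rnaseq, log_qpcr)  # close to 1 for proportional data
```

Real measurements scatter around the fitted line, which is why the reported value (0.8798) is below 1.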
Juranic Lisnic, Vanda; Babic Cac, Marina; Lisnic, Berislav; Trsan, Tihana; Mefferd, Adam; Das Mukhopadhyay, Chitrangada; Cook, Charles H.; Jonjic, Stipan; Trgovcich, Joanne
2013-01-01
Major gaps in our knowledge of pathogen genes and how these gene products interact with host gene products to cause disease represent a major obstacle to progress in vaccine and antiviral drug development for the herpesviruses. To begin to bridge these gaps, we conducted a dual analysis of Murine Cytomegalovirus (MCMV) and host cell transcriptomes during lytic infection. We analyzed the MCMV transcriptome during lytic infection using both classical cDNA cloning and sequencing of viral transcripts and next generation sequencing of transcripts (RNA-Seq). We also investigated the host transcriptome using RNA-Seq combined with differential gene expression analysis, biological pathway analysis, and gene ontology analysis. We identify numerous novel spliced and unspliced transcripts of MCMV. Unexpectedly, the most abundantly transcribed viral genes are of unknown function. We found that the most abundant viral transcript, recently identified as a noncoding RNA regulating cellular microRNAs, also codes for a novel protein. To our knowledge, this is the first viral transcript that functions both as a noncoding RNA and an mRNA. We also report that lytic infection elicits a profound cellular response in fibroblasts. Highly upregulated and induced host genes included those involved in inflammation and immunity, but also many unexpected transcription factors and host genes related to development and differentiation. Many top downregulated and repressed genes are associated with functions whose roles in infection are obscure, including host long intergenic noncoding RNAs, antisense RNAs or small nucleolar RNAs. Correspondingly, many differentially expressed genes cluster in biological pathways that may shed new light on cytomegalovirus pathogenesis. Together, these findings provide new insights into the molecular warfare at the virus-host interface and suggest new areas of research to advance the understanding and treatment of cytomegalovirus-associated diseases. PMID:24086132
Transcriptome analysis of Jatropha curcas L. flower buds responded to the paclobutrazol treatment.
Seesangboon, Anupharb; Gruneck, Lucsame; Pokawattana, Tittinat; Eungwanichayapant, Prapassorn Damrongkool; Tovaranonte, Jantrararuk; Popluechai, Siam
2018-06-01
Jatropha seeds can be used to produce high-quality biodiesel due to their high oil content. However, Jatropha produces low numbers of female flowers, which limits seed yield. Paclobutrazol (PCB), a plant growth retardant, can increase the number of Jatropha female flowers and the seed yield. However, the underlying mechanisms of flower development after PCB treatment are not well understood. To identify the critical genes associated with flower development, the transcriptome of flower buds following PCB treatment was analyzed. Scanning Electron Microscope (SEM) analysis revealed that the flower developmental stage of PCB-treated and control flower buds was similar. Based on the presence of sex organs, flower buds at 0, 4, and 24 h after treatment were chosen for global transcriptome analysis. In total, 100,597 unigenes were obtained, 174 of which were deemed interesting based on their response to PCB treatment. Our analysis showed that the JcCKX5 and JcTSO1 genes were up-regulated at 4 h, suggesting roles in promoting organogenic capacity and ovule primordia formation in Jatropha. The JcNPGR2, JcMGP2-3, and JcHUA1 genes were down-regulated, indicating that they may contribute to the increased number of female flowers and the increased seed yield. Expression of cell division and cellulose biosynthesis-related genes, including JcGASA3, JcCycB3;1, JcCycP2;1, JcKNAT7, and JcCSLG3, was decreased, which might have caused the compacted inflorescences. This study represents the first report combining SEM-based morphology, qRT-PCR and transcriptome analysis of PCB-treated Jatropha flower buds at different stages of flower development. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Stare, Tjaša; Stare, Katja; Weckwerth, Wolfram; Wienkoop, Stefanie; Gruden, Kristina
2017-07-06
Plant diseases caused by viral infection affect all major crops. Because viruses are obligate intracellular pathogens, chemical control is so far not applied in the field except to control the insect vectors of the viruses. Understanding the molecular responses of plant immunity is therefore economically important, guiding the enforcement of crop resistance. To disentangle the complex regulatory mechanisms of plant immune responses, understanding the system as a whole is a must. However, integrating data from different molecular analyses (transcriptomics, proteomics, metabolomics, smallRNA regulation etc.) is not straightforward. We evaluated the response of potato (Solanum tuberosum L.) following infection with potato virus Y (PVY). The response was analyzed on two molecular levels, with microarray transcriptome analysis and mass spectrometry-based proteomics. Within this report, we performed a detailed analysis of the results on both levels and compared two different approaches for the analysis of proteomic data (spectral count versus MaxQuant). To link the data on the different molecular levels, each protein was mapped to the corresponding potato transcript according to StNIB paralogue grouping. Only 33% of the proteins mapped to microarray probes in a one-to-one relation, and many additionally showed discordance between the detected levels of proteins and their corresponding transcripts. We discuss the functional importance of true biological differences between both levels and show that the discordance between transcript and protein abundance lies partly in the complexity and structure of the biological regulation of proteome and transcriptome and partly in technical issues.
Stare, Tjaša; Stare, Katja; Weckwerth, Wolfram; Wienkoop, Stefanie
2017-01-01
Plant diseases caused by viral infection affect all major crops. Because viruses are obligate intracellular pathogens, chemical control is so far not applied in the field except to control the insect vectors of the viruses. Understanding the molecular responses of plant immunity is therefore economically important, guiding the enforcement of crop resistance. To disentangle the complex regulatory mechanisms of plant immune responses, understanding the system as a whole is a must. However, integrating data from different molecular analyses (transcriptomics, proteomics, metabolomics, smallRNA regulation etc.) is not straightforward. We evaluated the response of potato (Solanum tuberosum L.) following infection with potato virus Y (PVY). The response was analyzed on two molecular levels, with microarray transcriptome analysis and mass spectrometry-based proteomics. Within this report, we performed a detailed analysis of the results on both levels and compared two different approaches for the analysis of proteomic data (spectral count versus MaxQuant). To link the data on the different molecular levels, each protein was mapped to the corresponding potato transcript according to StNIB paralogue grouping. Only 33% of the proteins mapped to microarray probes in a one-to-one relation, and many additionally showed discordance between the detected levels of proteins and their corresponding transcripts. We discuss the functional importance of true biological differences between both levels and show that the discordance between transcript and protein abundance lies partly in the complexity and structure of the biological regulation of proteome and transcriptome and partly in technical issues. PMID:28684682
Schäpe, Paul; Müller-Hagen, Dirk; Ouedraogo, Jean-Paul; Heiderich, Caroline; Jedamzick, Johanna; van den Hondel, Cees A.; Ram, Arthur F.; Meyer, Vera
2016-01-01
Understanding the genetic, molecular and evolutionary basis of cysteine-stabilized antifungal proteins (AFPs) from fungi is important for understanding whether their function is mainly defensive or associated with fungal growth and development. In the current study, a transcriptome meta-analysis of the Aspergillus niger γ-core protein AnAFP was performed to explore co-expressed genes and pathways, based on independent expression profiling microarrays covering 155 distinct cultivation conditions. This analysis uncovered that anafp displays a highly coordinated temporal and spatial transcriptional profile which is concomitant with key nutritional and developmental processes. Its expression profile coincides with early starvation response and parallels with genes involved in nutrient mobilization and autophagy. Using fluorescence- and luciferase reporter strains we demonstrated that the anafp promoter is active in highly vacuolated compartments and foraging hyphal cells during carbon starvation with CreA and FlbA, but not BrlA, as most likely regulators of anafp. A co-expression network analysis supported by luciferase-based reporter assays uncovered that anafp expression is embedded in several cellular processes including allorecognition, osmotic and oxidative stress survival, development, secondary metabolism and autophagy, and predicted StuA and VelC as additional regulators. The transcriptomic resources available for A. niger provide unparalleled resources to investigate the function of proteins. Our work illustrates how transcriptomic meta-analyses can lead to hypotheses regarding protein function and predict a role for AnAFP during slow growth, allorecognition, asexual development and nutrient recycling of A. niger and propose that it interacts with the autophagic machinery to enable these processes. PMID:27835655
Paege, Norman; Jung, Sascha; Schäpe, Paul; Müller-Hagen, Dirk; Ouedraogo, Jean-Paul; Heiderich, Caroline; Jedamzick, Johanna; Nitsche, Benjamin M; van den Hondel, Cees A; Ram, Arthur F; Meyer, Vera
2016-01-01
Understanding the genetic, molecular and evolutionary basis of cysteine-stabilized antifungal proteins (AFPs) from fungi is important for understanding whether their function is mainly defensive or associated with fungal growth and development. In the current study, a transcriptome meta-analysis of the Aspergillus niger γ-core protein AnAFP was performed to explore co-expressed genes and pathways, based on independent expression profiling microarrays covering 155 distinct cultivation conditions. This analysis uncovered that anafp displays a highly coordinated temporal and spatial transcriptional profile which is concomitant with key nutritional and developmental processes. Its expression profile coincides with early starvation response and parallels with genes involved in nutrient mobilization and autophagy. Using fluorescence- and luciferase reporter strains we demonstrated that the anafp promoter is active in highly vacuolated compartments and foraging hyphal cells during carbon starvation with CreA and FlbA, but not BrlA, as most likely regulators of anafp. A co-expression network analysis supported by luciferase-based reporter assays uncovered that anafp expression is embedded in several cellular processes including allorecognition, osmotic and oxidative stress survival, development, secondary metabolism and autophagy, and predicted StuA and VelC as additional regulators. The transcriptomic resources available for A. niger provide unparalleled resources to investigate the function of proteins. Our work illustrates how transcriptomic meta-analyses can lead to hypotheses regarding protein function and predict a role for AnAFP during slow growth, allorecognition, asexual development and nutrient recycling of A. niger and propose that it interacts with the autophagic machinery to enable these processes.
Liu, S; Liu, L; Tang, Y; Xiong, S; Long, J; Liu, Z; Tian, N
2017-07-01
The regulatory mechanism of flavonoids, which synergise anti-malarial and anti-cancer compounds in Artemisia annua, is still unclear. In this study, an anthocyanidin-accumulating mutant callus was induced from A. annua and comparative transcriptomic analysis of wild-type and mutant calli performed, based on the next-generation Illumina/Solexa sequencing platform and de novo assembly. A total of 82,393 unigenes were obtained and 34,764 unigenes were annotated in the public database. Among these, 87 unigenes were assigned to 14 structural genes involved in the flavonoid biosynthetic pathway and 37 unigenes were assigned to 17 structural genes related to metabolism of flavonoids. More than 30 unigenes were assigned to regulatory genes, including R2R3-MYB, bHLH and WD40, which might regulate flavonoid biosynthesis. A further 29 unigenes encoding flavonoid biosynthetic enzymes or transcription factors were up-regulated in the mutant, while 19 unigenes were down-regulated, compared with the wild type. Expression levels of nine genes involved in the flavonoid pathway were compared using semi-quantitative RT-PCR, and results were consistent with comparative transcriptomic analysis. Finally, a putative flavonol synthase gene (AaFLS1) was identified from enzyme assay in vitro and in vivo through heterogeneous expression, and confirmed comparative transcriptomic analysis of wild-type and mutant callus. The present work has provided important target genes for the regulation of flavonoid biosynthesis in A. annua. © 2017 German Botanical Society and The Royal Botanical Society of the Netherlands.
Variant discovery in the sheep milk transcriptome using RNA sequencing.
Suárez-Vega, Aroa; Gutiérrez-Gil, Beatriz; Klopp, Christophe; Tosser-Klopp, Gwenola; Arranz, Juan José
2017-02-15
The identification of genetic variation underlying desired phenotypes is one of the main challenges of current livestock genetic research. High-throughput transcriptome sequencing (RNA-Seq) offers new opportunities for the detection of transcriptome variants (SNPs and short indels) in different tissues and species. In this study, we used RNA-Seq on Milk Sheep Somatic Cells (MSCs) with the goal of characterizing the genetic variation within the coding regions of the milk transcriptome in Churra and Assaf sheep, two common dairy sheep breeds farmed in Spain. A total of 216,637 variants were detected in the MSCs transcriptome of the eight ewes analyzed. Among them, a total of 57,795 variants were detected in the regions harboring Quantitative Trait Loci (QTL) for milk yield, protein percentage and fat percentage, of which 21.44% were novel variants. Among the total variants detected, 561 (2.52%) and 1,649 (7.42%) were predicted to produce high or moderate impact changes in the corresponding transcriptional unit, respectively. In the functional enrichment analysis of the genes positioned within selected QTL regions harboring novel relevant functional variants (high and moderate impact), the KEGG pathway with the highest enrichment was "protein processing in endoplasmic reticulum". Additionally, a total of 504 and 1,063 variants were identified in the genes encoding principal milk proteins and molecules involved in the lipid metabolism, respectively. Of these variants, 20 mutations were found to have putative relevant effects on the encoded proteins. We present herein the first transcriptomic approach aimed at identifying genetic variants of the genes expressed in the lactating mammary gland of sheep. Through the transcriptome analysis of variability within regions harboring QTL for milk yield, protein percentage and fat percentage, we have found several pathways and genes that harbor mutations that could affect dairy production traits. 
Moreover, remarkable variants were also found in candidate genes coding for major milk proteins and proteins related to milk fat metabolism. Several of the SNPs found in this study could be included as suitable markers in genotyping platforms or custom SNP arrays to perform association analyses in commercial populations and apply genomic selection protocols in the dairy production industry.
Miao, Yuanyuan; Zhu, Zaibiao; Guo, Qiaosheng; Zhu, Yunhao; Yang, Xiaohua; Sun, Yuan
2016-01-01
Tulipa edulis (Miq.) Baker is an important medicinal plant with a variety of anti-cancer properties. The stolon is one of the main asexual reproductive organs of T. edulis and possesses a unique morphology. To explore the molecular mechanism of stolon formation, we performed an RNA-seq analysis of the transcriptomes of stolons at three developmental stages. In the present study, 15.49 Gb of raw data were generated and assembled into 74,006 unigenes, and a total of 2,811 simple sequence repeats were detected in T. edulis. Among the three libraries of stolons at different developmental stages, there were 5,119 differentially expressed genes (DEGs). A functional annotation analysis based on sequence similarity queries of the GO, COG, KEGG databases showed that these DEGs were mainly involved in many physiological and biochemical processes, such as material and energy metabolism, hormone signaling, cell growth, and transcription regulation. In addition, quantitative real-time PCR analysis revealed that the expression patterns of the DEGs were consistent with the transcriptome data, which further supported a role for the DEGs in stolon formation. This study provides novel resources for future genetic and molecular studies in T. edulis. PMID:27064558
Amano, Ikuko; Kitajima, Sakihito; Suzuki, Hideyuki; Koeduka, Takao
2018-01-01
The biosynthesis of plant secondary metabolites is associated with morphological and metabolic differentiation. As a consequence, gene expression profiles can change drastically, and primary and secondary metabolites, including intermediate and end-products, move dynamically within and between cells. However, little is known about the molecular mechanisms underlying differentiation and transport mechanisms. In this study, we performed a transcriptome analysis of Petunia axillaris subsp. parodii, which produces various volatiles in its corolla limbs and emits metabolites to attract pollinators. RNA-sequencing from leaves, buds, and limbs identified 53,243 unigenes. Analysis of differentially expressed genes, combined with gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses, showed that many biological processes were highly enriched in limbs. These included catabolic processes and signaling pathways of hormones, such as gibberellins, and metabolic pathways, including phenylpropanoids and fatty acids. Moreover, we identified five transporter genes that showed high expression in limbs, and we performed spatiotemporal expression analyses and homology searches to infer their putative functions. Our systematic analysis provides comprehensive transcriptomic information regarding morphological differentiation and metabolite transport in the Petunia flower and lays the foundation for establishing the specific mechanisms that control secondary metabolite biosynthesis in plants. PMID:29902274
Transcriptome Analysis and Development of SSR Molecular Markers in Glycyrrhiza uralensis Fisch.
Liu, Yaling; Zhang, Pengfei; Song, Meiling; Hou, Junling; Qing, Mei; Wang, Wenquan; Liu, Chunsheng
2015-01-01
Licorice is an important traditional Chinese medicine with clinical and industrial applications. Genetic resources of licorice are insufficient for analysis of molecular biology and genetic functions; as such, transcriptome sequencing must be conducted for functional characterization and development of molecular markers. In this study, transcriptome sequencing on the Illumina HiSeq 2500 sequencing platform generated a total of 5.41 Gb of clean data. De novo assembly yielded a total of 46,641 unigenes. Comparison analysis using BLAST showed that the annotations of 29,614 unigenes were conserved. Further study revealed 773 genes related to biosynthesis of secondary metabolites of licorice, 40 genes involved in biosynthesis of the terpenoid backbone, and 16 genes associated with biosynthesis of glycyrrhizic acid. Analysis of unigenes larger than 1 Kb with a length of 11,702 nt presented 7,032 simple sequence repeats (SSR). Sixty-four of 69 randomly designed and synthesized SSR primer pairs were successfully amplified, and 33 primer pairs were polymorphic in Glycyrrhiza uralensis Fisch., Glycyrrhiza inflata Bat., Glycyrrhiza glabra L. and Glycyrrhiza pallidiflora Maxim. This study not only presents the molecular biology data of licorice but also provides a basis for genetic diversity research and molecular marker-assisted breeding of licorice. PMID:26571372
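As an illustration of how simple sequence repeats of the kind counted above can be located, the following Python sketch scans a nucleotide sequence for perfect di- to hexanucleotide repeats with a regular expression. The function name `find_ssrs`, the minimum repeat counts, and the example sequence are assumptions made for this sketch; the abstract does not state which SSR-detection tool or thresholds were used.

```python
import re

# Hypothetical thresholds (assumed, not taken from the abstract):
# motif length -> minimum number of tandem copies to call an SSR.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n-1,} demands at least n-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. ATAT = AT x 2),
            # so each repeat is reported once, under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence: an (AT)7 dinucleotide repeat and a (GAA)5 trinucleotide repeat.
example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "TT"
ssrs = find_ssrs(example)
```

Reporting each repeat under its shortest motif avoids counting the same locus twice, which matters when SSR totals are compared across studies that use different thresholds.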
Giustacchini, Alice; Thongjuea, Supat; Barkas, Nikolaos; Woll, Petter S; Povinelli, Benjamin J; Booth, Christopher A G; Sopp, Paul; Norfo, Ruggiero; Rodriguez-Meira, Alba; Ashley, Neil; Jamieson, Lauren; Vyas, Paresh; Anderson, Kristina; Segerstolpe, Åsa; Qian, Hong; Olsson-Strömberg, Ulla; Mustjoki, Satu; Sandberg, Rickard; Jacobsen, Sten Eirik W; Mead, Adam J
2017-06-01
Recent advances in single-cell transcriptomics are ideally placed to unravel intratumoral heterogeneity and selective resistance of cancer stem cell (SC) subpopulations to molecularly targeted cancer therapies. However, current single-cell RNA-sequencing approaches lack the sensitivity required to reliably detect somatic mutations. We developed a method that combines high-sensitivity mutation detection with whole-transcriptome analysis of the same single cell. We applied this technique to analyze more than 2,000 SCs from patients with chronic myeloid leukemia (CML) throughout the disease course, revealing heterogeneity of CML-SCs, including the identification of a subgroup of CML-SCs with a distinct molecular signature that selectively persisted during prolonged therapy. Analysis of nonleukemic SCs from patients with CML also provided new insights into cell-extrinsic disruption of hematopoiesis in CML associated with clinical outcome. Furthermore, we used this single-cell approach to identify a blast-crisis-specific SC population, which was also present in a subclone of CML-SCs during the chronic phase in a patient who subsequently developed blast crisis. This approach, which might be broadly applied to any malignancy, illustrates how single-cell analysis can identify subpopulations of therapy-resistant SCs that are not apparent through cell-population analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lockhart, Ainsley; Zvenigorodsky, Natasha; Pedraza, Mary Ann
2011-08-11
The biosynthesis of chlorophyll and other tetrapyrroles is a vital but poorly understood process. Recent genomic advances with the unicellular green alga Chlamydomonas reinhardtii have created an opportunity to more closely examine the mechanisms of the chlorophyll biosynthesis pathway via transcriptome analysis. Manganese is a nutrient of interest for complex reactions because of its multiple stable oxidation states and role in molecular oxygen coordination. C. reinhardtii was cultured in Manganese-deplete Tris-acetate-phosphate (TAP) media for 24 hours and used to create cDNA libraries for sequencing using Illumina TruSeq technology. Transcriptome analysis provided intriguing insight on possible regulatory mechanisms in the pathway. Evidence supports similarities of GTR (Glutamyl-tRNA synthase) to its Chlorella vulgaris homolog in terms of Mn requirements. Data was also suggestive of Mn-related compensatory up-regulation for pathway proteins CHLH1 (Manganese Chelatase), GUN4 (Magnesium chelatase activating protein), and POR1 (Light-dependent protochlorophyllide reductase). Intriguingly, data suggests possible reciprocal expression of oxygen dependent CPX1 (coproporphyrinogen III oxidase) and oxygen independent CPX2. Further analysis using RT-PCR could provide compelling evidence for several novel regulatory mechanisms in the chlorophyll biosynthesis pathway.
Baldrian, Petr; López-Mondéjar, Rubén
2014-02-01
Molecular methods for the analysis of biomolecules have undergone rapid technological development in the last decade. The advent of next-generation sequencing methods and improvements in instrumental resolution enabled the analysis of complex transcriptome, proteome and metabolome data, as well as a detailed annotation of microbial genomes. The mechanisms of decomposition by model fungi have been described in unprecedented detail by the combination of genome sequencing, transcriptomics and proteomics. The increasing number of available genomes for fungi and bacteria shows that the genetic potential for decomposition of organic matter is widespread among taxonomically diverse microbial taxa, while expression studies document the importance of the regulation of expression in decomposition efficiency. Importantly, high-throughput methods of nucleic acid analysis used for the analysis of metagenomes and metatranscriptomes indicate the high diversity of decomposer communities in natural habitats and their taxonomic composition. Today, the metaproteomics of natural habitats is of interest. In combination with advanced analytical techniques to explore the products of decomposition and the accumulation of information on the genomes of environmentally relevant microorganisms, advanced methods in microbial ecophysiology should increase our understanding of the complex processes of organic matter transformation.
Transcriptomic profiling as a screening tool to detect trenbolone treatment in beef cattle.
Pegolo, S; Cannizzo, F T; Biolatti, B; Castagnaro, M; Bargelloni, L
2014-06-01
The effects of steroid hormone implants containing trenbolone alone (Finaplix-H), combined with 17β-oestradiol (17β-E; Revalor-H), or with 17β-E and dexamethasone (Revalor-H plus dexamethasone per os) on the bovine muscle transcriptome were examined by DNA-microarray. Overall, large sets of genes were shown to be modulated by the different growth promoters (GPs) and the regulated pathways and biological processes were mostly shared among the treatment groups. Using the Prediction Analysis of Microarray program, GP-treated animals were accurately identified by a small number of predictive genes. A meta-analysis approach was also carried out for the Revalor group to potentially increase the robustness of class prediction analysis. After data pre-processing, a high level of accuracy (90%) was obtained in the classification of samples, using 105 predictive gene markers. Transcriptomics could thus help in the identification of indirect biomarkers for anabolic treatment in beef cattle to be applied for the screening of muscle samples collected after slaughtering. Copyright © 2014 Elsevier Ltd. All rights reserved.
Time-series analysis of the transcriptome and proteome of Escherichia coli upon glucose repression.
Borirak, Orawan; Rolfe, Matthew D; de Koning, Leo J; Hoefsloot, Huub C J; Bekker, Martijn; Dekker, Henk L; Roseboom, Winfried; Green, Jeffrey; de Koster, Chris G; Hellingwerf, Klaas J
2015-10-01
Time-series transcript- and protein-profiles were measured upon initiation of carbon catabolite repression in Escherichia coli, in order to investigate the extent of post-transcriptional control in this prototypical response. A glucose-limited chemostat culture was used as the CCR-free reference condition. Stopping the pump and simultaneously adding a pulse of glucose, that saturated the cells for at least 1h, was used to initiate the glucose response. Samples were collected and subjected to quantitative time-series analysis of both the transcriptome (using microarray analysis) and the proteome (through a combination of 15N-metabolic labeling and mass spectrometry). Changes in the transcriptome and corresponding proteome were analyzed using statistical procedures designed specifically for time-series data. By comparison of the two sets of data, a total of 96 genes were identified that are post-transcriptionally regulated. This gene list provides candidates for future in-depth investigation of the molecular mechanisms involved in post-transcriptional regulation during carbon catabolite repression in E. coli, like the involvement of small RNAs. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Barbé, Caroline; Bray, Fabrice; Gueugneau, Marine; Devassine, Stéphanie; Lause, Pascale; Tokarski, Caroline; Rolando, Christian; Thissen, Jean-Paul
2017-10-06
Skeletal muscle, the most abundant body tissue, plays vital roles in locomotion and metabolism. Myostatin is a negative regulator of skeletal muscle mass. In addition to increasing muscle mass, Myostatin inhibition impacts muscle contractility and energy metabolism. To decipher the mechanisms of action of the Myostatin inhibitors, we used proteomic and transcriptomic approaches to investigate the changes induced in skeletal muscles of transgenic mice overexpressing Follistatin, a physiological Myostatin inhibitor. Our proteomic workflow included a fractionation step to identify weakly expressed proteins and a comparison of fast versus slow muscles. Functional annotation of altered proteins supports the phenotypic changes induced by Myostatin inhibition, including modifications in energy metabolism, fiber type, insulin and calcium signaling, as well as membrane repair and regeneration. Less than 10% of the differentially expressed proteins were found to be also regulated at the mRNA level but the Biological Process annotation, and the KEGG pathways analysis of transcriptomic results shows a great concordance with the proteomic data. Thus this study describes the most extensive omics analysis of muscle overexpressing Follistatin, providing molecular-level insights to explain the observed muscle phenotypic changes.
Bioinformatics analysis of transcriptome dynamics during growth in angus cattle longissimus muscle.
Moisá, Sonia J; Shike, Daniel W; Graugnard, Daniel E; Rodriguez-Zas, Sandra L; Everts, Robin E; Lewin, Harris A; Faulkner, Dan B; Berger, Larry L; Loor, Juan J
2013-01-01
Transcriptome dynamics in the longissimus muscle (LM) of young Angus cattle were evaluated at 0, 60, 120, and 220 days from early-weaning. Bioinformatic analysis was performed using the dynamic impact approach (DIA) by means of Kyoto Encyclopedia of Genes and Genomes (KEGG) and Database for Annotation, Visualization and Integrated Discovery (DAVID) databases. Between 0 and 120 days (growing phase) most of the highly-impacted pathways (eg, ascorbate and aldarate metabolism, drug metabolism, cytochrome P450 and Retinol metabolism) were inhibited. The phase between 120 and 220 days (finishing phase) was characterized by the most striking differences with 3,784 differentially expressed genes (DEGs). Analysis of those DEGs revealed that the most impacted KEGG canonical pathway was glycosylphosphatidylinositol (GPI)-anchor biosynthesis, which was inhibited. Furthermore, inhibition of calpastatin and activation of tyrosine aminotransferase ubiquitination at 220 days promotes proteasomal degradation, while the concurrent activation of ribosomal proteins promotes protein synthesis. Therefore, the balance of these processes likely results in a steady-state of protein turnover during the finishing phase. Results underscore the importance of transcriptome dynamics in LM during growth.
Transcriptome Analysis of PA Gain and Loss of Function Mutants.
Marco, Francisco; Carrasco, Pedro
2018-01-01
Functional genomics has become a forefront methodology for plant science thanks to the widespread development of microarray technology. While technical difficulties associated with the process of obtaining raw expression data have been diminishing, allowing the appearance of tremendous amounts of transcriptome data in different databases, a common problem using "omic" technologies remains: the interpretation of these data and the inference of their biological meaning. To assist with this complex task, a wide variety of software tools have been developed. In this chapter we describe our current workflow for applying some of these analyses. We have used it to compare the transcriptomes of plants with differences in their polyamine levels.
Safikhani, Zhaleh; Sadeghi, Mehdi; Pezeshk, Hamid; Eslahchi, Changiz
2013-01-01
Recent advances in the sequencing technologies have provided a handful of RNA-seq datasets for transcriptome analysis. However, reconstruction of full-length isoforms and estimation of the expression level of transcripts with a low cost are challenging tasks. We propose a novel de novo method named SSP that incorporates interval integer linear programming to resolve alternatively spliced isoforms and reconstruct the whole transcriptome from short reads. Experimental results show that SSP is fast and precise in determining different alternatively spliced isoforms along with the estimation of reconstructed transcript abundances. The SSP software package is available at http://www.bioinf.cs.ipm.ir/software/ssp. © 2013.
Guo, Yang; Townsend, Richard; Tsoi, Lam C
2017-01-01
In the past decade, high-throughput techniques have facilitated "-omics" research. Transcriptomic study, for instance, has advanced our understanding of the expression landscape of different human diseases and cellular mechanisms. The National Center for Biotechnology Information (NCBI) established the Gene Expression Omnibus (GEO) to promote the sharing of transcriptomic data to facilitate biomedical research. In this chapter, we will illustrate how to use GEO to search and analyze the publicly available transcriptomic data, and we will provide an easy-to-follow protocol for researchers to mine the powerful resources in GEO and retrieve relevant information that can be valuable for fibrosis research.
Kel, Alexander E
2017-02-01
Computational analysis of master regulators through the search for transcription factor binding sites followed by analysis of signal transduction networks of a cell is a new approach of causal analysis of multi-omics data. This paper contains results on analysis of multi-omics data that include transcriptomics, proteomics and epigenomics data of a methotrexate (MTX) resistant colon cancer cell line. The data were used for analysis of mechanisms of resistance and for prediction of potential drug targets and promising compounds for reverting the MTX resistance of these cancer cells. We present all results of the analysis including the lists of identified transcription factors and their binding sites in the genome and the list of predicted master regulators - potential drug targets. These data were generated in the study recently published in the article "Multi-omics "Upstream Analysis" of regulatory genomic regions helps identifying targets against methotrexate resistance of colon cancer" (Kel et al., 2016) [4]. These data are of interest for researchers from the field of multi-omics data analysis and for biologists who are interested in identification of novel drug targets against MTX resistance.
Estimating the efficiency of fish cross-species cDNA microarray hybridization.
Cohen, Raphael; Chalifa-Caspi, Vered; Williams, Timothy D; Auslander, Meirav; George, Stephen G; Chipman, James K; Tom, Moshe
2007-01-01
Using an available cross-species cDNA microarray is advantageous for examining multigene expression patterns in non-model organisms, saving the need for construction of species-specific arrays. The aim of the present study was to estimate relative efficiency of cross-species hybridizations across bony fishes, using bioinformatics tools. The methodology may serve also as a model for similar evaluations in other taxa. The theoretical evaluation was done by substituting comparative whole-transcriptome sequence similarity information into the thermodynamic hybridization equation. Complementary DNA sequence assemblages of nine fish species belonging to common families or suborders and distributed across the bony fish taxonomic branch were selected for transcriptome-wise comparisons. Actual cross-species hybridizations among fish of different taxonomic distances were used to validate and eventually to calibrate the theoretically computed relative efficiencies.
The developmental transcriptome atlas of the spoon worm Urechis unicinctus (Echiurida: Annelida).
Park, Chungoo; Han, Yong-Hee; Lee, Sung-Gwon; Ry, Kyoung-Bin; Oh, Jooseong; Kern, Elizabeth M A; Park, Joong-Ki; Cho, Sung-Jin
2018-03-01
Echiurida is one of the most intriguing major subgroups of Annelida because, unlike most other annelids, echiurids lack metameric body segmentation as adults. For this reason, transcriptome analyses from various developmental stages of echiurid species can be of substantial value for understanding precise expression levels and the complex regulatory networks during early and larval development. A total of 914 million raw RNA-Seq reads were produced from 14 developmental stages of Urechis unicinctus and were de novo assembled into contigs spanning 63,928,225 bp with an N50 length of 2700 bp. The resulting comprehensive transcriptome database of the early developmental stages of U. unicinctus consists of 20,305 representative functional protein-coding transcripts. Approximately 66% of unigenes were assigned to superphylum-level taxa, including Lophotrochozoa (40%). The completeness of the transcriptome assembly was assessed using benchmarking universal single-copy orthologs; 75.7% of the single-copy orthologs were present in our transcriptome database. We observed 3 distinct patterns of global transcriptome profiles from 14 developmental stages and identified 12,705 genes that showed dynamic regulation patterns during the differentiation and maturation of U. unicinctus cells. We present the first large-scale developmental transcriptome dataset of U. unicinctus and provide a general overview of the dynamics of global gene expression changes during its early developmental stages. The analysis of time-course gene expression data is a first step toward understanding the complex developmental gene regulatory networks in U. unicinctus and will furnish a valuable resource for analyzing the functions of gene repertoires in various developmental phases.
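The N50 length quoted above is a standard assembly summary statistic: the contig length L such that contigs of length L or longer together contain at least half of all assembled bases. A minimal Python sketch of that definition follows; the function name `n50` and the toy contig lengths are assumptions for illustration and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # The first contig that brings the running sum to half the total is the N50.
        if running * 2 >= total:
            return length
    raise ValueError("n50() requires at least one contig length")


# Toy example (hypothetical lengths): total = 54 bp, half = 27 bp.
# 10 + 9 + 8 = 27 reaches the halfway point, so the N50 is 8.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Because N50 depends only on the length distribution, it says nothing about correctness or completeness, which is why the abstract reports a separate ortholog-based completeness check.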
Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai
2014-01-01
The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained through sequencing on the RNA from larvae at mysis stage and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio for transition to transversion was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for the high density linkage map construction and genome-wide association studies. PMID:24498047
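The two summary statistics in this abstract follow from simple arithmetic: one SNP per 476 bp corresponds to a frequency of about 0.21%, and the transition/transversion ratio counts purine-purine or pyrimidine-pyrimidine changes against all other substitutions. The Python sketch below illustrates both; the function names and the toy substitution list are assumptions for illustration, not the authors' pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def snp_frequency_percent(bp_per_snp):
    """Convert 'one SNP per N bp' into a percentage of sites."""
    return 100.0 / bp_per_snp


def is_transition(ref, alt):
    """True for A<->G or C<->T substitutions, False for transversions."""
    pair = {ref.upper(), alt.upper()}
    return pair == PURINES or pair == PYRIMIDINES


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions over (ref, alt) pairs."""
    ts = sum(1 for ref, alt in substitutions if is_transition(ref, alt))
    tv = len(substitutions) - ts
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ts / tv


# One SNP per 476 bp, as reported, rounds to 0.21%.
freq = round(snp_frequency_percent(476), 2)

# Hypothetical toy calls: two transitions and one transversion give a ratio of 2.0.
ratio = ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")])
```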
Expanding frontiers in plant transcriptomics in aid of functional genomics and molecular breeding.
Agarwal, Pinky; Parida, Swarup K; Mahto, Arunima; Das, Sweta; Mathew, Iny Elizebeth; Malik, Naveen; Tyagi, Akhilesh K
2014-12-01
The transcript pool of a plant part, under any given condition, is a collection of mRNAs that will pave the way for a biochemical reaction of the plant to stimuli. Over the past decades, transcriptome study has advanced from Northern blotting to RNA sequencing (RNA-seq), through other techniques, of which real-time quantitative polymerase chain reaction (PCR) and microarray are the most significant ones. The questions being addressed by such studies have also matured from a solitary process to expression atlas and marker-assisted genetic enhancement. Not only genes and their networks involved in various developmental processes of plant parts have been elucidated, but also stress tolerant genes have been highlighted. The transcriptome of a plant with altered expression of a target gene has given information about the downstream genes. Marker information has been used for breeding improved varieties. Fortunately, the data generated by transcriptome analysis has been made freely available for ample utilization and comparison. The review discusses this wide variety of transcriptome data being generated in plants, which includes developmental stages, abiotic and biotic stress, effect of altered gene expression, as well as comparative transcriptomics, with a special emphasis on microarray and RNA-seq. Such data can be used to determine the regulatory gene networks, which can subsequently be utilized for generating improved plant varieties. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Comparative whole genome transcriptome and metabolome analyses of five Klebsiella pneumonia strains.
Lee, Soojin; Kim, Borim; Yang, Jeongmo; Jeong, Daun; Park, Soohyun; Shin, Sang Heum; Kook, Jun Ho; Yang, Kap-Seok; Lee, Jinwon
2015-11-01
The integration of transcriptomics and metabolomics can provide precise information on gene-to-metabolite networks for identifying the function of novel genes. The goal of this study was to identify novel gene functions involved in 2,3-butanediol (2,3-BDO) biosynthesis by a comprehensive analysis of the transcriptome and metabolome of five mutated Klebsiella pneumoniae strains (∆wabG = SGSB100, ∆wabG∆budA = SGSB106, ∆wabG∆budB = SGSB107, ∆wabG∆budC = SGSB108, ∆wabG∆budABC = SGSB109). First, the transcriptomes of all five mutants were analyzed and the genes exhibiting reproducible changes in expression were determined. The transcriptome was well conserved among the five strains, and differences in gene expression occurred mainly in genes coding for 2,3-BDO biosynthesis (budA, budB, and budC) and the genes involved in the degradation of reactive oxygen, biosynthesis and transport of arginine, cysteine biosynthesis, sulfur metabolism, oxidoreductase reaction, and formate dehydrogenase reaction. Second, differences in the metabolome (estimated by carbon distribution, CO2 emission, and redox balance) among the five mutant strains due to gene alteration of the 2,3-BDO operon were detected. The functional genomics approach integrating metabolomics and transcriptomics in K. pneumoniae presented here provides an innovative means of identifying novel gene functions involved in 2,3-BDO biosynthesis metabolism and whole cell metabolism.
Gao, Bei; Li, Xiaoshuang; Zhang, Daoyuan; Liang, Yuqing; Yang, Honglan; Chen, Moxian; Zhang, Yuanming; Zhang, Jianhua; Wood, Andrew J
2017-08-08
The desiccation tolerant bryophyte Bryum argenteum is an important component of desert biological soil crusts (BSCs) and is emerging as a model system for studying vegetative desiccation tolerance. Here we present and analyze the hydration-dehydration-rehydration transcriptomes in B. argenteum to establish a desiccation-tolerance transcriptomic atlas. B. argenteum gametophores representing five different hydration stages (hydrated (H0), dehydrated for 2 h (D2), 24 h (D24), then rehydrated for 2 h (R2) and 48 h (R48)), were sampled for transcriptome analyses. Illumina high throughput RNA-Seq technology was employed and generated more than 488.46 million reads. An in-house de novo transcriptome assembly optimization pipeline based on Trinity assembler was developed to obtain a reference Hydration-Dehydration-Rehydration (H-D-R) transcriptome comprising of 76,206 transcripts, with an N50 of 2,016 bp and average length of 1,222 bp. Comprehensive transcription factor (TF) annotation discovered 978 TFs in 62 families, among which 404 TFs within 40 families were differentially expressed upon dehydration-rehydration. Pfam term enrichment analysis revealed 172 protein families/domains were significantly associated with the H-D-R cycle and confirmed early rehydration (i.e. the R2 stage) as exhibiting the maximum stress-induced changes in gene expression.
Li, Yukuo; Fang, Jinbao; Qi, Xiujuan; Lin, Miaomiao; Zhong, Yunpeng; Sun, Leiming; Cui, Wen
2018-05-15
To assess the interrelation between the change of metabolites and the change of fruit color, we performed a combined metabolome and transcriptome analysis of the flesh in two different Actinidia arguta cultivars: "HB" ("Hongbaoshixing") and "YF" ("Yongfengyihao") at two different fruit developmental stages: 70d (days after full bloom) and 100d (days after full bloom). Metabolite and transcript profiling was obtained by ultra-performance liquid chromatography quadrupole time-of-flight tandem mass spectrometer and high-throughput RNA sequencing, respectively. The identification and quantification results of metabolites showed that a total of 28,837 metabolites had been obtained, of which 13,715 were annotated. In comparison of HB100 vs. HB70, 41 metabolites were identified as being flavonoids, 7 of which, with significant difference, were identified as bracteatin, luteolin, dihydromyricetin, cyanidin, pelargonidin, delphinidin and (-)-epigallocatechin. Association analysis between metabolome and transcriptome revealed that there were two metabolic pathways presenting significant differences during fruit development, one of which was flavonoid biosynthesis, in which 14 structural genes were selected to conduct expression analysis, as well as 5 transcription factor genes obtained by transcriptome analysis. RT-qPCR results and cluster analysis revealed that AaF3H, AaLDOX, AaUFGT, AaMYB, AabHLH, and AaHB2 showed the best possibility of being candidate genes. A regulatory network of flavonoid biosynthesis was established to illustrate differentially expressed candidate genes involved in accumulation of metabolites with significant differences, inducing red coloring during fruit development. Such a regulatory network linking genes and flavonoids revealed a system involved in the pigmentation of all-red-fleshed and all-green-fleshed A. arguta, suggesting this conjunct analysis approach is not only useful in understanding the relationship between genotype and phenotype, but is also a powerful tool for providing more valuable information for breeding.
Aballai, Víctor; Aedo, Jorge E; Maldonado, Jonathan; Bastias-Molina, Macarena; Silva, Herman; Meneses, Claudio; Boltaña, Sebastian; Reyes, Ariel; Molina, Alfredo; Valdés, Juan Antonio
2017-12-01
Stress is a primary contributing factor of fish disease and mortality in aquaculture. We have previously reported that the red cusk-eel (Genypterus chilensis), an important farmed marine fish, demonstrates a handling-stress response that results in increased juvenile mortality, which is mainly associated with skeletal muscle atrophy and liver steatosis. To better understand the systemic effects of stress on red cusk-eel immune-related gene expression, the present study assessed the transcriptomic head-kidney response to handling-stress. The RNA sequencing generated a total of 61,655,525 paired-end reads from control and stressed conditions. De novo assembly using the CLC Genomic Workbench produced 86,840 transcripts and created a reference transcriptome with an N50 of 1426 bp. Reads mapped onto the assembled reference transcriptome resulted in the identification of 569 up-regulated and 513 down-regulated transcripts. Gene ontology enrichment analysis revealed a significant up-regulation of biological processes such as response to stress, response to biotic stimulus, and immune response. Conversely, significant down-regulation was associated with metabolic processes. These results were validated by RT-qPCR analysis for nine candidate genes involved in the immune response. The present data demonstrated that short-term stress promotes the innate immune response in the marine teleost G. chilensis. This study is an important step towards understanding the adaptive immune response to stress in non-model teleost species. Copyright © 2017 Elsevier Inc. All rights reserved.
Zhang, X J; Jiang, H Y; Li, L M; Yuan, L H; Chen, J P
2016-06-20
The aim of this study was to provide comprehensive insights into the genetic background of sturgeon by transcriptome study. We performed a de novo assembly of the Amur sturgeon Acipenser schrenckii transcriptome using Illumina Hiseq 2000 sequencing. A total of 148,817 non-redundant unigenes with base length of approximately 121,698,536 bp and ranges from 201 to 26,789 bp were obtained. All the unigenes were classified into 3368 distinct categories and 145,449 singletons by homologous transcript cluster analysis. In all, 46,865 (31.49%) unigenes showed homologous matches with Nr database and 32,214 (21.65%) unigenes were matched to Nt database. In total, 24,862 unigenes were categorized into significantly enriched 52 function groups by GO analysis, and 38,436 unigenes were classified into 25 groups by KOG prediction, as well as 128 enriched KEGG pathways were identified by 45,598 unigenes (P < 0.05). Subsequently, a total of 19,860 SSRs markers were identified with the abundant di-nucleotide type (10,658; 53.67%) and the most AT/TA motif repeats (2689; 13.54%). A total of 1341 conserved lncRNAs were identified by a customized pipeline. Our study provides new sequence and function information for A. schrenckii, which will be the basis for further genetic studies on sturgeon species. The huge number of potential SSRs and putatively conserved lncRNAs isolated by the transcriptome also shed light on research in many fields, including the evolution, conservation management, and biological processes in sturgeon.
Tian, Shan; Wang, Bei; Zhao, Xusheng
2017-01-01
Wild jujube (Ziziphus acidojujuba Mill.) is highly tolerant to alkaline, saline and drought stress; however, no studies have performed transcriptome profiling to study the response of wild jujube to these and other abiotic stresses. In this study, we examined the tolerance of wild jujube to NaHCO3-NaOH solution and analyzed gene expression profiles in response to alkaline stress. Physiological experiments revealed that H2O2 content in leaves increased significantly and root activity decreased quickly during alkaline treatment at pH 9.5. For transcriptome analysis, wild jujube plants grown hydroponically were treated with NaHCO3-NaOH solution for 0, 1, and 12 h and six transcriptomes from roots were built. In total, 32,758 genes were generated, and 3,604 differentially expressed genes (DEGs) were identified. After 1 h, 853 genes showed significantly different expression between control and treated plants; after 12 h, expression of 2,856 genes was significantly different. The expression pattern of nine genes was validated by quantitative real-time PCR. After gene annotation and gene ontology enrichment analysis, the genes encoding transcriptional factors, serine/threonine-protein kinases, heat shock proteins, cysteine-like kinases, calmodulin-like proteins, and reactive oxygen species (ROS) scavengers were found to be closely involved in alkaline stress response. These results will provide useful insights for elucidating the mechanisms underlying alkaline tolerance in wild jujube. PMID:28976994
Virtaneva, Kimmo; Porcella, Stephen F; Graham, Morag R; Ireland, Robin M; Johnson, Claire A; Ricklefs, Stacy M; Babar, Imran; Parkins, Larye D; Romero, Romina A; Corn, G Judson; Gardner, Don J; Bailey, John R; Parnell, Michael J; Musser, James M
2005-06-21
Identification of the genetic events that contribute to host-pathogen interactions is important for understanding the natural history of infectious diseases and developing therapeutics. Transcriptome studies conducted on pathogens have been central to this goal in recent years. However, most of these investigations have focused on specific end points or disease phases, rather than analysis of the entire time course of infection. To gain a more complete understanding of how bacterial gene expression changes over time in a primate host, the transcriptome of group A Streptococcus (GAS) was analyzed during an 86-day infection protocol in 20 cynomolgus macaques with experimental pharyngitis. The study used 260 custom Affymetrix (Santa Clara, CA) chips, and data were confirmed by TaqMan analysis. Colonization, acute, and asymptomatic phases of disease were identified. Successful colonization and severe inflammation were significantly correlated with an early onset of superantigen gene expression. The differential expression of two-component regulators covR and spy0680 (M1_spy0874) was significantly associated with GAS colony-forming units, inflammation, and phases of disease. Prophage virulence gene expression and prophage induction occurred predominantly during high pathogen cell densities and acute inflammation. We discovered that temporal changes in the GAS transcriptome were integrally linked to the phase of clinical disease and host-defense response. Knowledge of the gene expression patterns characterizing each phase of pathogen-host interaction provides avenues for targeted investigation of proven and putative virulence factors and genes of unknown function and will assist vaccine research.
Transcriptome characterisation of Pinus tabuliformis and evolution of genes in the Pinus phylogeny
2013-01-01
Background The Chinese pine (Pinus tabuliformis) is an indigenous conifer species in northern China but is relatively underdeveloped as a genomic resource; thus, limiting gene discovery and breeding. Large-scale transcriptome data were obtained using a next-generation sequencing platform to compensate for the lack of P. tabuliformis genomic information. Results The increasing amount of transcriptome data on Pinus provides an excellent resource for multi-gene phylogenetic analysis and studies on how conserved genes and functions are maintained in the face of species divergence. The first P. tabuliformis transcriptome from a normalised cDNA library of multiple tissues and individuals was sequenced in a full 454 GS-FLX run, producing 911,302 sequencing reads. The high quality overlapping expressed sequence tags (ESTs) were assembled into 46,584 putative transcripts, and more than 700 SSRs and 92,000 SNPs/InDels were characterised. Comparative analysis of the transcriptome of six conifer species yielded 191 orthologues, from which we inferred a phylogenetic tree, evolutionary patterns and calculated rates of gene diversion. We also identified 938 fast evolving sequences that may be useful for identifying genes that perhaps evolved in response to positive selection and might be responsible for speciation in the Pinus lineage. Conclusions A large collection of high-quality ESTs was obtained, de novo assembled and characterised, which represents a dramatic expansion of the current transcript catalogues of P. tabuliformis and which will gradually be applied in breeding programs of P. tabuliformis. Furthermore, these data will facilitate future studies of the comparative genomics of P. tabuliformis and other related species. PMID:23597112
Cheng, Bing; Furtado, Agnelo
2017-01-01
Polyploidization contributes to the complexity of gene expression, resulting in numerous related but different transcripts. This study explored the transcriptome diversity and complexity of the tetraploid Arabica coffee (Coffea arabica) bean. Long-read sequencing (LRS) by PacBio Isoform sequencing (Iso-seq) was used to obtain full-length transcripts without the difficulty and uncertainty of assembly required for reads from short-read technologies. The tetraploid transcriptome was annotated and compared with data from the sub-genome progenitors. Caffeine and sucrose genes were targeted for case analysis. An isoform-level tetraploid coffee bean reference transcriptome with 95 995 distinct transcripts (average 3236 bp) was obtained. A total of 88 715 sequences (92.42%) were annotated with BLASTx against NCBI non-redundant plant proteins, including 34 719 high-quality annotations. Further BLASTn analysis against NCBI non-redundant nucleotide sequences, Coffea canephora coding sequences with UTR, C. arabica ESTs, and Rfam left 1213 sequences without hits, which are potential novel genes in coffee. Longer UTRs were captured, especially 5′ UTRs, facilitating the identification of upstream open reading frames. The LRS also revealed more and longer transcript variants in key caffeine and sucrose metabolism genes from this polyploid genome. Long sequences (>10 kb) were poorly annotated. LRS technology shows the limitation of previous studies. It provides an important tool to produce a reference transcriptome including more of the diversity of full-length transcripts to help understand the biology and support the genetic improvement of polyploid species such as coffee. PMID:29048540
Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng
2016-01-01
Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides. PMID:27625674
NASA Astrophysics Data System (ADS)
Zhang, Hui; Zhai, Yuxiu; Yao, Lin; Jiang, Yanhua; Li, Fengling
2017-05-01
Chlamys farreri is an economically important mollusk that can accumulate excessive amounts of cadmium (Cd). Studying the molecular mechanism of Cd accumulation in bivalves is difficult because of the lack of genome background. Transcriptomic analysis based on high-throughput RNA sequencing has been shown to be an efficient and powerful method for the discovery of relevant genes in non-model and genome reference-free organisms. Here, we constructed two cDNA libraries (control and Cd exposure groups) from the digestive gland of C. farreri and compared the transcriptomic data between them. A total of 227 673 transcripts were assembled into 105 071 unigenes, most of which shared high similarity with sequences in the NCBI non-redundant protein database. For functional classification, 24 493 unigenes were assigned to Gene Ontology terms. Additionally, EuKaryotic Ortholog Groups and Kyoto Encyclopedia of Genes and Genomes analyses assigned 12 028 unigenes to 26 categories and 7 849 unigenes to five pathways, respectively. Comparative transcriptomics analysis identified 3 800 unigenes that were differentially expressed in the Cd-treated group compared with the control group. Among them, genes associated with heavy metal accumulation were screened, including metallothionein, divalent metal transporter, and metal tolerance protein. The functional genes and predicted pathways identified in our study will contribute to a better understanding of the metabolic and immune system in the digestive gland of C. farreri. In addition, the transcriptomic data will provide a comprehensive resource that may contribute to the understanding of molecular mechanisms that respond to marine pollutants in bivalves.
Stevens, Rebecca G.; Baldet, Pierre; Bouchet, Jean-Paul; Causse, Mathilde; Deborde, Catherine; Deschodt, Claire; Faurobert, Mireille; Garchery, Cécile; Garcia, Virginie; Gautier, Hélène; Gouble, Barbara; Maucourt, Mickaël; Moing, Annick; Page, David; Petit, Johann; Poëssel, Jean-Luc; Truffault, Vincent; Rothan, Christophe
2018-01-01
Changing the balance between ascorbate, monodehydroascorbate, and dehydroascorbate in plant cells by manipulating the activity of enzymes involved in ascorbate synthesis or recycling of oxidized and reduced forms leads to multiple phenotypes. A systems biology approach including network analysis of the transcriptome, proteome and metabolites of RNAi lines for ascorbate oxidase, monodehydroascorbate reductase and galactonolactone dehydrogenase has been carried out in orange fruit pericarp of tomato (Solanum lycopersicum). The transcriptome of the RNAi ascorbate oxidase lines is inversed compared to the monodehydroascorbate reductase and galactonolactone dehydrogenase lines. Differentially expressed genes are involved in ribosome biogenesis and translation. This transcriptome inversion is also seen in response to different stresses in Arabidopsis. The transcriptome response is not well correlated with the proteome which, with the metabolites, are correlated to the activity of the ascorbate redox enzymes—ascorbate oxidase and monodehydroascorbate reductase. Differentially accumulated proteins include metacaspase, protein disulphide isomerase, chaperone DnaK and carbonic anhydrase and the metabolites chlorogenic acid, dehydroascorbate and alanine. The hub genes identified from the network analysis are involved in signaling, the heat-shock response and ribosome biogenesis. The results from this study therefore reveal one or several putative signals from the ascorbate pool which modify the transcriptional response and elements downstream. PMID:29491875
Grace, Peter M; Hurley, Daniel; Barratt, Daniel T; Tsykin, Anna; Watkins, Linda R; Rolan, Paul E; Hutchinson, Mark R
2012-09-01
A quantitative, peripherally accessible biomarker for neuropathic pain has great potential to improve clinical outcomes. Based on the premise that peripheral and central immunity contribute to neuropathic pain mechanisms, we hypothesized that biomarkers could be identified from the whole blood of adult male rats, by integrating graded chronic constriction injury (CCI), ipsilateral lumbar dorsal quadrant (iLDQ) and whole blood transcriptomes, and pathway analysis with pain behavior. Correlational bioinformatics identified a range of putative biomarker genes for allodynia intensity, many encoding proteins with a recognized role in immune/nociceptive mechanisms. A selection of these genes was validated in a separate replication study. Pathway analysis of the iLDQ transcriptome identified Fcγ and Fcε signaling pathways, among others. This study is the first to employ the whole blood transcriptome to identify pain biomarker panels. The novel correlational bioinformatics, developed here, selected such putative biomarkers based on a correlation with pain behavior and formation of signaling pathways with iLDQ genes. Future studies may demonstrate the predictive ability of these biomarker genes across other models and additional variables.
Transgenerational Epigenetic Programming of the Brain Transcriptome and Anxiety Behavior
Skinner, Michael K.; Anway, Matthew D.; Savenkova, Marina I.; Gore, Andrea C.; Crews, David
2008-01-01
Embryonic exposure to the endocrine disruptor vinclozolin during gonadal sex determination promotes an epigenetic reprogramming of the male germ-line that is associated with transgenerational adult onset disease states. Further analysis of this transgenerational phenotype on the brain demonstrated reproducible changes in the brain transcriptome three generations (F3) removed from the exposure. The transgenerational alterations in the male and female brain transcriptomes were distinct. In the males, the expression of 92 genes in the hippocampus and 276 genes in the amygdala were transgenerationally altered. In the females, the expression of 1,301 genes in the hippocampus and 172 genes in the amygdala were transgenerationally altered. Analysis of specific gene sets demonstrated that several brain signaling pathways were influenced including those involved in axon guidance and long-term potentiation. An investigation of behavior demonstrated that the vinclozolin F3 generation males had a decrease in anxiety-like behavior, while the females had an increase in anxiety-like behavior. These observations demonstrate that an embryonic exposure to an environmental compound appears to promote a reprogramming of brain development that correlates with transgenerational sex-specific alterations in the brain transcriptomes and behavior. Observations are discussed in regards to environmental and transgenerational influences on the etiology of brain disease. PMID:19015723
Drew, Damian Paul; Dueholm, Bjørn; Weitzel, Corinna; Zhang, Ye; Sensen, Christoph W.; Simonsen, Henrik Toft
2013-01-01
Thapsia laciniata Rouy (Apiaceae) produces irregular and regular sesquiterpenoids with thapsane and guaiene carbon skeletons, as found in other Apiaceae species. A transcriptomic analysis utilizing Illumina next-generation sequencing enabled the identification of novel genes involved in the biosynthesis of terpenoids in Thapsia. From 66.78 million HQ paired-end reads obtained from T. laciniata roots, 64.58 million were assembled into 76,565 contigs (N50: 1261 bp). Seventeen contigs were annotated as terpene synthases and five of these were predicted to be sesquiterpene synthases. Of the 67 contigs annotated as cytochromes P450, 18 of these are part of the CYP71 clade that primarily performs hydroxylations of specialized metabolites. Three contigs annotated as aldehyde dehydrogenases grouped phylogenetically with the characterized ALDH1 from Artemisia annua and three contigs annotated as alcohol dehydrogenases grouped with the recently described ADH1 from A. annua. ALDH1 and ADH1 were characterized as part of the artemisinin biosynthesis. We have produced a comprehensive EST dataset for T. laciniata roots, which contains a large sample of the T. laciniata transcriptome. These transcriptome data provide the foundation for future research into the molecular basis for terpenoid biosynthesis in Thapsia and on the evolution of terpenoids in Apiaceae. PMID:23698765
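The N50 value quoted for this assembly (1261 bp) is a standard contiguity statistic. The short Python sketch below shows how it is conventionally computed from a list of contig lengths; the function name `n50` and the toy input are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")

if __name__ == "__main__":
    # Toy example: total = 32, half = 16, reached after the two 8 bp contigs.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because N50 depends only on the length distribution, it says nothing about correctness of the contigs; it is usually reported alongside read-mapping or annotation statistics such as those given in the abstract.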
Multiplexed transcriptome analysis to detect ALK, ROS1 and RET rearrangements in lung cancer
Rogers, Toni-Maree; Arnau, Gisela Mir; Ryland, Georgina L.; Huang, Stephen; Lira, Maruja E.; Emmanuel, Yvette; Perez, Omar D.; Irwin, Darryl; Fellowes, Andrew P.; Wong, Stephen Q.; Fox, Stephen B.
2017-01-01
ALK, ROS1 and RET gene fusions are important predictive biomarkers for tyrosine kinase inhibitors in lung cancer. Currently, the gold standard method for gene fusion detection is Fluorescence In Situ Hybridization (FISH) and, while highly sensitive and specific, it is also labour intensive, subjective in analysis, and unable to screen a large number of gene fusions. Recent developments in high-throughput transcriptome-based methods may provide a suitable alternative to FISH as they are compatible with multiplexing and diagnostic workflows. However, the concordance between these different methods compared with FISH has not been evaluated. In this study we compared the results from three transcriptome-based platforms (Nanostring Elements, Agena LungFusion panel and Thermo Fisher NGS fusion panel) to those obtained from ALK, ROS1 and RET FISH on 51 clinical specimens. Overall agreement of results ranged from 86% to 96% depending on the platform used. While all platforms were highly sensitive, both the Agena panel and the Thermo Fisher NGS fusion panel reported minor fusions that were not detectable by FISH. Our proof-of-principle study illustrates that transcriptome-based analyses are sensitive and robust methods for detecting actionable gene fusions in lung cancer and could provide a robust alternative to FISH testing in the diagnostic setting. PMID:28181564
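Figures such as "overall agreement" and sensitivity are typically derived from paired calls against a reference method. The Python sketch below shows one common way to compute them, treating FISH as the reference; the function name `agreement_stats` and the example calls are hypothetical and do not reproduce the study's data or its exact definitions.

```python
def agreement_stats(reference, candidate):
    """Compare paired positive/negative calls from a candidate assay with a reference assay.

    Both arguments are equally long sequences of booleans (True = fusion detected).
    """
    if len(reference) != len(candidate) or not reference:
        raise ValueError("need two equally long, non-empty call lists")
    pairs = list(zip(reference, candidate))
    tp = sum(1 for r, c in pairs if r and c)          # both positive
    tn = sum(1 for r, c in pairs if not r and not c)  # both negative
    fp = sum(1 for r, c in pairs if not r and c)      # candidate only
    fn = sum(1 for r, c in pairs if r and not c)      # reference only
    return {
        "agreement": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
    }

if __name__ == "__main__":
    fish = [True, True, False, False, False]   # hypothetical reference calls
    panel = [True, False, False, False, True]  # hypothetical panel calls
    print(agreement_stats(fish, panel))
```

Note that a call counted here as a false positive may reflect a real fusion the reference method cannot detect, which is the situation the abstract describes for the minor fusions reported by two of the panels.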
Revealing the transcriptomic complexity of switchgrass by PacBio long-read sequencing.
Zuo, Chunman; Blow, Matthew; Sreedasyam, Avinash; Kuo, Rita C; Ramamoorthy, Govindarajan Kunde; Torres-Jerez, Ivone; Li, Guifen; Wang, Mei; Dilworth, David; Barry, Kerrie; Udvardi, Michael; Schmutz, Jeremy; Tang, Yuhong; Xu, Ying
2018-01-01
Switchgrass (Panicum virgatum L.) is an important bioenergy crop widely used for lignocellulosic research. While extensive transcriptomic analyses have been conducted on this species using short read-based sequencing techniques, very little has been reliably derived regarding alternatively spliced (AS) transcripts. We present an analysis of transcriptomes of six switchgrass tissue types pooled together, sequenced using Pacific Biosciences (PacBio) single-molecule long-read technology. Our analysis identified 105,419 unique transcripts covering 43,570 known genes and 8795 previously unknown genes. Of these, 45,168 are novel transcripts of known genes. A total of 60,096 AS transcripts are identified, 45,628 being novel. We have also predicted 1549 transcripts of genes involved in cell wall construction and remodeling, 639 being novel transcripts of known cell wall genes. Most of the predicted transcripts are validated against Illumina-based short reads. Specifically, 96% of the splice junction sites in all the unique transcripts are validated by at least five Illumina reads. Comparisons between genes derived from our identified transcripts and the current genome annotation revealed that among the gene set predicted by both analyses, 16,640 have different exon-intron structures. Overall, a substantial amount of new information is derived from the PacBio RNA data regarding both the transcriptome and the genome of switchgrass.
Ochsner, Scott A; Watkins, Christopher M; LaGrone, Benjamin S; Steffen, David L; McKenna, Neil J
2010-10-01
Nuclear receptors (NRs) are ligand-regulated transcription factors that recruit coregulators and other transcription factors to gene promoters to effect regulation of tissue-specific transcriptomes. The prodigious rate at which the NR signaling field has generated high content gene expression and, more recently, genome-wide location analysis datasets has not been matched by a committed effort to archiving this information for routine access by bench and clinical scientists. As a first step towards this goal, we searched the MEDLINE database for studies, which referenced either expression microarray and/or genome-wide location analysis datasets in which a NR or NR ligand was an experimental variable. A total of 1122 studies encompassing 325 unique organs, tissues, primary cells, and cell lines, 35 NRs, and 91 NR ligands were retrieved and annotated. The data were incorporated into a new section of the Nuclear Receptor Signaling Atlas Molecule Pages, Transcriptomics and Cistromics, for which we designed an intuitive, freely accessible user interface to browse the studies. Each study links to an abstract, the MEDLINE record, and, where available, Gene Expression Omnibus and ArrayExpress records. The resource will be updated on a regular basis to provide a current and comprehensive entrez into the sum of transcriptomic and cistromic research in this field.
Cavaiuolo, Marina; Cocetta, Giacomo; Spadafora, Natasha Damiana; Müller, Carsten T.; Rogers, Hilary J.
2017-01-01
Diplotaxis tenuifolia L. is of important economic value in the fresh-cut industry for its nutraceutical and sensorial properties. However, the molecular mechanisms conferring tolerance of harvested leaves to pre- and postharvest stresses during processing and shelf-life have never been investigated. Here, we provide the first transcriptomic resource of rocket by de novo RNA sequencing assembly, functional annotation and stress-induced expression analysis of 33874 transcripts. Transcriptomic changes in leaves subjected to commercially relevant pre-harvest (salinity, heat and nitrogen starvation) and postharvest stresses (cold, dehydration, dark, wounding) known to affect quality and shelf-life were analysed 24 h after stress treatment, a timing relevant to subsequent processing of salad leaves. Transcription factors and genes involved in plant growth regulator signaling, autophagy, senescence and glucosinolate metabolism were the most affected by the stresses. Hundreds of genes with unknown function but uniquely expressed under stress were identified, providing candidates to investigate stress responses in rocket. Dehydration and wounding had the greatest effect on the transcriptome and different stresses elicited changes in the expression of genes related to overlapping groups of hormones. These data will allow development of approaches targeted at improving stress tolerance, quality and shelf-life of rocket with direct applications in the fresh-cut industries. PMID:28558066
Cavaiuolo, Marina; Cocetta, Giacomo; Spadafora, Natasha Damiana; Müller, Carsten T; Rogers, Hilary J; Ferrante, Antonio
2017-01-01
Diplotaxis tenuifolia L. is of important economic value in the fresh-cut industry for its nutraceutical and sensorial properties. However, the molecular mechanisms conferring tolerance of harvested leaves to pre- and postharvest stresses during processing and shelf-life have never been investigated. Here, we provide the first transcriptomic resource of rocket by de novo RNA sequencing assembly, functional annotation and stress-induced expression analysis of 33874 transcripts. Transcriptomic changes in leaves subjected to commercially relevant pre-harvest (salinity, heat and nitrogen starvation) and postharvest stresses (cold, dehydration, dark, wounding) known to affect quality and shelf-life were analysed 24 h after stress treatment, a timing relevant to subsequent processing of salad leaves. Transcription factors and genes involved in plant growth regulator signaling, autophagy, senescence and glucosinolate metabolism were the most affected by the stresses. Hundreds of genes with unknown function but uniquely expressed under stress were identified, providing candidates to investigate stress responses in rocket. Dehydration and wounding had the greatest effect on the transcriptome and different stresses elicited changes in the expression of genes related to overlapping groups of hormones. These data will allow development of approaches targeted at improving stress tolerance, quality and shelf-life of rocket with direct applications in the fresh-cut industries.
Cabrera, Ana R; Donohue, Kevin V; Khalil, Sayed M S; Scholl, Elizabeth; Opperman, Charles; Sonenshine, Daniel E; Roe, R Michael
2011-01-01
Many species of mites and ticks are of agricultural and medical importance. Much can be learned from the study of transcriptomes of acarines, which can generate DNA-sequence information on potential target genes for the control of acarine pests. High-throughput transcriptome sequencing can also yield sequences of genes critical during physiological processes poorly understood in acarines, e.g., the regulation of female reproduction in mites. The predatory mite, Phytoseiulus persimilis, was selected to conduct a transcriptome analysis using 454 pyrosequencing. The objective of this project was to obtain DNA-sequence information of expressed genes from P. persimilis with special interest in sequences corresponding to vitellogenin (Vg) and the vitellogenin receptor (VgR). These genes are critical to the understanding of vitellogenesis, and they will facilitate the study of the regulation of mite female reproduction. A total of 12,556 contiguous sequences (contigs) were assembled with an average size of 935 bp. From these sequences, the putative translated peptides of 11 contigs were similar in amino acid sequence to other arthropod Vgs, while 6 were similar to VgRs. We selected some of these sequences to conduct stage-specific expression studies to further determine their function.
2014-01-01
Background The lined sea anemone Edwardsiella lineata is an informative model system for evolutionary-developmental studies of parasitism. In this species, it is possible to compare alternate developmental pathways leading from a larva to either a free-living polyp or a vermiform parasite that inhabits the mesoglea of a ctenophore host. Additionally, E. lineata is confamilial with the model cnidarian Nematostella vectensis, providing an opportunity for comparative genomic, molecular and organismal studies. Description We generated a reference transcriptome for E. lineata via high-throughput sequencing of RNA isolated from five developmental stages (parasite; parasite-to-larva transition; larva; larva-to-adult transition; adult). The transcriptome comprises 90,440 contigs assembled from >15 billion nucleotides of DNA sequence. Using a molecular clock approach, we estimated the divergence between E. lineata and N. vectensis at 215–364 million years ago. Based on gene ontology and metabolic pathway analyses and gene family surveys (bHLH-PAS, deiodinases, Fox genes, LIM homeodomains, minicollagens, nuclear receptors, Sox genes, and Wnts), the transcriptome of E. lineata is comparable in depth and completeness to that of N. vectensis. Analyses of protein motifs revealed extensive conservation between the proteins of these two edwardsiid anemones, although we show the NF-κB protein of E. lineata reflects the ancestral structure, while the NF-κB protein of N. vectensis has undergone a split that separates the DNA-binding domain from the inhibitory domain. All contigs have been deposited in a public database (EdwardsiellaBase), where they may be searched according to contig ID, gene ontology, protein family motif (Pfam), enzyme commission number, and BLAST. The alignment of the raw reads to the contigs can also be visualized via JBrowse.
Conclusions The transcriptomic data and database described here provide a platform for studying the evolutionary developmental genomics of a derived parasitic life cycle. In addition, these data from E. lineata will aid in the interpretation of evolutionary novelties in gene sequence or structure that have been reported for the model cnidarian N. vectensis (e.g., the split NF-κB locus). Finally, we include custom computational tools to facilitate the annotation of a transcriptome based on high-throughput sequencing data obtained from a “non-model system.” PMID:24467778
Stefanik, Derek J; Lubinski, Tristan J; Granger, Brian R; Byrd, Allyson L; Reitzel, Adam M; DeFilippo, Lukas; Lorenc, Allison; Finnerty, John R
2014-01-28
The lined sea anemone Edwardsiella lineata is an informative model system for evolutionary-developmental studies of parasitism. In this species, it is possible to compare alternate developmental pathways leading from a larva to either a free-living polyp or a vermiform parasite that inhabits the mesoglea of a ctenophore host. Additionally, E. lineata is confamilial with the model cnidarian Nematostella vectensis, providing an opportunity for comparative genomic, molecular and organismal studies. We generated a reference transcriptome for E. lineata via high-throughput sequencing of RNA isolated from five developmental stages (parasite; parasite-to-larva transition; larva; larva-to-adult transition; adult). The transcriptome comprises 90,440 contigs assembled from >15 billion nucleotides of DNA sequence. Using a molecular clock approach, we estimated the divergence between E. lineata and N. vectensis at 215-364 million years ago. Based on gene ontology and metabolic pathway analyses and gene family surveys (bHLH-PAS, deiodinases, Fox genes, LIM homeodomains, minicollagens, nuclear receptors, Sox genes, and Wnts), the transcriptome of E. lineata is comparable in depth and completeness to that of N. vectensis. Analyses of protein motifs revealed extensive conservation between the proteins of these two edwardsiid anemones, although we show the NF-κB protein of E. lineata reflects the ancestral structure, while the NF-κB protein of N. vectensis has undergone a split that separates the DNA-binding domain from the inhibitory domain. All contigs have been deposited in a public database (EdwardsiellaBase), where they may be searched according to contig ID, gene ontology, protein family motif (Pfam), enzyme commission number, and BLAST. The alignment of the raw reads to the contigs can also be visualized via JBrowse. The transcriptomic data and database described here provide a platform for studying the evolutionary developmental genomics of a derived parasitic life cycle. In addition, these data from E. lineata will aid in the interpretation of evolutionary novelties in gene sequence or structure that have been reported for the model cnidarian N. vectensis (e.g., the split NF-κB locus). Finally, we include custom computational tools to facilitate the annotation of a transcriptome based on high-throughput sequencing data obtained from a "non-model system."
Transcriptome instability as a molecular pan-cancer characteristic of carcinomas.
Sveen, Anita; Johannessen, Bjarne; Teixeira, Manuel R; Lothe, Ragnhild A; Skotheim, Rolf I
2014-08-10
We have previously proposed transcriptome instability as a genome-wide, pre-mRNA splicing-related characteristic of colorectal cancer. Here, we explore the hypothesis of transcriptome instability being a general characteristic of cancer. Exon-level microarray expression data from ten cancer datasets were analyzed, including breast cancer, cervical cancer, colorectal cancer, gastric cancer, lung cancer, neuroblastoma, and prostate cancer (555 samples), as well as paired normal tissue samples from the colon, lung, prostate, and stomach (93 samples). Based on alternative splicing scores across the genomes, we calculated sample-wise relative amounts of aberrant exon skipping and inclusion. Strong and non-random (P < 0.001) correlations between these estimates and the expression levels of splicing factor genes (n = 280) were found in most cancer types analyzed (breast-, cervical-, colorectal-, lung- and prostate cancer). This suggests a biological explanation for the splicing variation. Surprisingly, these associations prevailed in pan-cancer analyses. This is in contrast to the tissue and cancer specific patterns observed in comparisons across healthy tissue samples from the colon, lung, prostate, and stomach, and between paired cancer-normal samples from the same four tissue types. Based on exon-level expression profiling and computational analyses of alternative splicing, we propose transcriptome instability as a molecular pan-cancer characteristic. The affected cancers show strong and non-random associations between low expression levels of splicing factor genes, and high amounts of aberrant exon skipping and inclusion, and vice versa, on a genome-wide scale.
Workflow and web application for annotating NCBI BioProject transcriptome data
Vera Alvarez, Roberto; Medeiros Vidal, Newton; Garzón-Martínez, Gina A.; Barrero, Luz S.; Landsman, David
2017-01-01
The volume of transcriptome data is growing exponentially due to rapid improvement of experimental technologies. In response, large central resources such as those of the National Center for Biotechnology Information (NCBI) are continually adapting their computational infrastructure to accommodate this large influx of data. New and specialized databases, such as Transcriptome Shotgun Assembly Sequence Database (TSA) and Sequence Read Archive (SRA), have been created to aid the development and expansion of centralized repositories. Although the central resource databases are under continual development, they do not include automatic pipelines to increase annotation of newly deposited data. Therefore, third-party applications are required to achieve that aim. Here, we present an automatic workflow and web application for the annotation of transcriptome data. The workflow creates secondary data such as sequencing reads and BLAST alignments, which are available through the web application. They are based on freely available bioinformatics tools and scripts developed in-house. The interactive web application provides a search engine and several browser utilities. Graphical views of transcript alignments are available through SeqViewer, an embedded tool developed by NCBI for viewing biological sequence data. The web application is tightly integrated with other NCBI web applications and tools to extend the functionality of data processing and interconnectivity. We present a case study for the species Physalis peruviana with data generated from BioProject ID 67621. Database URL: http://www.ncbi.nlm.nih.gov/projects/physalis/ PMID:28605765
Ma, Chuang; Wang, Xiangfeng
2012-01-01
One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Despite several correlation methods that are applied in biology to analyze microarray data, concerns regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology have been raised. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey’s biweight correlation. The Gini correlation method, with the other four evaluated methods in this study, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses. PMID:22797655
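The Gini correlation named in the abstract above can be sketched in a few lines. This is a minimal illustration, assuming the standard rank-based definition GCC(x, y) = cov(x, rank(y)) / cov(x, rank(x)); it is not the rsgcc implementation, and the function names and toy expression values are hypothetical.

```python
def _average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def _covariance(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def gini_correlation(x, y):
    """Gini correlation of x with respect to y (assumed definition, see lead-in).

    The measure is asymmetric: gini_correlation(x, y) and
    gini_correlation(y, x) generally differ.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    denominator = _covariance(x, _average_ranks(x))
    if denominator == 0:
        raise ValueError("x is constant; Gini correlation is undefined")
    return _covariance(x, _average_ranks(y)) / denominator


# Hypothetical expression values for a regulator and a candidate target.
regulator = [1.0, 2.0, 3.0, 10.0]
target = [1.0, 3.0, 2.0, 4.0]
print(gini_correlation(regulator, target))  # value-based on regulator, rank-based on target
print(gini_correlation(target, regulator))  # the reverse direction gives a different value
```

Because one variable enters by value and the other by rank, the coefficient sits between a purely value-based measure such as Pearson and a purely rank-based one such as Spearman, which is one plausible reading of how it can compensate for the weaknesses of each.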
Liu, Chengzhang; Wang, Xia; Xiang, Jianhai; Li, Fuhua
2012-09-01
Pacific white shrimp has become a major aquaculture and fishery species worldwide. Although a large scale EST resource has been publicly available since 2008, the data have not yet been widely used for SNP discovery or transcriptome-wide assessment of selective pressure. In this study, a set of 155 411 expressed sequence tags (ESTs) from the NCBI database were computationally analyzed and 17 225 single nucleotide polymorphisms (SNPs) were predicted, including 9 546 transitions, 5 124 transversions and 2 481 indels. Among the 7 298 SNP substitutions located in functionally annotated contigs, 58.4% (4 262) are non-synonymous SNPs capable of introducing amino acid mutations. Two hundred and fifty non-synonymous SNPs in genes associated with economic traits have been identified as candidates for markers in selective breeding. Diversity estimates among the synonymous nucleotides were on average 3.49 times greater than those among non-synonymous nucleotides, suggesting negative selection. The ratio of non-synonymous to synonymous substitutions (Ka/Ks) ranges from 0 to 4.01 (average 0.42, median 0.26), suggesting that the majority of the affected genes are under purifying selection. Enrichment analysis identified multiple gene ontology categories under positive or negative selection. Categories involved in innate immune response and male gamete generation are rich in positively selected genes, which is similar to reports in Drosophila and primates. This work is the first transcriptome-wide assessment of selective pressure in a Penaeid shrimp species. The functionally annotated SNPs provide a valuable resource of potential molecular markers for selective breeding.
2012-01-01
Introduction Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. Results cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature. 
Conclusions We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable catalogue of annotated genes (or gene fragments) and protein families for ten newly sequenced non-model organisms, some of commercial importance (i.e., Octopus vulgaris). These comprehensive sets of genes can be readily used for phylogenetic analysis, gene expression profiling, developmental analysis, and can also be a powerful resource for gene discovery. The characterization of the transcriptomes of such a diverse array of animal species permitted the comparison of sequencing depth, functional annotation, and efficiency of genomic sampling using the same pipelines, which proved to be similar for all considered species. In addition, the datasets revealed their potential as a resource for paralogue detection, a recurrent concern in various aspects of biological inquiry, including phylogenetics, molecular evolution, development, and cellular biochemistry. PMID:23190771
The transcriptional landscape of age in human peripheral blood
Peters, Marjolein J.; Joehanes, Roby; Pilling, Luke C.; Schurmann, Claudia; Conneely, Karen N.; Powell, Joseph; Reinmaa, Eva; Sutphin, George L.; Zhernakova, Alexandra; Schramm, Katharina; Wilson, Yana A.; Kobes, Sayuko; Tukiainen, Taru; Nalls, Michael A.; Hernandez, Dena G.; Cookson, Mark R.; Gibbs, Raphael J.; Hardy, John; Ramasamy, Adaikalavan; Zonderman, Alan B.; Dillman, Allissa; Traynor, Bryan; Smith, Colin; Longo, Dan L.; Trabzuni, Daniah; Troncoso, Juan; van der Brug, Marcel; Weale, Michael E.; O'Brien, Richard; Johnson, Robert; Walker, Robert; Zielke, Ronald H.; Arepalli, Sampath; Ryten, Mina; Singleton, Andrew B.; Ramos, Yolande F.; Göring, Harald H. H.; Fornage, Myriam; Liu, Yongmei; Gharib, Sina A.; Stranger, Barbara E.; De Jager, Philip L.; Aviv, Abraham; Levy, Daniel; Murabito, Joanne M.; Munson, Peter J.; Huan, Tianxiao; Hofman, Albert; Uitterlinden, André G.; Rivadeneira, Fernando; van Rooij, Jeroen; Stolk, Lisette; Broer, Linda; Verbiest, Michael M. P. J.; Jhamai, Mila; Arp, Pascal; Metspalu, Andres; Tserel, Liina; Milani, Lili; Samani, Nilesh J.; Peterson, Pärt; Kasela, Silva; Codd, Veryan; Peters, Annette; Ward-Caviness, Cavin K.; Herder, Christian; Waldenberger, Melanie; Roden, Michael; Singmann, Paula; Zeilinger, Sonja; Illig, Thomas; Homuth, Georg; Grabe, Hans-Jörgen; Völzke, Henry; Steil, Leif; Kocher, Thomas; Murray, Anna; Melzer, David; Yaghootkar, Hanieh; Bandinelli, Stefania; Moses, Eric K.; Kent, Jack W.; Curran, Joanne E.; Johnson, Matthew P.; Williams-Blangero, Sarah; Westra, Harm-Jan; McRae, Allan F.; Smith, Jennifer A.; Kardia, Sharon L. R.; Hovatta, Iiris; Perola, Markus; Ripatti, Samuli; Salomaa, Veikko; Henders, Anjali K.; Martin, Nicholas G.; Smith, Alicia K.; Mehta, Divya; Binder, Elisabeth B.; Nylocks, K Maria; Kennedy, Elizabeth M.; Klengel, Torsten; Ding, Jingzhong; Suchy-Dicey, Astrid M.; Enquobahrie, Daniel A.; Brody, Jennifer; Rotter, Jerome I.; Chen, Yii-Der I.; Houwing-Duistermaat, Jeanine; Kloppenburg, Margreet; Slagboom, P. Eline; Helmer, Quinta; den Hollander, Wouter; Bean, Shannon; Raj, Towfique; Bakhshi, Noman; Wang, Qiao Ping; Oyston, Lisa J.; Psaty, Bruce M.; Tracy, Russell P.; Montgomery, Grant W.; Turner, Stephen T.; Blangero, John; Meulenbelt, Ingrid; Ressler, Kerry J.; Yang, Jian; Franke, Lude; Kettunen, Johannes; Visscher, Peter M.; Neely, G. Gregory; Korstanje, Ron; Hanson, Robert L.; Prokisch, Holger; Ferrucci, Luigi; Esko, Tonu; Teumer, Alexander; van Meurs, Joyce B. J.; Johnson, Andrew D.
2015-01-01
Disease incidences increase with age, but the molecular characteristics of ageing that lead to increased disease susceptibility remain inadequately understood. Here we perform a whole-blood gene expression meta-analysis in 14,983 individuals of European ancestry (including replication) and identify 1,497 genes that are differentially expressed with chronological age. The age-associated genes do not harbor more age-associated CpG-methylation sites than other genes, but are instead enriched for the presence of potentially functional CpG-methylation sites in enhancer and insulator regions that associate with both chronological age and gene expression levels. We further used the gene expression profiles to calculate the 'transcriptomic age' of an individual, and show that differences between transcriptomic age and chronological age are associated with biological features linked to ageing, such as blood pressure, cholesterol levels, fasting glucose, and body mass index. The transcriptomic prediction model adds biological relevance and complements existing epigenetic prediction models, and can be used by others to calculate transcriptomic age in external cohorts. PMID:26490707
Faria-Blanc, Nuno; Mortimer, Jenny C.; Dupree, Paul
2018-01-01
Yeast have long been known to possess a cell wall integrity (CWI) system, and recently an analogous system has been described for the primary walls of plants (PCWI) that leads to changes in plant growth and cell wall composition. A similar system has been proposed to exist for secondary cell walls (SCWI). However, there is little data to support this. Here, we analyzed the stem transcriptome of a set of cell wall biosynthetic mutants in order to investigate whether cell wall damage, in this case caused by aberrant xylan synthesis, activates a signaling cascade or changes in cell wall synthesis gene expression. Our data revealed remarkably few changes to the transcriptome. We hypothesize that this is because cells undergoing secondary cell wall thickening have entered a committed programme leading to cell death, and therefore a SCWI system would have limited impact. The absence of transcriptomic responses to secondary cell wall alterations may facilitate engineering of the secondary cell wall of plants. PMID:29636762
Using single nuclei for RNA-seq to capture the transcriptome of postmortem neurons
Krishnaswami, Suguna Rani; Grindberg, Rashel V; Novotny, Mark; Venepally, Pratap; Lacar, Benjamin; Bhutani, Kunal; Linker, Sara B; Pham, Son; Erwin, Jennifer A; Miller, Jeremy A; Hodge, Rebecca; McCarthy, James K; Kelder, Martin; McCorrison, Jamison; Aevermann, Brian D; Fuertes, Francisco Diez; Scheuermann, Richard H; Lee, Jun; Lein, Ed S; Schork, Nicholas; McConnell, Michael J; Gage, Fred H; Lasken, Roger S
2016-01-01
A protocol is described for sequencing the transcriptome of a cell nucleus. Nuclei are isolated from specimens and sorted by FACS, cDNA libraries are constructed and RNA-seq is performed, followed by data analysis. Some steps follow published methods (Smart-seq2 for cDNA synthesis and Nextera XT barcoded library preparation) and are not described in detail here. Previous single-cell approaches for RNA-seq from tissues include cell dissociation using protease treatment at 30 °C, which is known to alter the transcriptome. We isolate nuclei at 4 °C from tissue homogenates, which causes minimal damage. Nuclear transcriptomes can be obtained from postmortem human brain tissue stored at −80 °C, making brain archives accessible for RNA-seq from individual neurons. The method also allows investigation of biological features unique to nuclei, such as enrichment of certain transcripts and precursors of some noncoding RNAs. By following this procedure, it takes about 4 d to construct cDNA libraries that are ready for sequencing. PMID:26890679
Guarnieri, Michael T.; Nag, Ambarish; Smolinski, Sharon L.; Darzins, Al; Seibert, Michael; Pienkos, Philip T.
2011-01-01
Biofuels derived from algal lipids represent an opportunity to dramatically impact the global energy demand for transportation fuels. Systems biology analyses of oleaginous algae could greatly accelerate the commercialization of algal-derived biofuels by elucidating the key components involved in lipid productivity and leading to the initiation of hypothesis-driven strain-improvement strategies. However, higher-level systems biology analyses, such as transcriptomics and proteomics, are highly dependent upon available genomic sequence data, and the lack of these data has hindered the pursuit of such analyses for many oleaginous microalgae. In order to examine the triacylglycerol biosynthetic pathway in the unsequenced oleaginous microalga, Chlorella vulgaris, we have established a strategy with which to bypass the necessity for genomic sequence information by using the transcriptome as a guide. Our results indicate an upregulation of both fatty acid and triacylglycerol biosynthetic machinery under oil-accumulating conditions, and demonstrate the utility of a de novo assembled transcriptome as a search model for proteomic analysis of an unsequenced microalga. PMID:22043295
Luo, C; Zhang, Q L; Luo, Z R
2014-04-16
Oriental persimmon (Diospyros kaki Thunb.) (2n = 6x = 90) is a major commercial and deciduous fruit tree that is believed to have originated in China. However, little transcriptomic and genomic information on persimmon is available. Using Roche 454 sequencing technology, the transcriptome from RNA of the flowers of D. kaki was analyzed. A total of 1,250,893 reads were generated and 83,898 unigenes were assembled. A total of 42,711 SSR loci were identified from 23,494 unigenes and 289 polymerase chain reaction primer pairs were designed. Of these 289 primers, 155 (53.6%) showed robust PCR amplification and 98 revealed polymorphism between 15 persimmon genotypes, indicating a polymorphic rate of 63.23% of the productive primers for characterization and genotyping of the genus Diospyros. Using transcriptome sequence data generated by next-generation sequencing to identify microsatellite loci appears to be rapid and cost-efficient, particularly for species with no genomic sequence information available.
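The kind of SSR (microsatellite) screen described in the abstract above can be sketched with a regular expression. This is a minimal illustration, not the tool used in the study: the function name, the motif length limits (2 to 6 bp), the minimum repeat count and the example sequence are all assumptions chosen for the example. The last lines recompute the polymorphic rate from the counts given in the abstract (98 of 155 productive primers).

```python
import re


def find_ssrs(sequence, min_repeats=4, min_motif=2, max_motif=6):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Thresholds are illustrative assumptions, not those of the cited study.
    """
    # Lazy motif capture followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        # Skip homopolymer runs that were matched as a multi-base motif (e.g. "AA").
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical unigene fragment containing an (AC)5 and an (ATT)4 repeat.
example = "TTGACACACACACGGATTATTATTATTC"
print(find_ssrs(example))  # [(3, 'AC', 5), (15, 'ATT', 4)]

# Polymorphic rate of the productive primers, from the counts in the abstract.
polymorphic_rate = round(100 * 98 / 155, 2)
print(polymorphic_rate)  # 63.23
```

Only perfect repeats are detected here; compound or interrupted microsatellites would need additional logic.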
Reefgenomics.Org - a repository for marine genomics data.
Liew, Yi Jin; Aranda, Manuel; Voolstra, Christian R
2016-01-01
Over the last decade, technological advancements have substantially decreased the cost and time of obtaining large amounts of sequencing data. Paired with the exponentially increased computing power, individual labs are now able to sequence genomes or transcriptomes to investigate biological questions of interest. This has led to a significant increase in available sequence data. Although the bulk of data published in articles are stored in public sequence databases, very often, only raw sequencing data are available; miscellaneous data such as assembled transcriptomes, genome annotations etc. are not easily obtainable through the same means. Here, we introduce our website (http://reefgenomics.org) that aims to centralize genomic and transcriptomic data from marine organisms. Besides providing convenient means to download sequences, we provide (where applicable) a genome browser to explore available genomic features, and a BLAST interface to search through the hosted sequences. Through the interface, multiple datasets can be queried simultaneously, allowing for the retrieval of matching sequences from organisms of interest. The minimalistic, no-frills interface reduces visual clutter, making it convenient for end-users to search and explore processed sequence data. DATABASE URL: http://reefgenomics.org. © The Author(s) 2016. Published by Oxford University Press.
Hsu, Chi-Lin; Chou, Chih-Hsuan; Huang, Shih-Chuan; Lin, Chia-Yi; Lin, Meng-Ying; Tung, Chun-Che; Lin, Chun-Yen; Lai, Ivan Pochou; Zou, Yan-Fang; Youngson, Neil A; Lin, Shau-Ping; Yang, Chang-Hao; Chen, Shih-Kuo; Gau, Susan Shur-Fen; Huang, Hsien-Sung
2018-03-15
Visual system development is light-experience dependent, which strongly implicates epigenetic mechanisms in light-regulated maturation. Among many epigenetic processes, genomic imprinting is an epigenetic mechanism through which monoallelic gene expression occurs in a parent-of-origin-specific manner. It is unknown if genomic imprinting contributes to visual system development. We profiled the transcriptome and imprintome during critical periods of mouse visual system development under normal- and dark-rearing conditions using B6/CAST F1 hybrid mice. We identified experience-regulated, isoform-specific and brain-region-specific imprinted genes. We also found imprinted microRNAs were predominantly clustered into the Dlk1-Dio3 imprinted locus with light experience affecting some imprinted miRNA expression. Our findings provide the first comprehensive analysis of light-experience regulation of the transcriptome and imprintome during critical periods of visual system development. Our results may contribute to therapeutic strategies for visual impairments and circadian rhythm disorders resulting from a dysfunctional imprintome.
Gehan, Malia A; Mockler, Todd C; Weinig, Cynthia; Ewers, Brent E
2017-01-01
The dynamics of local climates make development of agricultural strategies challenging. Yield improvement has progressed slowly, especially in drought-prone regions where annual crop production suffers from episodic aridity. Underlying drought responses are circadian and diel control of gene expression that regulate daily variations in metabolic and physiological pathways. To identify transcriptomic changes that occur in the crop Brassica rapa during initial perception of drought, we applied a co-expression network approach to associate rhythmic gene expression changes with physiological responses. Coupled analysis of transcriptome and physiological parameters over a two-day time course in control and drought-stressed plants provided temporal resolution necessary for correlation of network modules with dynamic changes in stomatal conductance, photosynthetic rate, and photosystem II efficiency. This approach enabled the identification of drought-responsive genes based on their differential rhythmic expression profiles in well-watered versus droughted networks and provided new insights into the dynamic physiological changes that occur during drought. PMID:28826479
Haas, Brian J; Papanicolaou, Alexie; Yassour, Moran; Grabherr, Manfred; Blood, Philip D; Bowden, Joshua; Couger, Matthew Brian; Eccles, David; Li, Bo; Lieber, Matthias; MacManes, Matthew D; Ott, Michael; Orvis, Joshua; Pochet, Nathalie; Strozzi, Francesco; Weeks, Nathan; Westerman, Rick; William, Thomas; Dewey, Colin N; Henschel, Robert; LeDuc, Richard D; Friedman, Nir; Regev, Aviv
2013-08-01
De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sourceforge.net. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than 5 h.
Yang, Shiyong; Cao, Depan; Wang, Guirong; Liu, Yang
2017-09-20
Perception of environmental and habitat cues is of significance for insect survival and reproduction. Odor detection in insects is mediated by a number of proteins in antennae such as odorant receptors (ORs), ionotropic receptors (IRs), odorant binding proteins (OBPs), chemosensory proteins (CSPs), sensory neuron membrane proteins (SNMPs) and odorant degrading enzymes. In this study, we sequenced and assembled the adult male and female antennal transcriptomes of a destructive agricultural pest, the diamondback moth Plutella xylostella. In these transcriptomes, we identified transcripts belonging to 6 chemoreception gene families related to odor detection, including 54 ORs, 16 IRs, 7 gustatory receptors (GRs), 15 CSPs, 24 OBPs and 2 SNMPs. Semi-quantitative reverse transcription PCR analysis of expression patterns indicated that some of these ORs and IRs have clear sex-biased and tissue-specific expression patterns. Our results lay the foundation for future characterization of the functions of these P. xylostella chemosensory receptors at the molecular level and development of novel semiochemicals for integrated control of this agricultural pest.
Xu, Jiajia; Li, Yuanyuan; Ma, Xiuling; Ding, Jianfeng; Wang, Kai; Wang, Sisi; Tian, Ye; Zhang, Hui; Zhu, Xin-Guang
2013-09-01
Setaria viridis is an emerging model species for genetic studies of C4 photosynthesis. Many basic molecular resources need to be developed to support this species. In this paper, we performed a comprehensive transcriptome analysis from multiple developmental stages and tissues of S. viridis using next-generation sequencing technologies. Sequencing of the transcriptome from multiple tissues across three developmental stages (seed germination, vegetative growth, and reproduction) yielded a total of 71 million single-end 100 bp reads. Reference-based assembly using the Setaria italica genome as a reference generated 42,754 transcripts. De novo assembly generated 60,751 transcripts. In addition, 9,576 and 7,056 potential simple sequence repeats (SSRs) covering the S. viridis genome were identified when using the reference-based assembled transcripts and the de novo assembled transcripts, respectively. The transcripts and SSRs identified in this study can be used for both reverse and forward genetic studies based on S. viridis.
Transcriptomic Analysis of the Salivary Glands of an Invasive Whitefly
Su, Yun-Lin; Li, Jun-Min; Li, Meng; Luan, Jun-Bo; Ye, Xiao-Dong; Wang, Xiao-Wei; Liu, Shu-Sheng
2012-01-01
Background Some species of the whitefly Bemisia tabaci complex cause tremendous losses to crops worldwide, directly through feeding and indirectly through virus transmission. The primary salivary glands of whiteflies are critical for their feeding and virus transmission. However, partly due to their tiny size, research on whitefly salivary glands is limited and our knowledge of these glands is scarce. Methodology/Principal Findings We sequenced the transcriptome of the primary salivary glands of the Mediterranean species of the B. tabaci complex using an effective cDNA amplification method in combination with short read sequencing (Illumina). In a single run, we obtained 13,615 unigenes. The number of unigenes obtained from the salivary glands of the whitefly is at least fourfold that of the salivary gland genes from other plant-sucking insects. To reveal the functions of the primary glands, sequence similarity searches and comparisons with the whole transcriptome of the whitefly were performed. The results demonstrated that the genes related to metabolism and transport were significantly enriched in the primary salivary glands. Furthermore, we found that a number of highly expressed genes in the salivary glands might be involved in secretory protein processing, secretion and virus transmission. To identify potential proteins of whitefly saliva, the translated unigenes were subjected to secretory protein prediction. Finally, 295 genes were predicted to encode secretory proteins and some of them might play important roles in whitefly feeding. Conclusions/Significance: The combined method of cDNA amplification, Illumina sequencing and de novo assembly is suitable for transcriptomic analysis of tiny organs in insects. Through analysis of the transcriptome, genomic features of the primary salivary glands were dissected and biologically important proteins, especially secreted proteins, were predicted.
Our findings provide substantial sequence information for the primary salivary glands of whiteflies and will be the basis for future studies on whitefly-plant interactions and virus transmission. PMID:22745728
Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun
2015-01-01
Background The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology has been applied in genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of the goose peripheral blood lymphocytes to identify immunity-relevant genes. Principal Findings The transcriptome of goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology and assembled de novo. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were classified into three Gene Ontology (GO) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOGs) categories. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune-relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose.
Conclusion This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with other avian species as useful tools to understand the goose immune system. PMID:25816068
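The assembly statistics quoted above (average length and N50) can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the authors' pipeline: the function name `n50` and the toy length list are hypothetical, and the definition used is the common one (the length L such that contigs of length L or longer contain at least half of all assembled bases).

```python
def n50(lengths):
    """Return the N50 of a collection of contig/unigene lengths.

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not data from the study.
toy_lengths = [100, 200, 300, 400, 500]
mean_length = sum(toy_lengths) / len(toy_lengths)
print(mean_length, n50(toy_lengths))
```

Because N50 is weighted by sequence length, it normally exceeds the mean when an assembly contains many short fragments, which is consistent with the pattern reported here (687 bp mean versus 1,298 bp N50).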
Rossouw, Debra; Næs, Tormod; Bauer, Florian F
2008-01-01
Background 'Omics' tools provide novel opportunities for system-wide analysis of complex cellular functions. Secondary metabolism is an example of a complex network of biochemical pathways, which, although well mapped from a biochemical point of view, is not well understood with regard to its physiological roles and genetic and biochemical regulation. Many of the metabolites produced by this network, such as higher alcohols and esters, are significant aroma impact compounds in fermentation products, and different yeast strains are known to produce highly divergent aroma profiles. Here, we investigated whether we can predict the impact of specific genes of known or unknown function on this metabolic network by combining whole transcriptome and partial exo-metabolome analysis. Results For this purpose, the gene expression levels of five different industrial wine yeast strains that produce divergent aroma profiles were established at three different time points of alcoholic fermentation in synthetic wine must. A matrix of gene expression data was generated and integrated with the concentrations of volatile aroma compounds measured at the same time points. This relatively unbiased approach to the study of volatile aroma compounds enabled us to identify candidate genes for aroma profile modification. Five of these genes, namely YMR210W, BAT1, AAD10, AAD14 and ACS1, were selected for overexpression in the commercial wine yeast VIN13. Analysis of the data shows a statistically significant correlation between the changes in the exo-metabolome of the overexpressing strains and the changes that were predicted based on the unbiased alignment of transcriptomic and exo-metabolomic data. Conclusion The data suggest that a comparative transcriptomics and metabolomics approach can be used to identify the metabolic impacts of the expression of individual genes in complex systems, and the amenability of transcriptomic data to direct applications of biotechnological relevance. PMID:18990252
Transcriptomic Immune Response of Tenebrio molitor Pupae to Parasitization by Scleroderma guani
Zhu, Jia-Ying; Yang, Pu; Zhang, Zhong; Wu, Guo-Xing; Yang, Bin
2013-01-01
Background The interaction between host and parasitoid is one of the most fascinating relationships among insects and is receiving increasing interest. Understanding the mechanisms evolved by parasitoids to evade or suppress the host immune system is important for dissecting this interaction, yet these mechanisms remain poorly known. In order to gain insight into the immune response of Tenebrio molitor to parasitization by Scleroderma guani, the transcriptome of T. molitor pupae was sequenced with a focus on immune-related genes, and non-parasitized and parasitized T. molitor pupae were analyzed by digital gene expression (DGE) analysis with special emphasis on parasitoid-induced immune-related genes using Illumina sequencing. Methodology/Principal Findings In a single run, 264,698 raw reads were obtained. De novo assembly generated 71,514 unigenes with a mean length of 424 bp. Of those unigenes, 37,373 (52.26%) showed similarity to known proteins in the NCBI nr database. Via in-depth analysis of the transcriptome data, 430 unigenes related to immunity were identified. DGE analysis revealed that parasitization by S. guani had considerable impacts on the transcriptome profile of T. molitor pupae, as indicated by the significant up- or down-regulation of 3,431 parasitism-responsive transcripts. The expression of a total of 74 unigenes involved in the immune response of T. molitor was significantly altered after parasitization. Conclusions/Significance The obtained T. molitor transcriptome, in addition to establishing a fundamental resource for further research on functional genomics, has allowed the discovery of a large group of immune genes that might provide a meaningful framework to better understand the immune response in this species and other beetles. The DGE profiling data provide comprehensive T. molitor immune gene expression information at the transcriptional level following parasitization, and shed valuable light on the molecular understanding of the host-parasitoid interaction. PMID:23342153
Kunnath-Velayudhan, Shajo; Goldberg, Michael F; Saini, Neeraj K; Johndrow, Christopher T; Ng, Tony W; Johnson, Alison J; Xu, Jiayong; Chan, John; Jacobs, William R; Porcelli, Steven A
2017-10-01
Analysis of Ag-specific CD4+ T cells in mycobacterial infections at the transcriptome level is informative but technically challenging. Although several methods exist for identifying Ag-specific T cells, including intracellular cytokine staining, cell surface cytokine-capture assays, and staining with peptide:MHC class II multimers, all of these have significant technical constraints that limit their usefulness. Measurement of activation-induced expression of CD154 has been reported to detect live Ag-specific CD4+ T cells, but this approach remains underexplored and, to our knowledge, has not previously been applied in mycobacteria-infected animals. In this article, we show that CD154 expression identifies adoptively transferred or endogenous Ag-specific CD4+ T cells induced by Mycobacterium bovis bacillus Calmette-Guérin vaccination. We confirmed that Ag-specific cytokine production was positively correlated with CD154 expression by CD4+ T cells from bacillus Calmette-Guérin-vaccinated mice and show that high-quality microarrays can be performed from RNA isolated from CD154+ cells purified by cell sorting. Analysis of microarray data demonstrated that the transcriptome of CD4+ CD154+ cells was distinct from that of CD154- cells and showed major enrichment of transcripts encoding multiple cytokines and pathways of cellular activation. One notable finding was the identification of a previously unrecognized subset of mycobacteria-specific CD4+ T cells that is characterized by the production of IL-3. Our results support the use of CD154 expression as a practical and reliable method to isolate live Ag-specific CD4+ T cells for transcriptomic analysis and potentially for a range of other studies in infected or previously immunized hosts. Copyright © 2017 by The American Association of Immunologists, Inc.
Hayakawa-Yano, Yoshika; Suyama, Satoshi; Nogami, Masahiro; Yugami, Masato; Koya, Ikuko; Furukawa, Takako; Zhou, Li; Abe, Manabu; Sakimura, Kenji; Takebayashi, Hirohide; Nakanishi, Atsushi; Okano, Hideyuki; Yano, Masato
2017-09-15
Cell type-specific transcriptomes are enabled by the action of multiple regulators, which are frequently expressed within restricted tissue regions. In the present study, we identify one such regulator, Quaking 5 (Qki5), as an RNA-binding protein (RNABP) that is expressed in early embryonic neural stem cells and subsequently down-regulated during neurogenesis. mRNA sequencing analysis in neural stem cell culture indicates that Qki proteins play supporting roles in the neural stem cell transcriptome and various forms of mRNA processing that may result from regionally restricted expression and subcellular localization. Also, our in utero electroporation gain-of-function study suggests that the nuclear-type Qki isoform Qki5 supports the neural stem cell state. We next performed in vivo transcriptome-wide protein-RNA interaction mapping to search for direct targets of Qki5 and elucidate how Qki5 regulates neural stem cell function. Combined with our transcriptome analysis, this mapping analysis yielded a bona fide map of Qki5-RNA interaction at single-nucleotide resolution, the identification of 892 Qki5 direct target genes, and an accurate Qki5-dependent alternative splicing rule in the developing brain. Last, our target gene list provides the first compelling evidence that Qki5 is associated with specific biological events; namely, cell-cell adhesion. This prediction was confirmed by histological analysis of mice in which Qki proteins were genetically ablated, which revealed disruption of the apical surface of the lateral wall in the developing brain. These data collectively indicate that Qki5 regulates communication between neural stem cells by mediating numerous RNA processing events and suggest new links between splicing regulation and neural stem cell states. © 2017 Hayakawa-Yano et al.; Published by Cold Spring Harbor Laboratory Press.
Venkataramanan, Keerthi P; Min, Lie; Hou, Shuyu; Jones, Shawn W; Ralston, Matthew T; Lee, Kelvin H; Papoutsakis, E Terry
2015-01-01
Clostridium acetobutylicum is a model organism for both clostridial biology and solvent production. The organism is exposed to its own toxic metabolites butyrate and butanol, which trigger an adaptive stress response. Integrative analysis of proteomic and RNAseq data may provide novel insights into post-transcriptional regulation. The identified iTRAQ-based quantitative stress proteome is made up of 616 proteins with a 15 % genome coverage. The differentially expressed proteome correlated poorly with the corresponding differential RNAseq transcriptome. Up to 31 % of the differentially expressed proteins under stress displayed patterns opposite to those of the transcriptome, thus suggesting significant post-transcriptional regulation. The differential proteome of the translation machinery suggests that cells employ a different subset of ribosomal proteins under stress. Several highly upregulated proteins but with low mRNA levels possessed mRNAs with long 5'UTRs and strong RBS scores, thus supporting the argument that regulatory elements on the long 5'UTRs control their translation. For example, the oxidative stress response rubrerythrin was upregulated only at the protein level up to 40-fold without significant mRNA changes. We also identified many leaderless transcripts, several displaying different transcriptional start sites, thus suggesting mRNA-trimming mechanisms under stress. Downregulation of Rho and partner proteins pointed to changes in transcriptional elongation and termination under stress. The integrative proteomic-transcriptomic analysis demonstrated complex expression patterns of a large fraction of the proteome. Such patterns could not have been detected with one or the other omic analyses. Our analysis proposes the involvement of specific molecular mechanisms of post-transcriptional regulation to explain the observed complex stress response.
Zhu, Youyin; Li, Yongqiang; Xin, Dedong; Chen, Wenrong; Shao, Xu; Wang, Yue; Guo, Weidong
2015-01-25
Bud dormancy is a critical biological process allowing Chinese cherry (Prunus pseudocerasus) to survive in winter. Due to the lack of genomic information, the molecular mechanisms triggering endodormancy release in flower buds have remained unclear. Hence, we used Illumina RNA-Seq technology to carry out de novo transcriptome assembly and digital gene expression profiling of flower buds. Approximately 47 million clean reads were assembled into 50,604 sequences with an average length of 837 bp. A total of 37,650 unigene sequences were successfully annotated. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis annotated 128 pathways, among which metabolic pathways, biosynthesis of secondary metabolites and plant hormone signal transduction accounted for the largest shares in flower buds. In the critical period of endodormancy release, 1,644 significantly differentially expressed genes (DEGs) were identified from the expression profile. DEGs related to oxidoreductase activity were especially abundant in the Gene Ontology (GO) molecular function category. KEGG pathway analysis demonstrated that DEGs were involved in various metabolic processes, including phytohormone metabolism. Quantitative real-time PCR (qRT-PCR) analysis indicated that levels of DEGs for abscisic acid and gibberellin biosynthesis decreased, while the abundance of DEGs encoding their degradation enzymes increased and GID1 was down-regulated. Concomitant with endodormancy release, MADS-box transcription factors including P. pseudocerasus dormancy-associated MADS-box (PpcDAM), Agamous-like2, and APETALA3-like genes showed remarkable epigenetic roles. The newly generated transcriptome and gene expression profiling data provide valuable genetic information for revealing transcriptomic variation during bud dormancy in Chinese cherry. The uncovered data should be useful for future studies of bud dormancy in Prunus fruit trees lacking genomic information. Copyright © 2014 Elsevier B.V. All rights reserved.
A whole-blood transcriptome meta-analysis identifies gene expression signatures of cigarette smoking
Huan, Tianxiao; Joehanes, Roby; Schurmann, Claudia; Schramm, Katharina; Pilling, Luke C.; Peters, Marjolein J.; Mägi, Reedik; DeMeo, Dawn; O'Connor, George T.; Ferrucci, Luigi; Teumer, Alexander; Homuth, Georg; Biffar, Reiner; Völker, Uwe; Herder, Christian; Waldenberger, Melanie; Peters, Annette; Zeilinger, Sonja; Metspalu, Andres; Hofman, Albert; Uitterlinden, André G.; Hernandez, Dena G.; Singleton, Andrew B.; Bandinelli, Stefania; Munson, Peter J.; Lin, Honghuang; Benjamin, Emelia J.; Esko, Tõnu; Grabe, Hans J.; Prokisch, Holger; van Meurs, Joyce B.J.; Melzer, David; Levy, Daniel
2016-01-01
Abstract Cigarette smoking is a leading modifiable cause of death worldwide. We hypothesized that cigarette smoking induces extensive transcriptomic changes that lead to target-organ damage and smoking-related diseases. We performed a meta-analysis of transcriptome-wide gene expression using whole blood-derived RNA from 10,233 participants of European ancestry in six cohorts (including 1421 current and 3955 former smokers) to identify associations between smoking and altered gene expression levels. At a false discovery rate (FDR) <0.1, we identified 1270 differentially expressed genes in current vs. never smokers, and 39 genes in former vs. never smokers. Expression levels of 12 genes remained elevated up to 30 years after smoking cessation, suggesting that the molecular consequence of smoking may persist for decades. Gene ontology analysis revealed enrichment of smoking-related genes for activation of platelets and lymphocytes, immune response, and apoptosis. Many of the top smoking-related differentially expressed genes, including LRRN3 and GPR15, have DNA methylation loci in promoter regions that were recently reported to be hypomethylated among smokers. By linking differential gene expression with smoking-related disease phenotypes, we demonstrated that stroke and pulmonary function show enrichment for smoking-related gene expression signatures. Mediation analysis revealed the expression of several genes (e.g. ALAS2) to be putative mediators of the associations between smoking and inflammatory biomarkers (IL6 and C-reactive protein levels). Our transcriptomic study provides potential insights into the effects of cigarette smoking on gene expression in whole blood and their relations to smoking-related diseases. The results of such analyses may highlight attractive targets for treating or preventing smoking-related health effects. PMID:28158590
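The abstract above selects genes at a false discovery rate (FDR) threshold of 0.1 but does not say which FDR procedure was used. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up adjustment, the selection could look as follows; the function name `benjamini_hochberg` and the four p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    Assumption: this is one standard FDR procedure; the study itself
    may have used a different method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values for illustration only.
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
significant = [i for i, q in enumerate(q_values) if q < 0.1]
print(q_values, significant)
```

With a threshold of 0.1, about one in ten of the genes called significant is expected to be a false positive, which is a looser criterion than the more common 0.05.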
The Human Pancreas Proteome Defined by Transcriptomics and Antibody-Based Profiling
Fagerberg, Linn; Hallström, Björn M.; Schwenk, Jochen M.; Uhlén, Mathias; Korsgren, Olle; Lindskog, Cecilia
2014-01-01
The pancreas is composed of both exocrine glands and intermingled endocrine cells to execute its diverse functions, including enzyme production for digestion of nutrients and hormone secretion for regulation of blood glucose levels. To define the molecular constituents with elevated expression in the human pancreas, we employed a genome-wide RNA sequencing analysis of the human transcriptome to identify genes with elevated expression in the human pancreas. This quantitative transcriptomics data was combined with immunohistochemistry-based protein profiling to allow mapping of the corresponding proteins to different compartments and specific cell types within the pancreas down to the single cell level. Analysis of whole pancreas identified 146 genes with elevated expression levels, of which 47 revealed a particular higher expression as compared to the other analyzed tissue types, thus termed pancreas enriched. Extended analysis of in vitro isolated endocrine islets identified an additional set of 42 genes with elevated expression in these specialized cells. Although only 0.7% of all genes showed an elevated expression level in the pancreas, this fraction of transcripts, in most cases encoding secreted proteins, constituted 68% of the total mRNA in pancreas. This demonstrates the extreme specialization of the pancreas for production of secreted proteins. Among the elevated expression profiles, several previously not described proteins were identified, both in endocrine cells (CFC1, FAM159B, RBPJL and RGS9) and exocrine glandular cells (AQP12A, DPEP1, GATM and ERP27). In summary, we provide a global analysis of the pancreas transcriptome and proteome with a comprehensive list of genes and proteins with elevated expression in pancreas. This list represents an important starting point for further studies of the molecular repertoire of pancreatic cells and their relation to disease states or treatment effects. PMID:25546435
Zeng, Tao; Zhang, Liping; Li, Jinjun; Wang, Deqian; Tian, Yong; Lu, Lizhi
2015-05-01
High temperature is a major abiotic stress limiting animal growth and productivity worldwide. The Muscovy duck (Cairina moschata), sometimes called the Barbary drake, is a type of duck with a fairly unusual domestication history. In Southeast Asia, duck meat is one of the top meats consumed, and as such, the production of the meat is an important topic of research. The transcriptomic and genomic data presently available are insufficient for understanding the molecular mechanism underlying the heat tolerance of Muscovy ducks. Thus, transcriptome and expression profiling data for this species are required as an important resource for identifying genes and developing molecular markers. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. More than 225 million clean reads were generated and assembled into 36,903 unique transcripts with an average length of 1,135 bp. A total of 21,221 (57.50 %) unigenes were annotated. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with transcription, signal transduction, and apoptosis. We also performed gene expression profiling analysis upon heat treatment in Muscovy ducks and identified 470 heat-responsive unique transcripts. GO term enrichment showed that protein folding and chaperone binding were significantly enriched, whereas KEGG pathway analyses showed that Ras and MAPK pathways were activated after heat stress in Muscovy ducks. Our research enriches the sequence information available for the Muscovy duck, provides novel insights into responses to heat stress in these ducks, and yields candidate genes or markers that can be used to guide future efforts to breed heat-tolerant duck strains.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ellinger-Ziegelbauer, Heidrun; Adler, Melanie; Amberg, Alexander
2011-04-15
The InnoMed PredTox consortium was formed to evaluate whether conventional preclinical safety assessment can be significantly enhanced by incorporation of molecular profiling ('omics') technologies. In short-term toxicological studies in rats, transcriptomics, proteomics and metabolomics data were collected and analyzed in relation to routine clinical chemistry and histopathology. Four of the sixteen hepato- and/or nephrotoxicants given to rats for 1, 3, or 14 days at two dose levels induced similar histopathological effects. These were characterized by bile duct necrosis and hyperplasia and/or increased bilirubin and cholestasis, in addition to hepatocyte necrosis and regeneration, hepatocyte hypertrophy, and hepatic inflammation. Combined analysis of liver transcriptomics data from these studies revealed common gene expression changes which allowed the development of a potential sequence of events on a mechanistic level in accordance with classical endpoint observations. This included genes implicated in early stress responses, regenerative processes, inflammation with inflammatory cell immigration, fibrotic processes, and cholestasis encompassing deregulation of certain membrane transporters. Furthermore, a preliminary classification analysis using transcriptomics data suggested that prediction of cholestasis may be possible based on gene expression changes seen at earlier time-points. Targeted bile acid analysis, based on LC-MS metabonomics data demonstrating increased levels of conjugated or unconjugated bile acids in response to individual compounds, did not provide earlier detection of toxicity as compared to conventional parameters, but may allow distinction of different types of hepatobiliary toxicity. Overall, liver transcriptomics data delivered mechanistic and molecular details in addition to the classical endpoint observations which were further enhanced by targeted bile acid analysis using LC/MS metabonomics.
Nejat, Naghmeh; Cahill, David M; Vadamalai, Ganesan; Ziemann, Mark; Rookes, James; Naderali, Neda
2015-10-01
Invasive phytoplasmas wreak havoc on coconut palms worldwide, leading to high loss of income, food insecurity and extreme poverty of farmers in producing countries. Phytoplasmas, as strictly biotrophic insect-transmitted bacterial pathogens, instigate distinct changes in developmental processes and defence responses of the infected plants and manipulate plants to their own advantage; however, little is known about the cellular and molecular mechanisms underlying host-phytoplasma interactions. Further, phytoplasma-mediated transcriptional alterations in coconut palm genes have not yet been identified. This study evaluated the whole transcriptome profiles of naturally infected leaves of Cocos nucifera ecotype Malayan Red Dwarf in response to yellow decline phytoplasma from group 16SrXIV, using the RNA-Seq technique. The transcriptomics-based analysis reported here identified genes involved in coconut innate immunity. The number of genes down-regulated in response to phytoplasma infection exceeded the number of genes up-regulated. Of the 39,873 differentially expressed unigenes, 21,860 unigenes were suppressed and 18,013 were induced following infection. Comparative analysis revealed that genes associated with defence signalling against biotic stimuli were significantly overexpressed in phytoplasma-infected leaves versus healthy coconut leaves. Genes involved in cell rescue and defence, cellular transport, oxidative stress, hormone stimulus and metabolism, photosynthesis reduction, transcription and biosynthesis of secondary metabolites were differentially represented. Our transcriptome analysis unveiled a core set of genes associated with defence of coconut in response to phytoplasma attack, and several novel defence response candidate genes with unknown function were also identified. This study provides a valuable sequence resource for uncovering the resistance genes and/or susceptibility genes which can be used as genetic tools in disease resistance breeding.
Artico, Sinara; Ribeiro-Alves, Marcelo; Oliveira-Neto, Osmundo Brilhante; de Macedo, Leonardo Lima Pepino; Silveira, Sylvia; Grossi-de-Sa, Maria Fátima; Martinelli, Adriana Pinheiro; Alves-Ferreira, Marcio
2014-10-04
Cotton is a major fibre crop grown worldwide that suffers extensive damage from chewing insects, including the cotton boll weevil larvae (Anthonomus grandis). Transcriptome analysis was performed to understand the molecular interactions between Gossypium hirsutum L. and cotton boll weevil larvae. The Illumina HiSeq 2000 platform was used to sequence the transcriptome of cotton flower buds infested with boll weevil larvae. The analysis generated a total of 327,489,418 sequence reads that were aligned to the G. hirsutum reference transcriptome. The total number of expressed genes was over 21,697 per sample with an average length of 1,063 bp. The DEGseq analysis identified 443 differentially expressed genes (DEG) in cotton flower buds infected with boll weevil larvae. Among them, 402 (90.7%) were up-regulated, 41 (9.3%) were down-regulated and 432 (97.5%) were identified as orthologues of A. thaliana genes using Blastx. Mapman analysis of DEG indicated that many genes were involved in the biotic stress response spanning a range of functions, from a gene encoding a receptor-like kinase to genes involved in triggering defensive responses such as MAPK, transcription factors (WRKY and ERF) and signalling by ethylene (ET) and jasmonic acid (JA) hormones. Furthermore, the spatial expression pattern of 32 of the genes responsive to boll weevil larvae feeding was determined by "in situ" qPCR analysis from RNA isolated from two flower structures, the stamen and the carpel, by laser microdissection (LMD). A large number of cotton transcripts were significantly altered upon infestation by larvae. Among the changes in gene expression, we highlighted the transcription of receptors/sensors that recognise chitin or insect oral secretions; the altered regulation of transcripts encoding enzymes related to kinase cascades, transcription factors, Ca2+ influxes, and reactive oxygen species; and the modulation of transcripts encoding enzymes from phytohormone signalling pathways. 
These data will aid in the selection of target genes to genetically engineer cotton to control the cotton boll weevil.
Marisch, Karoline; Bayer, Karl; Scharl, Theresa; Mairhofer, Juergen; Krempl, Peter M.; Hummel, Karin; Razzazi-Fazeli, Ebrahim; Striedner, Gerald
2013-01-01
Escherichia coli K–12 and B strains are among the most frequently used bacterial hosts for production of recombinant proteins on an industrial scale. To improve existing processes and to accelerate bioprocess development, we performed a detailed host analysis. We investigated the different behaviors of the E. coli production strains BL21, RV308, and HMS174 in response to high-glucose concentrations. Tightly controlled cultivations were conducted under defined environmental conditions for the in-depth analysis of physiological behavior. In addition to acquisition of standard process parameters, we also used DNA microarray analysis and differential gel electrophoresis (Ettan™ DIGE). Batch cultivations showed different yields of the distinct strains for cell dry mass and growth rate, which were highest for BL21. In addition, production of acetate, triggered by excess glucose supply, was much higher for the K–12 strains compared to the B strain. Analysis of transcriptome data showed significant alteration in 347 of 3882 genes common among all three hosts. These differentially expressed genes included, for example, those involved in transport, iron acquisition, and motility. The investigation of proteome patterns additionally revealed a high number of differentially expressed proteins among the investigated hosts. The subsequently selected 38 spots included proteins involved in transport and motility. The results of this comprehensive analysis delivered a full genomic picture of the three investigated strains. Differentially expressed groups for targeted host modification were identified like glucose transport or iron acquisition, enabling potential optimization of strains to improve yield and process quality. Dissimilar growth profiles of the strains confirm different genotypes. Furthermore, distinct transcriptome patterns support differential regulation at the genome level. The identified proteins showed high agreement with the transcriptome data and suggest similar regulation within a host at both levels for the identified groups. Such host attributes need to be considered in future process design and operation. PMID:23950949
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dimopoulou, Myrto
Differential gene expression analysis in the rat whole embryo culture (WEC) assay provides mechanistic insight into the embryotoxicity of test compounds. In our study, we hypothesized that comparative analysis of the transcriptomes of rat embryos exposed to six azoles (flusilazole, triadimefon, ketoconazole, miconazole, difenoconazole and prothioconazole) could lead to a better mechanism-based understanding of their embryotoxicity and pharmacological action. For evaluating embryotoxicity, we applied the total morphological scoring system (TMS) in embryos exposed for 48 h. The compounds tested showed embryotoxicity in a dose-response fashion. Functional analysis of differential gene expression after 4 h exposure at the ID{sub 10} (effectivemore » dose for 10% decreased TMS), revealed the sterol biosynthesis pathway and embryonic development genes, dominated by genes in the retinoic acid (RA) pathway, albeit in a differential way. Flusilazole, ketoconazole and triadimefon were the most potent compounds affecting the RA pathway, while in terms of regulation of sterol function, difenoconazole and ketoconazole showed the most pronounced effects. Dose-dependent analysis of the effects of flusilazole revealed that the RA pathway related genes were already differentially expressed at low dose levels while the sterol pathway showed strong regulation at higher embryotoxic doses, suggesting that this pathway is less predictive for the observed embryotoxicity. A similar analysis at the 24-hour time point indicated an additional time-dependent difference in the aforementioned pathways regulated by flusilazole. In summary, the rat WEC assay in combination with transcriptomics could add a mechanistic insight into the embryotoxic potency ranking and pharmacological mode of action of the tested compounds. - Highlights: • Embryonic exposure to azoles revealed concentration-dependent malformations. • Transcriptomics could enhance the mechanistic knowledge of embryotoxicants. 
• Retinoic acid gene set identifies early embryotoxic responses to azoles. • Toxic versus pharmacologic potency determines functional efficacy.
Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W.; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G.
2013-01-01
Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus. 
PMID:23671567
Comparative analyses of two Geraniaceae transcriptomes using next-generation sequencing.
Zhang, Jin; Ruhlman, Tracey A; Mower, Jeffrey P; Jansen, Robert K
2013-12-29
Organelle genomes of Geraniaceae exhibit several unusual evolutionary phenomena compared to other angiosperm families including accelerated nucleotide substitution rates, widespread gene loss, reduced RNA editing, and extensive genomic rearrangements. Since most organelle-encoded proteins function in multi-subunit complexes that also contain nuclear-encoded proteins, it is likely that the atypical organellar phenomena affect the evolution of nuclear genes encoding organellar proteins. To begin to unravel the complex co-evolutionary interplay between organellar and nuclear genomes in this family, we sequenced nuclear transcriptomes of two species, Geranium maderense and Pelargonium x hortorum. Normalized cDNA libraries of G. maderense and P. x hortorum were used for transcriptome sequencing. Five assemblers (MIRA, Newbler, SOAPdenovo, SOAPdenovo-trans [SOAPtrans], Trinity) and two next-generation technologies (454 and Illumina) were compared to determine the optimal transcriptome sequencing approach. Trinity provided the highest quality assembly of Illumina data with the deepest transcriptome coverage. An analysis to determine the amount of sequencing needed for de novo assembly revealed diminishing returns of coverage and quality with data sets larger than sixty million Illumina paired end reads for both species. The G. maderense and P. x hortorum transcriptomes contained fewer transcripts encoding the PLS subclass of PPR proteins relative to other angiosperms, consistent with reduced mitochondrial RNA editing activity in Geraniaceae. In addition, transcripts for all six plastid targeted sigma factors were identified in both transcriptomes, suggesting that one of the highly divergent rpoA-like ORFs in the P. x hortorum plastid genome is functional. The findings support the use of the Illumina platform and assemblers optimized for transcriptome assembly, such as Trinity or SOAPtrans, to generate high-quality de novo transcriptomes with broad coverage. 
In addition, results indicated no major improvements in breadth of coverage with data sets larger than six billion nucleotides or when sampling RNA from four tissue types rather than from a single tissue. Finally, this work demonstrates the power of cross-compartmental genomic analyses to deepen our understanding of the correlated evolution of the nuclear, plastid, and mitochondrial genomes in plants.
Comparative analyses of two Geraniaceae transcriptomes using next-generation sequencing
2013-01-01
Background Organelle genomes of Geraniaceae exhibit several unusual evolutionary phenomena compared to other angiosperm families including accelerated nucleotide substitution rates, widespread gene loss, reduced RNA editing, and extensive genomic rearrangements. Since most organelle-encoded proteins function in multi-subunit complexes that also contain nuclear-encoded proteins, it is likely that the atypical organellar phenomena affect the evolution of nuclear genes encoding organellar proteins. To begin to unravel the complex co-evolutionary interplay between organellar and nuclear genomes in this family, we sequenced nuclear transcriptomes of two species, Geranium maderense and Pelargonium x hortorum. Results Normalized cDNA libraries of G. maderense and P. x hortorum were used for transcriptome sequencing. Five assemblers (MIRA, Newbler, SOAPdenovo, SOAPdenovo-trans [SOAPtrans], Trinity) and two next-generation technologies (454 and Illumina) were compared to determine the optimal transcriptome sequencing approach. Trinity provided the highest quality assembly of Illumina data with the deepest transcriptome coverage. An analysis to determine the amount of sequencing needed for de novo assembly revealed diminishing returns of coverage and quality with data sets larger than sixty million Illumina paired end reads for both species. The G. maderense and P. x hortorum transcriptomes contained fewer transcripts encoding the PLS subclass of PPR proteins relative to other angiosperms, consistent with reduced mitochondrial RNA editing activity in Geraniaceae. In addition, transcripts for all six plastid targeted sigma factors were identified in both transcriptomes, suggesting that one of the highly divergent rpoA-like ORFs in the P. x hortorum plastid genome is functional. 
Conclusions The findings support the use of the Illumina platform and assemblers optimized for transcriptome assembly, such as Trinity or SOAPtrans, to generate high-quality de novo transcriptomes with broad coverage. In addition, results indicated no major improvements in breadth of coverage with data sets larger than six billion nucleotides or when sampling RNA from four tissue types rather than from a single tissue. Finally, this work demonstrates the power of cross-compartmental genomic analyses to deepen our understanding of the correlated evolution of the nuclear, plastid, and mitochondrial genomes in plants. PMID:24373163
Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.
Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M
2010-12-15
Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.
NASA Astrophysics Data System (ADS)
Grange, Pascal
2015-09-01
The Allen Brain Atlas of the adult mouse (ABA) consists of digitized expression profiles of thousands of genes in the mouse brain, co-registered to a common three-dimensional template (the Allen Reference Atlas). This brain-wide, genome-wide data set has triggered a renaissance in neuroanatomy. Its voxelized version (with cubic voxels of side 200 microns) is available for desktop computation in MATLAB. On the other hand, brain cells exhibit a great phenotypic diversity (in terms of size, shape and electrophysiological activity), which has inspired the names of some well-studied cell types, such as granule cells and medium spiny neurons. However, no exhaustive taxonomy of brain cells is available. A genetic classification of brain cells is being undertaken, and some cell types have been characterized by their transcriptome profiles. However, given a cell type characterized by its transcriptome, it is not clear where else in the brain similar cells can be found. The ABA can be used to solve this region-specificity problem in a data-driven way: rewriting the brain-wide expression profiles of all genes in the atlas as a sum of cell-type-specific transcriptome profiles is equivalent to solving a quadratic optimization problem at each voxel in the brain. However, the estimated brain-wide densities of 64 cell types published recently were based on one series of co-registered coronal in situ hybridization (ISH) images per gene, whereas the online ABA contains several image series per gene, including sagittal ones. In the presented work, we simulate the variability of cell-type densities in a Monte Carlo way by repeatedly drawing a random image series for each gene and solving the optimization problem. This yields error bars on the region-specificity of cell types.
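The per-voxel fit and the Monte Carlo resampling described in this abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the toy profiles, the replicate values, and the projected gradient descent used in place of a dedicated quadratic-programming solver are all assumptions made for illustration, and the step size is tuned to the toy data only.

```python
import random
from statistics import mean, stdev


def fit_densities(expr, profiles, steps=200, lr=0.1):
    """Minimise sum_g (expr[g] - sum_t profiles[t][g] * rho[t])**2 with rho >= 0.

    Projected gradient descent is used as a simple stand-in for a QP solver.
    expr: expression of each gene at one voxel.
    profiles: one transcriptome profile (list over genes) per cell type.
    """
    n_types = len(profiles)
    rho = [0.0] * n_types
    for _ in range(steps):
        # Residual of the current fit, computed once per step for all genes.
        resid = [sum(profiles[t][g] * rho[t] for t in range(n_types)) - e
                 for g, e in enumerate(expr)]
        for t in range(n_types):
            grad = 2.0 * sum(r * profiles[t][g] for g, r in enumerate(resid))
            # Gradient step followed by projection onto rho >= 0.
            rho[t] = max(0.0, rho[t] - lr * grad)
    return rho


def monte_carlo_densities(series, profiles, draws=100, seed=0):
    """Draw one image series per gene at random, refit, and summarise.

    series: for each gene, the expression values from its available image series.
    Returns one (mean, standard deviation) pair per cell type.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        expr = [rng.choice(options) for options in series]
        samples.append(fit_densities(expr, profiles))
    return [(mean(s[t] for s in samples), stdev(s[t] for s in samples))
            for t in range(len(profiles))]


# Hypothetical toy data: 2 cell types, 3 genes, 3 image series per gene.
toy_profiles = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
toy_series = [[2.0, 2.2, 1.8], [1.0, 0.9, 1.1], [3.0, 3.1, 2.9]]

exact_fit = fit_densities([2.0, 1.0, 3.0], toy_profiles)
error_bars = monte_carlo_densities(toy_series, toy_profiles)
```

The standard deviations in `error_bars` play the role of the error bars mentioned in the abstract: they measure how much each estimated density moves when a different image series is drawn for each gene.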
The Air Force In Silico -- Computational Biology in 2025
2007-11-01
and chromosome) these new fields are commonly referred to as “~omics.” Proteomics, transcriptomics, metabolomics, epigenomics, physiomics... Bioinformatics, 2006, http://journal.imbio.de/ http://www-bm.ipk-gatersleben.de/stable/php/journal/articles/pdf/jib-22.pdf (accessed 30 September...Chirino, G. Tansley and I. Dryden, “The implications for Bioinformatics of integration across physical scales,” Journal of Integrative Bioinformatics
Molecular characteristics of the KCNJ5 mutated aldosterone-producing adenomas.
Murakami, Masanori; Yoshimoto, Takanobu; Nakabayashi, Kazuhiko; Nakano, Yujiro; Fukaishi, Takahiro; Tsuchiya, Kyoichiro; Minami, Isao; Bouchi, Ryotaro; Okamura, Kohji; Fujii, Yasuhisa; Hashimoto, Koshi; Hata, Ken-Ichiro; Kihara, Kazunori; Ogawa, Yoshihiro
2017-10-01
The pathophysiology of aldosterone-producing adenomas (APAs) has been investigated via genetic approaches and the pathogenic significance of a series of somatic mutations, including KCNJ5, has been uncovered. However, how the mutational status of an APA is associated with its molecular characteristics, including its transcriptome and methylome, has not been fully understood. This study was undertaken to explore the molecular characteristics of APAs, specifically focusing on APAs with KCNJ5 mutations as opposed to those without KCNJ5 mutations, by comparing their transcriptome and methylome status. Cortisol-producing adenomas (CPAs) were used as reference. We conducted transcriptome and methylome analyses of 29 APAs with KCNJ5 mutations, 8 APAs without KCNJ5 mutations and 5 CPAs. Genome-wide gene expression and CpG methylation profiles were obtained from RNA and DNA samples extracted from these 42 adrenal tumors. Cluster analysis of the transcriptome and methylome revealed molecular heterogeneity in APAs depending on their mutational status. DNA hypomethylation and gene expression changes in Wnt signaling and inflammatory response pathways were characteristic of APAs with KCNJ5 mutations. Comparisons between transcriptome data from our APAs and that from normal adrenal cortex obtained from the Gene Expression Omnibus suggested similarities between APAs with KCNJ5 mutations and zona glomerulosa. The present study, which is based on transcriptome and methylome analyses, indicates the molecular heterogeneity of APAs depends on their mutational status. Here, we report the unique characteristics of APAs with KCNJ5 mutations. © 2017 Society for Endocrinology.
Biologic Phenotyping of the Human Small Airway Epithelial Response to Cigarette Smoking
Tilley, Ann E.; O'Connor, Timothy P.; Hackett, Neil R.; Strulovici-Barel, Yael; Salit, Jacqueline; Amoroso, Nancy; Zhou, Xi Kathy; Raman, Tina; Omberg, Larsson; Clark, Andrew; Mezey, Jason; Crystal, Ronald G.
2011-01-01
Background The first changes associated with smoking are in the small airway epithelium (SAE). Given that smoking alters SAE gene expression, but only a fraction of smokers develop chronic obstructive pulmonary disease (COPD), we hypothesized that assessment of SAE genome-wide gene expression would permit biologic phenotyping of the smoking response, and that a subset of healthy smokers would have a “COPD-like” SAE transcriptome. Methodology/Principal Findings SAE (10th–12th generation) was obtained via bronchoscopy of healthy nonsmokers, healthy smokers and COPD smokers and microarray analysis was used to identify differentially expressed genes. Individual responsiveness to smoking was quantified with an index representing the % of smoking-responsive genes abnormally expressed (ISAE), with healthy smokers grouped into “high” and “low” responders based on the proportion of smoking-responsive genes up- or down-regulated in each smoker. Smokers demonstrated significant variability in SAE transcriptome with ISAE ranging from 2.9 to 51.5%. While the SAE transcriptome of “low” responder healthy smokers differed from both “high” responders and smokers with COPD, the transcriptome of the “high” responder healthy smokers was indistinguishable from COPD smokers. Conclusion/Significance The SAE transcriptome can be used to classify clinically healthy smokers into subgroups with lesser and greater responses to cigarette smoking, even though these subgroups are indistinguishable by clinical criteria. This identifies a group of smokers with a “COPD-like” SAE transcriptome. PMID:21829517
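The responsiveness index in this abstract (the percentage of smoking-responsive genes abnormally expressed in one smoker) can be sketched as follows. The abstract does not say how "abnormal" is defined, so this sketch assumes a gene is abnormal when its expression lies more than two standard deviations from the nonsmoker mean. That threshold, the function name, and all values below are hypothetical.

```python
from statistics import mean, stdev


def isae_index(smoker_expr, nonsmoker_exprs, n_sd=2.0):
    """Percent of smoking-responsive genes expressed outside the nonsmoker range.

    smoker_expr: gene -> expression level in one smoker.
    nonsmoker_exprs: gene -> expression levels across healthy nonsmokers.
    n_sd: assumed cut-off, in nonsmoker standard deviations.
    """
    abnormal = 0
    for gene, value in smoker_expr.items():
        reference = nonsmoker_exprs[gene]
        mu, sd = mean(reference), stdev(reference)
        if abs(value - mu) > n_sd * sd:
            abnormal += 1
    return 100.0 * abnormal / len(smoker_expr)


# Invented example: four smoking-responsive genes, three nonsmokers.
nonsmokers = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [9.5, 10.0, 10.5],
    "g3": [4.8, 5.0, 5.2],
    "g4": [7.0, 8.0, 9.0],
}
smoker = {"g1": 5.0, "g2": 10.2, "g3": 4.0, "g4": 8.5}

index = isae_index(smoker, nonsmokers)  # g1 and g3 fall outside the range
```

Grouping smokers into "high" and "low" responders would then be a matter of thresholding this index, with the threshold itself a further choice the abstract does not specify.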
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schwender, Jorg; Konig, Christina; Klapperstuck, Matthias
An attempt has been made to define the extent to which metabolic flux in central plant metabolism is reflected by changes in the transcriptome and metabolome, based on an analysis of in vitro cultured immature embryos of two oilseed rape (Brassica napus) accessions which contrast for seed lipid accumulation. Metabolic flux analysis (MFA) was used to constrain a flux balance metabolic model which included 671 biochemical and transport reactions within the central metabolism. This highly confident flux information was eventually used for comparative analysis of flux vs. transcript (metabolite). Metabolite profiling succeeded in identifying 79 intermediates within the central metabolism, some of which differed quantitatively between the two accessions and displayed a significant shift corresponding to flux. An RNA-Seq based transcriptome analysis revealed a large number of genes which were differentially transcribed in the two accessions, including some enzymes/proteins active in major metabolic pathways. With a few exceptions, differential activity in the major pathways (glycolysis, TCA cycle, amino acid, and fatty acid synthesis) was not reflected in contrasting abundances of the relevant transcripts. The conclusion was that transcript abundance on its own cannot be used to infer metabolic activity/fluxes in central plant metabolism. Lastly, this limitation needs to be borne in mind in evaluating transcriptome data and designing metabolic engineering experiments.
Xu, Jinhua; Zhang, Man; Liu, Guang; Yang, Xingping; Hou, Xilin
2016-12-01
Rootstock grafting may improve the resistance of watermelon plants to low temperatures. However, information regarding the molecular responses of rootstock grafted plants to chilling stress is limited. To elucidate the molecular mechanisms of chilling tolerance in grafted plants, the transcriptomic responses of grafted watermelon under chilling stress were analyzed using RNA-seq analysis. Sequencing data were used for digital gene expression (DGE) analysis to characterize the transcriptomic responses in grafted watermelon seedlings. A total of 702 differentially-expressed genes (DEGs) were found in rootstock grafted (RG) watermelon relative to self-grafted (SG) watermelon; among these genes, 522 genes were up-regulated and 180 were down-regulated. Additionally, 164 and 953 genes were found to be specifically expressed in RG and SG seedlings under chilling stress, respectively. Functional annotations revealed that up-regulated DEGs are involved in protein processing, plant-pathogen interaction and the spliceosome, whereas down-regulated DEGs are associated with photosynthesis. Moreover, 13 DEGs were randomly selected for quantitative real time PCR (qRT-PCR) analysis. The expression profiles of these 13 DEGs were consistent with those detected by the DGE analysis, supporting the reliability of the DGE data. This work provides additional insight into the molecular basis of grafted watermelon responses to chilling stress. Copyright © 2016. Published by Elsevier Masson SAS.
Peng, Chuanhua; Wang, Xiaoping; Li, Fei; Lin, Yongjun
2012-01-01
The rice stem borer, Chilo suppressalis (Walker) (Lepidoptera: Pyralidae), is one of the most detrimental pests affecting rice crops. The use of Bacillus thuringiensis (Bt) toxins has been explored as a means to control this pest, but the potential for C. suppressalis to develop resistance to Bt toxins makes this approach problematic. Few C. suppressalis gene sequences are known, which makes in-depth study of gene function difficult. Herein, we sequenced the midgut transcriptome of the rice stem borer. In total, 37,040 contigs were obtained, with a mean size of 497 bp. As expected, the transcripts of C. suppressalis shared high similarity with arthropod genes. Gene ontology and KEGG analysis were used to classify the gene functions in C. suppressalis. Using the midgut transcriptome data, we conducted a proteome analysis to identify proteins expressed abundantly in the brush border membrane vesicles (BBMV). Of the 100 most abundant proteins that were excised and subjected to mass spectrometry analysis, 74 share high similarity with known proteins. Among these proteins, Western blot analysis showed that aminopeptidase N and EH domain-containing protein bind the Bt toxin Cry1Ac. These data provide invaluable information about the gene sequences of C. suppressalis and the proteins that bind Cry1Ac. PMID:22666467
2010-01-01
Background Identification of genes with invariant levels of gene expression is a prerequisite for validating transcriptomic changes accompanying development. Ideally, expression of these genes should be independent of the morphogenetic process or environmental condition tested as well as the methods used for RNA purification and analysis. Results In an effort to identify endogenous genes meeting these criteria, nine reference genes (RG) were tested in two Petunia lines (Mitchell and V30). Growth conditions differed in Mitchell and V30, and different methods were used for RNA isolation and analysis. Four different software tools were employed to analyze the data. We merged the four outputs by means of a non-weighted unsupervised rank aggregation method. The genes identified as optimal for transcriptomic analysis of Mitchell and V30 were EF1α in Mitchell and CYP in V30, whereas the least suitable gene was GAPDH in both lines. Conclusions The least adequate gene turned out to be GAPDH, indicating that it should be rejected as a reference gene in Petunia. The absence of correspondence of the best-suited genes suggests that assessing reference gene stability is needed when performing normalization of data from transcriptomic analysis of flower and leaf development. PMID:20056000
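Merging the outputs of several stability tools into one consensus ranking can be illustrated with a short Python sketch. The abstract does not name the aggregation algorithm, so an unweighted mean-rank (Borda-style) scheme is used here as a stand-in. The function name, the gene label UBQ, and the four example rankings are invented for illustration and are not data from the study.

```python
from statistics import mean


def aggregate_ranks(rankings):
    """Combine best-to-worst rankings by unweighted mean rank (Borda-style)."""
    genes = set(rankings[0])
    if any(set(r) != genes for r in rankings):
        raise ValueError("every ranking must cover the same genes")
    # Position 1 is the most stable gene in each tool's output.
    mean_rank = {g: mean(r.index(g) + 1 for r in rankings) for g in genes}
    # Sort by mean rank; ties are broken alphabetically for determinism.
    return sorted(genes, key=lambda g: (mean_rank[g], g))


# Hypothetical outputs of four stability tools (most stable gene first).
tool_rankings = [
    ["EF1a", "CYP", "UBQ", "GAPDH"],
    ["CYP", "EF1a", "UBQ", "GAPDH"],
    ["EF1a", "UBQ", "CYP", "GAPDH"],
    ["EF1a", "CYP", "GAPDH", "UBQ"],
]

consensus = aggregate_ranks(tool_rankings)
```

Because every tool contributes equally and no reference ordering is supplied, the scheme is both non-weighted and unsupervised in the sense used in the abstract, although the published method may differ in detail.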
Chatterjee, Shatakshee; Verma, Srikant Prasad; Pandey, Priyanka
2017-09-05
Initiation and progression of fluid filled cysts mark Autosomal Dominant Polycystic Kidney Disease (ADPKD). Thus, improved therapeutics targeting cystogenesis remains a constant challenge. Microarray studies in single ADPKD animal model species with limited sample sizes tend to provide scattered views on underlying ADPKD pathogenesis. Thus we aim to perform a cross species meta-analysis to profile conserved biological pathways that might be key targets for therapy. Nine ADPKD microarray datasets from rat, mouse and human fulfilled our study criteria and were chosen. Intra-species combined analysis was performed after considering removal of batch effect. Significantly enriched GO biological processes and KEGG pathways were computed and their overlap was observed. For the conserved pathways, biological modules and gene regulatory networks were observed. Additionally, Gene Set Enrichment Analysis (GSEA) using Molecular Signature Database (MSigDB) was performed for genes found in conserved pathways. We obtained 28 modules of significantly enriched GO processes and 5 major functional categories from significantly enriched KEGG pathways conserved in human, mice and rats that in turn suggest a global transcriptomic perturbation affecting cyst formation, growth and progression. Significantly enriched pathways obtained from up-regulated genes such as Genomic instability, Protein localization in ER and Insulin Resistance were found to regulate cyst formation and growth whereas cyst progression due to increased cell adhesion and inflammation was suggested by perturbations in Angiogenesis, TGF-beta, CAMs, and Infection related pathways. Additionally, networks revealed shared genes among pathways e.g. SMAD2 and SMAD7 in Endocytosis and TGF-beta. Our study suggests cyst formation and progression to be an outcome of interplay between a set of several key deregulated pathways.
Thus, further translational research is warranted focusing on developing a combinatorial therapeutic approach for ADPKD redressal. Copyright © 2017 Elsevier B.V. All rights reserved.
2011-01-01
Background Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations Fagopyrum species have not been the subject of large-scale sequencing projects. Results Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 thousand (F. esculentum) and 229 thousand (F. tataricum) reads with an average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousand contigs for each species, with 7.5-8.2× coverage. Comparative analysis of the two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationship of the Caryophyllales and asterids has now gained high support from nuclear gene sequences. Conclusions 454 transcriptome sequencing and de novo assembly were performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141
van der Meulen, Sjoerd B; de Jong, Anne; Kok, Jan
2016-01-01
RNA sequencing has revolutionized genome-wide transcriptome analyses, and the identification of non-coding regulatory RNAs in bacteria has thus increased concurrently. Here we reveal the transcriptome map of the lactic acid bacterial paradigm Lactococcus lactis MG1363 by employing differential RNA sequencing (dRNA-seq) and a combination of manual and automated transcriptome mining. This resulted in a high-resolution genome annotation of L. lactis and the identification of 60 cis-encoded antisense RNAs (asRNAs), 186 trans-encoded putative regulatory RNAs (sRNAs) and 134 novel small ORFs. Based on the putative targets of asRNAs, a novel classification is proposed. Several transcription factor DNA binding motifs were identified in the promoter sequences of (a)sRNAs, providing insight into the interplay between lactococcal regulatory RNAs and transcription factors. The presence and lengths of 14 putative sRNAs were experimentally confirmed by differential Northern hybridization, including the abundant RNA 6S that is differentially expressed depending on the available carbon source. For another sRNA, LLMGnc_147, functional analysis revealed that it is involved in carbon uptake and metabolism. L. lactis contains 13% leaderless mRNAs (lmRNAs) that, from an analysis of overrepresentation in GO classes, seem predominantly involved in nucleotide metabolism and DNA/RNA binding. Moreover, an A-rich sequence motif immediately following the start codon was uncovered, which could provide novel insight into the translation of lmRNAs. Altogether, this first experimental genome-wide assessment of the transcriptome landscape of L. lactis and subsequent sRNA studies provide an extensive basis for the investigation of regulatory RNAs in L. lactis and related lactococcal species.
Transcriptome and proteomic analysis of mango (Mangifera indica Linn) fruits.
Wu, Hong-xia; Jia, Hui-min; Ma, Xiao-wei; Wang, Song-biao; Yao, Quan-sheng; Xu, Wen-tian; Zhou, Yi-gang; Gao, Zhong-shan; Zhan, Ru-lin
2014-06-13
Here we used Illumina RNA-seq technology for transcriptome sequencing of a mixed fruit sample from 'Zill' mango (Mangifera indica Linn) fruit pericarp and pulp during the development and ripening stages. RNA-seq generated 68,419,722 sequence reads that were assembled into 54,207 transcripts with a mean length of 858 bp, including 26,413 clusters and 27,794 singletons. A total of 42,515 (78.43%) transcripts were annotated using public protein databases, with an E-value cut-off of 10^-5, of which 35,198 and 14,619 transcripts were assigned to gene ontology terms and clusters of orthologous groups respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes database identified 23,741 (43.79%) transcripts which were mapped to 128 pathways. These pathways revealed many previously unknown transcripts. We also used the transcriptome data in a mass spectrometry-based characterization of the proteome of ripe fruit. LC-MS/MS analysis of the mango fruit proteome was performed using tandem mass spectrometry (MS/MS) in an LTQ Orbitrap Velos (Thermo) coupled online to the HPLC. This approach enabled the identification of 7536 peptides that matched 2754 proteins. Our study provides a comprehensive sequence for a systemic view of the transcriptome during mango fruit development and the most comprehensive fruit proteome to date, which are useful for further genomics research and proteomic studies. Our study provides a comprehensive sequence for a systemic view of both the transcriptome and proteome of mango fruit, and a valuable reference for further research on gene expression and protein identification. This article is part of a Special Issue entitled: Proteomics of non-model organisms. Copyright © 2014 Elsevier B.V. All rights reserved.
De novo Assembly of Leaf Transcriptome in the Medicinal Plant Andrographis paniculata
Cherukupalli, Neeraja; Divate, Mayur; Mittapelli, Suresh R.; Khareedu, Venkateswara R.; Vudem, Dashavantha R.
2016-01-01
Andrographis paniculata is an important medicinal plant containing various bioactive terpenoids and flavonoids. Despite its importance in herbal medicine, no ready-to-use transcript sequence information for this plant is available in public databases. This study deals mainly with the sequencing of RNA from A. paniculata leaf using the Illumina HiSeq™ 2000 platform followed by de novo transcriptome assembly. A total of 189.22 million high quality paired reads were generated and 170,724 transcripts were predicted in the primary assembly. Secondary assembly generated a transcriptome size of ~88 Mb with 83,800 clustered transcripts. Based on the similarity searches against plant non-redundant protein database, gene ontology, and eukaryotic orthologous groups, 49,363 transcripts were annotated, constituting up to 58.91% of the identified unigenes. Annotation of transcripts using the Kyoto Encyclopedia of Genes and Genomes database revealed 5606 transcripts plausibly involved in 140 pathways including biosynthesis of terpenoids and other secondary metabolites. Transcription factor analysis showed 6767 unique transcripts belonging to 97 different transcription factor families. A total of 124 CYP450 transcripts belonging to seven divergent clans have been identified. The transcriptome revealed 146 different transcripts coding for enzymes involved in the biosynthesis of terpenoids, of which 35 contained terpene synthase motifs. This study also revealed 32,341 simple sequence repeats (SSRs) in 23,168 transcripts. Assembled sequences of the transcriptome of A. paniculata generated in this study are made available, for the first time, in the TSA database, which provides useful information for functional and comparative genomic analysis besides identification of key enzymes involved in the various pathways of secondary metabolism. PMID:27582746
Genomic and transcriptomic predictors of triglyceride response to regular exercise
Sarzynski, Mark A; Davidsen, Peter K; Sung, Yun Ju; Hesselink, Matthijs K C; Schrauwen, Patrick; Rice, Treva K; Rao, D C; Falciani, Francesco; Bouchard, Claude
2015-01-01
Aim We performed genome-wide and transcriptome-wide profiling to identify genes and single nucleotide polymorphisms (SNPs) associated with the response of triglycerides (TG) to exercise training. Methods Plasma TG levels were measured before and after a 20-week endurance training programme in 478 white participants from the HERITAGE Family Study. Illumina HumanCNV370-Quad v3.0 BeadChips were genotyped using the Illumina BeadStation 500GX platform. Affymetrix HG-U133+2 arrays were used to quantitate gene expression levels from baseline muscle biopsies of a subset of participants (N=52). Genome-wide association study (GWAS) analysis was performed using MERLIN, while transcriptomic predictor models were developed using the R-package GALGO. Results The GWAS results showed that eight SNPs were associated with TG training-response (ΔTG) at p<9.9×10^-6, while another 31 SNPs showed p values <1×10^-4. In multivariate regression models, the top 10 SNPs explained 32.0% of the variance in ΔTG, while conditional heritability analysis showed that four SNPs statistically accounted for all of the heritability of ΔTG. A molecular signature based on the baseline expression of 11 genes predicted 27% of ΔTG in HERITAGE, which was validated in an independent study. A composite SNP score based on the top four SNPs, each from the genomic and transcriptomic analyses, was the strongest predictor of ΔTG (R^2=0.14, p=3.0×10^-68). Conclusions Our results indicate that skeletal muscle transcript abundance at 11 genes and SNPs at a number of loci contribute to TG response to exercise training. Combining data from genomics and transcriptomics analyses identified a SNP-based gene signature that should be further tested in independent samples. PMID:26491034
Almazan, Eugene Matthew P.; Lesko, Sydney L.; Markey, Michael P.; Rouhana, Labib
2017-01-01
Planarian flatworms are popular models for the study of regeneration and stem cell biology in vivo. Technical advances and increased availability of genetic information have fueled the discovery of molecules responsible for stem cell pluripotency and regeneration in flatworms. Unfortunately, most of the planarian research performed worldwide utilizes species that are not natural habitants of North America, which limits their availability to newcomer laboratories and impedes their distribution for educational activities. In order to circumvent these limitations and increase the genetic information available for comparative studies, we sequenced the transcriptome of Girardia dorotocephala, a planarian species pandemic and commercially available in North America. A total of 254,802,670 paired sequence reads were obtained from RNA extracted from intact individuals, regenerating fragments, as well as freshly excised auricles of a clonal line of G. dorotocephala (MA-C2), and used for de novo assembly of its transcriptome. The resulting transcriptome draft was validated through functional analysis of genetic markers of stem cells and their progeny in G. dorotocephala. Akin to orthologs in other planarian species, G. dorotocephala Piwi1 (GdPiwi1) was found to be a robust marker of the planarian stem cell population and GdPiwi2 an essential component for stem cell-driven regeneration. Identification of G. dorotocephala homologs of the early stem cell descendent marker PROG-1 revealed a family of lysine-rich proteins expressed during epithelial cell differentiation. Sequences from the MA-C2 transcriptome were found to be 98–99% identical to nucleotide sequences from G. dorotocephala populations with different chromosomal number, demonstrating strong conservation regardless of karyotype evolution. Altogether, this work establishes G. dorotocephala as a viable and accessible option for analysis of gene function in North America. PMID:28774726
Transcriptome analysis of Pinus halepensis under drought stress and during recovery
Fox, Hagar; Doron-Faigenboim, Adi; Kelly, Gilor; Bourstein, Ronny; Attia, Ziv; Zhou, Jing; Moshe, Yosef; Moshelion, Menachem; David-Schwartz, Rakefet
2018-01-01
Abstract Forest trees use various strategies to cope with drought stress and these strategies involve complex molecular mechanisms. Pinus halepensis Miller (Aleppo pine) is found throughout the Mediterranean basin and is one of the most drought-tolerant pine species. In order to decipher the molecular mechanisms that P. halepensis uses to withstand drought, we performed large-scale physiological and transcriptome analyses. We selected a mature tree from a semi-arid area with suboptimal growth conditions for clonal propagation through cuttings. We then used a high-throughput experimental system to continuously monitor whole-plant transpiration rates, stomatal conductance and the vapor pressure deficit. The transcriptomes of plants were examined at six physiological stages: pre-stomatal response, partial stomatal closure, minimum transpiration, post-irrigation, partial recovery and full recovery. At each stage, data from plants exposed to the drought treatment were compared with data collected from well-irrigated control plants. A drought-stressed P. halepensis transcriptome was created using paired-end RNA-seq. In total, ~6000 differentially expressed, non-redundant transcripts were identified between drought-treated and control trees. Cluster analysis has revealed stress-induced down-regulation of transcripts related to photosynthesis, reactive oxygen species (ROS)-scavenging through the ascorbic acid (AsA)-glutathione cycle, fatty acid and cell wall biosynthesis, stomatal activity, and the biosynthesis of flavonoids and terpenoids. Up-regulated processes included chlorophyll degradation, ROS-scavenging through AsA-independent thiol-mediated pathways, abscisic acid response and accumulation of heat shock proteins, thaumatin and exordium. Recovery from drought induced strong transcription of retrotransposons, especially the retrovirus-related transposon Tnt1-94. 
The drought-related transcriptome illustrates this species’ dynamic response to drought and recovery and unravels novel mechanisms. PMID:29177514
Lemaître, Chloé; Bidet, Philippe; Bingen, Edouard; Bonacorsi, Stéphane
2012-06-21
The sequenced O45:K1:H7 Escherichia coli meningitis strain S88 harbors a large virulence plasmid. To identify possible genetic determinants of pS88 virulence, we examined the transcriptomes of 88 plasmidic ORFs corresponding to known and putative virulence genes, and 35 ORFs of unknown function. Quantification of plasmidic transcripts was obtained by quantitative real-time reverse transcription of extracted RNA, normalized on three housekeeping genes. The transcriptomes of E. coli strain S88 grown ex vivo in human serum and urine were compared with that obtained during growth in Luria Bertani broth, with and without iron depletion. We also analyzed the transcriptome of a pS88-like plasmid recovered from a neonate with urinary tract infection. The transcriptome obtained after ex vivo growth in serum and urine was very similar to that obtained in iron-depleted LB broth. Genes encoding iron acquisition systems were strongly upregulated. ShiF and ORF 123, two ORFs encoding proteins of hypothetical function and physically linked to aerobactin and salmochelin loci, respectively, were also highly expressed in iron-depleted conditions and may correspond to ancillary iron acquisition genes. Four ORFs were induced ex vivo, independently of the iron concentration. Other putative virulence genes such as iss, etsC, ompTp and hlyF were not upregulated in any of the conditions studied. Transcriptome analysis of the pS88-like plasmid recovered in vivo showed a similar pattern of induction but at much higher levels. We identify new pS88 genes potentially involved in the growth of E. coli meningitis strain S88 in human serum and urine.
Transcriptome analysis of Pinus halepensis under drought stress and during recovery.
Fox, Hagar; Doron-Faigenboim, Adi; Kelly, Gilor; Bourstein, Ronny; Attia, Ziv; Zhou, Jing; Moshe, Yosef; Moshelion, Menachem; David-Schwartz, Rakefet
2018-03-01
Forest trees use various strategies to cope with drought stress and these strategies involve complex molecular mechanisms. Pinus halepensis Miller (Aleppo pine) is found throughout the Mediterranean basin and is one of the most drought-tolerant pine species. In order to decipher the molecular mechanisms that P. halepensis uses to withstand drought, we performed large-scale physiological and transcriptome analyses. We selected a mature tree from a semi-arid area with suboptimal growth conditions for clonal propagation through cuttings. We then used a high-throughput experimental system to continuously monitor whole-plant transpiration rates, stomatal conductance and the vapor pressure deficit. The transcriptomes of plants were examined at six physiological stages: pre-stomatal response, partial stomatal closure, minimum transpiration, post-irrigation, partial recovery and full recovery. At each stage, data from plants exposed to the drought treatment were compared with data collected from well-irrigated control plants. A drought-stressed P. halepensis transcriptome was created using paired-end RNA-seq. In total, ~6000 differentially expressed, non-redundant transcripts were identified between drought-treated and control trees. Cluster analysis has revealed stress-induced down-regulation of transcripts related to photosynthesis, reactive oxygen species (ROS)-scavenging through the ascorbic acid (AsA)-glutathione cycle, fatty acid and cell wall biosynthesis, stomatal activity, and the biosynthesis of flavonoids and terpenoids. Up-regulated processes included chlorophyll degradation, ROS-scavenging through AsA-independent thiol-mediated pathways, abscisic acid response and accumulation of heat shock proteins, thaumatin and exordium. Recovery from drought induced strong transcription of retrotransposons, especially the retrovirus-related transposon Tnt1-94. 
The drought-related transcriptome illustrates this species' dynamic response to drought and recovery and unravels novel mechanisms.
2011-01-01
Background The carnivorous plant Utricularia gibba (bladderwort) is remarkable in having a minute genome, which at ca. 80 megabases is approximately half that of Arabidopsis. Bladderworts show an incredible diversity of forms surrounding a defined theme: tiny, bladder-like suction traps on terrestrial, epiphytic, or aquatic plants with a diversity of unusual vegetative forms. Utricularia plants, which are rootless, are also anomalous in physiological features (respiration and carbon distribution), and highly enhanced molecular evolutionary rates in chloroplast, mitochondrial and nuclear ribosomal sequences. Despite great interest in the genus, no genomic resources exist for Utricularia, and the substitution rate increase has received limited study. Results Here we describe the sequencing and analysis of the Utricularia gibba transcriptome. Three different organs were surveyed, the traps, the vegetative shoot bodies, and the inflorescence stems. We also examined the bladderwort transcriptome under diverse stress conditions. We detail aspects of functional classification, tissue similarity, nitrogen and phosphorus metabolism, respiration, DNA repair, and detoxification of reactive oxygen species (ROS). Long contigs of plastid and mitochondrial genomes, as well as sequences for 100 individual nuclear genes, were compared with those of other plants to better establish information on molecular evolutionary rates. Conclusion The Utricularia transcriptome provides a detailed genomic window into processes occurring in a carnivorous plant. It contains a deep representation of the complex metabolic pathways that characterize a putative minimal plant genome, permitting its use as a source of genomic information to explore the structural, functional, and evolutionary diversity of the genus. 
Vegetative shoots and traps are the most similar organs by functional classification of their transcriptome, the traps expressing hydrolytic enzymes for prey digestion that were previously thought to be encoded by bacteria. Supporting physiological data, global gene expression analysis shows that traps significantly over-express genes involved in respiration and that phosphate uptake might occur mainly in traps, whereas nitrogen uptake could in part take place in vegetative parts. Expression of DNA repair and ROS detoxification enzymes may be indicative of a response to increased respiration. Finally, evidence from the bladderwort transcriptome, direct measurement of ROS in situ, and cross-species comparisons of organellar genomes and multiple nuclear genes supports the hypothesis that increased nucleotide substitution rates throughout the plant may be due to the mutagenic action of amplified ROS production. PMID:21639913
paraGSEA: a scalable approach for large-scale gene expression profiling
Peng, Shaoliang; Yang, Shunyun
2017-01-01
Abstract A growing number of studies use gene expression similarity to identify functional connections among genes, diseases and drugs. Gene Set Enrichment Analysis (GSEA) is a powerful analytical method for interpreting gene expression data. However, because of the enormous computational overhead of its significance-level estimation and multiple hypothesis testing steps, its scalability and efficiency are poor on large-scale datasets. We propose paraGSEA for efficient large-scale transcriptome data analysis. By optimization, the overall time complexity of paraGSEA is reduced from O(mn) to O(m+n), where m is the length of the gene sets and n is the length of the gene expression profiles, which contributes a more than 100-fold increase in performance compared with other popular GSEA implementations such as GSEA-P, SAM-GS and GSEA2. By further parallelization, a near-linear speed-up is gained on both workstations and clusters, with high scalability and performance on large-scale datasets. The analysis time of the whole LINCS phase I dataset (GSE92742) was reduced to nearly half an hour on a 1000-node cluster on Tianhe-2, or to within 120 hours on a 96-core workstation. The source code of paraGSEA is licensed under the GPLv3 and available at http://github.com/ysycloud/paraGSEA. PMID:28973463
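The enrichment score at the core of GSEA can be illustrated with a minimal sketch. The function below is a generic, textbook-style weighted running-sum statistic, not paraGSEA's own code; the name `enrichment_score` and the weighting exponent `p` are assumptions made for this illustration. Building a hash set of the m gene-set members and then making a single pass over the n ranked genes is one simple way to reach O(m+n) work per score, although the abstract does not say that paraGSEA achieves its complexity this way.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Weighted running-sum enrichment score for one gene set.

    ranked_genes: gene identifiers, sorted by decreasing score.
    scores: ranking metric for each gene, in the same order.
    gene_set: identifiers belonging to the set under test.
    p: weighting exponent (hypothetical default of 1.0).
    """
    members = set(gene_set)  # O(m) to build, O(1) per lookup
    # Weight of each position: |score|^p for hits, 0 for misses.
    hit_weights = [abs(s) ** p if g in members else 0.0
                   for g, s in zip(ranked_genes, scores)]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for g in ranked_genes if g not in members)
    if total_hit == 0 or n_miss == 0:
        return 0.0  # score is undefined; return a neutral value

    running = 0.0
    best = 0.0
    for g, w in zip(ranked_genes, hit_weights):  # single O(n) pass
        if g in members:
            running += w / total_hit   # step up on a hit
        else:
            running -= 1.0 / n_miss    # step down on a miss
        if abs(running) > abs(best):
            best = running             # keep the largest deviation from zero
    return best


if __name__ == "__main__":
    genes = ["a", "b", "c", "d"]
    metric = [4.0, 3.0, -1.0, -2.0]
    print(enrichment_score(genes, metric, {"a", "b"}))  # set at the top of the list
    print(enrichment_score(genes, metric, {"c", "d"}))  # set at the bottom of the list
```

In a full analysis this score would be recomputed for many permutations to estimate significance, which is the step the abstract identifies as the computational bottleneck.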
Systems biology of cancer biomarker detection.
Mitra, Sanga; Das, Smarajit; Chakrabarti, Jayprokas
2013-01-01
Cancer systems biology is an ever-growing area of research owing to an explosion of data; the problem is how to mine these data and extract useful information. To gain insight into carcinogenesis, one needs to systematically mine several resources, such as databases, microarrays and next-generation sequence data. This review encompasses management and analysis of cancer data, database construction and data deposition, whole transcriptome and genome comparison, analysis of results from high-throughput experiments to uncover cellular pathways and molecular interactions, and the design of effective algorithms to identify potential biomarkers. Recent technical advances such as ChIP-on-chip, ChIP-seq and RNA-seq can be applied to get epigenetic information transformed into a high-throughput endeavour to which systems biology and bioinformatics are making significant inroads. The data from the ENCODE and GENCODE projects, available through the UCSC genome browser, can be considered a benchmark for comparison and meta-analysis. A pipeline for integrating next-generation sequencing data and microarray data, and for putting them together with existing databases, is discussed. The understanding of cancer genomics is changing the way we approach cancer diagnosis and treatment. To give a better understanding of how to utilize available resources, we have chosen oral cancer to show how and what kind of analysis can be done. This review is a computational genomic primer that provides a bird's eye view of the computational and bioinformatics tools currently available to perform integrated genomic and systems biology analyses of several carcinomas.
2013-01-01
Background S. erythraea is a Gram-positive filamentous bacterium used for the industrial-scale production of erythromycin A, which is of high clinical importance. In this work, we sequenced the whole genome of a high-producing strain (E3) obtained by random mutagenesis and screening from the wild-type strain NRRL23338, and examined time-series expression profiles of both E3 and NRRL23338. Based on the genomic and transcriptomic data of these two strains, we carried out a comparative analysis of the high-producing strain and the wild-type strain at both the genomic level and the transcriptomic level. Results We observed a large number of genetic variants, including 60 insertions, 46 deletions and 584 single nucleotide variations (SNV), in E3 in comparison with NRRL23338, and the analysis of time-series transcriptomic data indicated that the genes involved in erythromycin biosynthesis and feeder pathways were significantly up-regulated during the 60-hour time course. According to our data, BldD, a previously identified ery cluster regulator, did not show any positive correlation with the expression of the ery cluster, suggesting the existence of alternative regulation mechanisms of erythromycin synthesis in S. erythraea. Several potential regulators were then proposed by integrated analysis of genomic and transcriptomic data. Conclusion This is a demonstration of functional comparative genomics between an industrial S. erythraea strain and the wild-type strain. These findings help to understand the global regulation mechanisms of erythromycin biosynthesis in S. erythraea, providing useful clues for genetic and metabolic engineering in the future. PMID:23902230
2012-01-01
Background The use of growth-promoters in beef cattle, despite the EU ban, remains a frequent practice. The use of transcriptomic markers has already been proposed to identify indirect evidence of anabolic hormone treatment. So far, this approach has been tested only in experimentally treated animals. Here, for the first time, commercial samples were analyzed. Results Quantitative determination of Dexamethasone (DEX) residues in the urine collected at the slaughterhouse was performed by Liquid Chromatography-Mass Spectrometry (LC-MS). DNA-microarray technology was used to obtain transcriptomic profiles of skeletal muscle in commercial samples and negative controls. LC-MS confirmed the presence of low levels of DEX residues in the urine of the commercial samples classified as suspect by histology. Principal Component Analysis (PCA) on microarray data identified two clusters of samples. One cluster included negative controls and a subset of commercial samples, while a second cluster included part of the specimens collected at the slaughterhouse together with positives for corticosteroid treatment based on thymus histology and LC-MS. Functional analysis of the differentially expressed genes (3961) between the two groups provided further evidence that animals clustering with positive samples might have been treated with corticosteroids. These suspect samples could be reliably classified with a specific classification tool (Prediction Analysis of Microarray) using just two genes. Conclusions Despite the broad variation observed in gene expression profiles, the present study showed that DNA-microarrays can be used to find transcriptomic signatures of putative anabolic treatments and that gene expression markers could represent a useful screening tool. PMID:23110699
Wang, Guanglu; Shi, Ting; Chen, Tao; Wang, Xiaoyue; Wang, Yongcheng; Liu, Dingyu; Guo, Jiaxin; Fu, Jing; Feng, Lili; Wang, Zhiwen; Zhao, Xueming
2018-06-02
Commercial riboflavin production with Bacillus subtilis has been developed by combining rational and classical strain development for almost two decades, but how an improved riboflavin producer can be created rationally is still not completely understood. In this study, we demonstrate the combined use of integrated genomic and transcriptomic analysis of the genetic basis for riboflavin over-production in B. subtilis. This methodology succeeded in discerning the positive mutations in the mutagenesis derived riboflavin producer B. subtilis 24/pMX45 through whole-genome sequencing and transcriptome sequencing. These included RibC (G199D), ribD + (G+39A), PurA (P242L), CcpN(A44S), YvrH (R222Q) and two nonsense mutations YhcF (R90*) and YwaA (Q68*). Reintroducing these specific mutations into the wild-type strain recovered the riboflavin overproduction phenotype and subsequent metabolic engineering greatly improved riboflavin production, achieving an up to 3.4-fold increase of the riboflavin titer over the sequenced producer. A novel mutation, YvrH (R222Q), involved in a typical two-component regulatory system deregulated the purine de novo synthesis pathway and increased the pool of intracellular purine metabolites, which in turn increased riboflavin production. Taken together, we present a case study of combining genome and transcriptome analysis to elucidate the genetic underpinnings of a complex cellular property, which enabled the transfer of beneficial mutations to engineer a reference strain into an overproducer. Copyright © 2018 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Kervezee, Laura; Cuesta, Marc; Cermakian, Nicolas; Boivin, Diane B
2018-05-22
Misalignment of the endogenous circadian timing system leads to disruption of physiological rhythms and may contribute to the development of the deleterious health effects associated with night shift work. However, the molecular underpinnings remain to be elucidated. Here, we investigated the effect of a 4-day simulated night shift work protocol on the circadian regulation of the human transcriptome. Repeated blood samples were collected over two 24-hour measurement periods from eight healthy subjects under highly controlled laboratory conditions before and 4 days after a 10-hour delay of their habitual sleep period. RNA was extracted from peripheral blood mononuclear cells to obtain transcriptomic data. Cosinor analysis revealed a marked reduction of significantly rhythmic transcripts in the night shift condition compared with baseline at group and individual levels. Subsequent analysis using a mixed-effects model selection approach indicated that this decrease is mainly due to dampened rhythms rather than to a complete loss of rhythmicity: 73% of transcripts rhythmically expressed at baseline remained rhythmic during the night shift condition with a similar phase relative to habitual bedtimes, but with lower amplitudes. Functional analysis revealed that key biological processes are affected by the night shift protocol, most notably the natural killer cell-mediated immune response and Jun/AP1 and STAT pathways. These results show that 4 days of simulated night shifts lead to a loss in temporal coordination between the human circadian transcriptome and the external environment and impact biological processes related to the adverse health effects associated with night shift work.
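The cosinor analysis mentioned above can be sketched as an ordinary least-squares fit of a single cosine with a fixed period. The code below is a generic single-component cosinor, not the authors' pipeline; the function names `cosinor_fit` and `solve3`, the 24-hour default period, and the synthetic data are assumptions for illustration. The model y(t) = M + A cos(ωt + φ) is rewritten as M + β cos(ωt) + γ sin(ωt), with β = A cos φ and γ = −A sin φ, so that it is linear in its coefficients.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def cosinor_fit(times, values, period=24.0):
    """Return (mesor, amplitude, acrophase in radians) of a one-cosine fit.

    Needs samples at three or more distinct phases; otherwise the normal
    equations are singular and the solver divides by zero.
    """
    w = 2.0 * math.pi / period
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times]
    # Normal equations (X^T X) coef = X^T y for the design matrix above.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, beta, gamma = solve3(xtx, xty)
    amplitude = math.hypot(beta, gamma)
    acrophase = math.atan2(-gamma, beta)
    return mesor, amplitude, acrophase


if __name__ == "__main__":
    w = 2.0 * math.pi / 24.0
    t = list(range(0, 24, 2))
    # Synthetic "baseline" and "dampened" rhythms with the same phase.
    baseline = [10.0 + 3.0 * math.cos(w * ti - 1.0) for ti in t]
    dampened = [10.0 + 1.0 * math.cos(w * ti - 1.0) for ti in t]
    print(cosinor_fit(t, baseline))
    print(cosinor_fit(t, dampened))
```

Comparing the fitted amplitudes of the two synthetic series mirrors the distinction the abstract draws between a dampened rhythm (lower amplitude, similar phase) and a complete loss of rhythmicity.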
Transcriptome analysis of sika deer in China.
Jia, Bo-Yin; Ba, Heng-Xing; Wang, Gui-Wu; Yang, Ying; Cui, Xue-Zhe; Peng, Ying-Hua; Zheng, Jun-Jun; Xing, Xiu-Mei; Yang, Fu-He
2016-10-01
Sika deer is of great commercial value because their antlers are used in tonics and alternative medicine and their meat is healthy and delicious. The goal of this study was to generate transcript sequences from sika deer for functional genomic analyses and to identify the transcripts that demonstrate tissue-specific, age-dependent differential expression patterns. These sequences could enhance our understanding of the molecular mechanisms underlying sika deer growth and development. In the present study, we performed de novo transcriptome assembly and profiling analysis across ten tissue types and four developmental stages (juvenile, adolescent, adult, and aged) of sika deer, using Illumina paired-end tag (PET) sequencing technology. A total of 1,752,253 contigs with an average length of 799 bp were generated, from which 1,348,618 unigenes with an average length of 590 bp were defined. Approximately 33.2 % of these (447,931 unigenes) were then annotated in public protein databases. Many sika deer tissue-specific, age-dependent unigenes were identified. The testes have the largest number of tissue-enriched unigenes, and some of them were prone to develop new functions for other tissues. Additionally, our transcriptome revealed that the juvenile-adolescent transition was the most complex and important stage of the sika deer life cycle. The present work represents the first multiple tissue transcriptome analysis of sika deer across four developmental stages. The generated data not only provide a functional genomics resource for future biological research on sika deer but also guide the selection and manipulation of genes controlling growth and development.
Dissecting the Root Nodule Transcriptome of Chickpea (Cicer arietinum L.)
Kant, Chandra; Pradhan, Seema; Bhatia, Sabhyata
2016-01-01
A hallmark trait of chickpea (Cicer arietinum L.), like other legumes, is the capability to convert atmospheric nitrogen (N2) into ammonia (NH3) in symbiotic association with Mesorhizobium ciceri. However, the complexity of molecular networks associated with the dynamics of nodule development in chickpea needs to be analyzed in depth. Hence, in order to gain insights into chickpea nodule development, the transcriptomes of nodules at early, middle and late stages of development were sequenced using the Roche 454 platform. This generated 490.84 Mb of sequence data comprising 1,360,251 reads, which were assembled into 83,405 unigenes. Transcripts were annotated using Gene Ontology (GO), Cluster of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways analysis. Differential expression analysis revealed that a total of 3760 transcripts were differentially expressed in at least one of three stages, whereas 935, 117 and 2707 transcripts were found to be differentially expressed in the early, middle and late stages of nodule development respectively. MapMan analysis revealed enrichment of metabolic pathways such as transport, protein synthesis, signaling and carbohydrate metabolism during root nodulation. Transcription factors were predicted and analyzed for their differential expression during nodule development. Putative nodule-specific transcripts were identified and enriched for GO categories using BiNGO, which revealed many categories to be enriched during nodule development, including transcription regulators and transporters. Further, the assembled transcriptome was also used to mine for genic SSR markers. In conclusion, this study will help in enriching the transcriptomic resources implicated in understanding of root nodulation events in chickpea. PMID:27348121
Bielecka, Monika; Watanabe, Mutsumi; Morcuende, Rosa; Scheible, Wolf-Rüdiger; Hawkesford, Malcolm J.; Hesse, Holger; Hoefgen, Rainer
2015-01-01
Sulfur is an essential macronutrient for plant growth and development. Reaching a thorough understanding of the molecular basis for changes in plant metabolism depending on the sulfur-nutritional status at the systems level will advance our basic knowledge and help target future crop improvement. Although the transcriptional responses induced by sulfate starvation have been studied in the past, knowledge of the regulation of sulfur metabolism is still fragmentary. This work focuses on the discovery of candidates for regulatory genes such as transcription factors (TFs) using ‘omics technologies. For this purpose a short term sulfate-starvation/re-supply approach was used. ATH1 microarray studies and metabolite determinations yielded 21 TFs which responded more than 2-fold at the transcriptional level to sulfate starvation. Categorization by response behaviors under sulfate-starvation/re-supply and other nutrient starvations such as nitrate and phosphate allowed determination of whether the TF genes are specific for or common between distinct mineral nutrient depletions. Extending this co-behavior analysis to the whole transcriptome data set enabled prediction of putative downstream genes. Additionally, combinations of transcriptome and metabolome data allowed identification of relationships between TFs and downstream responses, namely, expression changes in biosynthetic genes and subsequent metabolic responses. Effect chains on glucosinolate and polyamine biosynthesis are discussed in detail. The knowledge gained from this study provides a blueprint for an integrated analysis of transcriptomics and metabolomics and application for the identification of uncharacterized genes. PMID:25674096
Allen, Alexandra M; Lexer, Christian; Hiscock, Simon J
2010-11-01
Fertilization in angiosperms depends on a complex cellular "courtship" between haploid pollen and diploid pistil. These pollen-pistil interactions are regulated by a diversity of molecules, many of which remain to be identified and characterized. Thus, it is unclear to what extent these processes are conserved among angiosperms, a fact confounded by limited sampling across taxa. Here, we report the analysis of pistil-expressed genes in Senecio squalidus (Asteraceae), a species from euasterid II, a major clade for which there are currently no data on pistil-expressed genes. Species from the Asteraceae characteristically have a "semidry stigma," intermediate between the "wet" and "dry" stigmas typical of the majority of angiosperms. Construction of pistil-enriched cDNA libraries for S. squalidus allowed us to address two hypotheses: (1) stigmas of S. squalidus will express genes common to wet and dry stigmas and genes specific to the semidry stigma characteristic of the Asteraceae; and (2) genes potentially essential for pistil function will be conserved between diverse angiosperm groups and therefore common to all currently available pistil transcriptome data sets, including S. squalidus. Our data support both these hypotheses. The S. squalidus pistil transcriptome contains novel genes and genes previously identified in pistils of species with dry stigmas and wet stigmas. Comparative analysis of the five pistil transcriptomes currently available (Oryza sativa, Crocus sativus, Arabidopsis thaliana, Nicotiana tabacum, and S. squalidus), representing four major angiosperm clades and the three stigma states, identified novel genes and conserved genes potentially regulating pollen-pistil interaction pathways common to monocots and eudicots.
Gallardo-Escárate, Cristian; Valenzuela-Muñoz, Valentina; Nuñez-Acuña, Gustavo
2014-01-01
Despite the economic and environmental impacts that sea lice infestations have on salmon farming worldwide, genomic data generated by high-throughput transcriptome sequencing for different developmental stages, sexes, and strains of sea lice are still limited or unknown. In this study, RNA-seq analysis was performed using de novo transcriptome assembly as a reference to evidence transcriptional changes across six developmental stages of the salmon louse Caligus rogercresseyi. EST-datasets were generated from the nauplius I, nauplius II, copepodid and chalimus stages and from female and male adults using MiSeq Illumina sequencing. A total of 151,788,682 transcripts were obtained, which were assembled into 83,444 high quality contigs and subsequently annotated into roughly 24,000 genes based on known proteins. To identify differential transcription patterns among salmon louse stages, cluster analyses were performed using normalized gene expression values. Herein, four clusters were differentially expressed between nauplius I–II and copepodid stages (604 transcripts), five clusters between copepodid and chalimus stages (2,426 transcripts), and six clusters between female and male adults (2,478 transcripts). Gene ontology analysis revealed that the nauplius I–II, copepodid and chalimus stages are mainly annotated to amino acid transfer/repair/breakdown, metabolism, molting cycle, and nervous system development. Additionally, genes showing differential transcription in female and male adults were highly related to cytoskeletal and contractile elements, reproduction, cell development, morphogenesis, and transcription-translation processes. The data presented in this study provide the most comprehensive transcriptome resource available for C. rogercresseyi, which should be used for future genomic studies linked to host-parasite interactions. PMID:24691066
Gu, Li; Zhang, Zhong-Yi; Quan, Hong; Li, Ming-Jie; Zhao, Fang-Yu; Xu, Yuan-Jiang; Liu, Jiang; Sai, Man; Zheng, Wei-Lie; Lan, Xiao-Zhong
2018-06-01
Mirabilis himalaica (Edgew.) Heimerl is among the most important genuine medicinal plants in Tibet. However, the biosynthesis mechanisms of the active compounds in this species are unclear, severely limiting its application. To clarify the molecular biosynthesis mechanism of the key representative active compounds, specifically rotenoid, which is of special medicinal value for M. himalaica, RNA sequencing and TOF-MS technologies were used to construct transcriptomic and metabolomic libraries from the roots, stems, and leaves of M. himalaica plants collected from their natural habitat. Each of the transcriptomic libraries from the different tissues was sequenced, generating more than 10 Gb of clean data that were ultimately assembled into 147,142 unigenes. In the three tissues, metabolomic analysis identified 522 candidate compounds, of which 170 metabolites involved in 114 metabolic pathways were mapped to KEGG. Among the unigenes, 61 genes encoding enzymes were identified as functioning at key steps of the pathways related to rotenoid biosynthesis, where 14 intermediate metabolites were also located. An integrated analysis of metabolic and transcriptomic data revealed that most of the intermediate metabolites and enzymes related to rotenoid biosynthesis were synthesized in the roots, stems and leaves of M. himalaica, which suggested that the use of non-medicinal tissues to extract compounds was feasible. In addition, the CHS and CHI genes were found to play important roles in rotenoid biosynthesis, particularly CHS, which might be an important rate-limiting enzyme. This study provides a hypothetical basis for the screening of new active metabolites and the metabolic engineering of rotenoid in M. himalaica.
Transcriptome and proteome analysis of Eucalyptus infected with Calonectria pseudoreteaudii.
Chen, Quanzhu; Guo, Wenshuo; Feng, Lizhen; Ye, Xiaozhen; Xie, Wanfeng; Huang, Xiuping; Liu, Jinyan
2015-02-06
Cylindrocladium leaf blight is one of the most severe diseases in Eucalyptus plantations and nurseries. There are Eucalyptus cultivars with resistance to the disease. However, little is known about the defense mechanism of resistant cultivars. Here, we investigated the transcriptome and proteome of Eucalyptus leaves (E. urophylla × E. tereticornis M1), infected or not with Calonectria pseudoreteaudii. A total of 8585 differentially expressed genes (|log2 ratio| ≥ 1, FDR ≤ 0.001) at 12 and 24 hours post-inoculation were detected using RNA-seq. Transcriptional changes for five genes were further confirmed by qRT-PCR. A total of 3680 proteins at the two time points were identified using the iTRAQ technique. The combined transcriptome and proteome analysis revealed that the shikimate/phenylpropanoid pathway, terpenoid biosynthesis, and signalling pathways (jasmonic acid and sugar) were activated. The data also showed that some proteins (WRKY33 and PR proteins) which have been reported to be involved in the plant defense response were up-regulated. However, photosynthesis, nucleic acid metabolism and protein metabolism were impaired by C. pseudoreteaudii infection. This work will facilitate the identification of defense-related genes and provide insights into Eucalyptus defense responses to Cylindrocladium leaf blight. In this study, a total of 130 proteins and genes involved in the shikimate/phenylpropanoid pathway, terpenoid biosynthesis, signalling pathways, cell transport, carbohydrate and energy metabolism, nucleic acid metabolism and protein metabolism were identified in Eucalyptus leaves after infection with C. pseudoreteaudii. This is the first report of a comprehensive transcriptomic and proteomic analysis of Eucalyptus in response to Calonectria sp.