large-scale transcriptome analysis: Topics by Science.gov

Sample records for large-scale transcriptome analysis

Defining the transcriptome assembly and its use for genome dynamics and transcriptome profiling studies in pigeonpea (Cajanus cajan L.)

USDA-ARS's Scientific Manuscript database

This study reports generation of large-scale genomic resources for pigeonpea, a so-called ‘orphan crop species’ of the semi-arid tropic regions. Roche FLX/454 sequencing of a normalized cDNA pool prepared from 31 tissues produced 494,353 short transcript reads (STRs). Cluster analysi...
A large-scale full-length cDNA analysis to explore the budding yeast transcriptome

PubMed Central

Miura, Fumihito; Kawaguchi, Noriko; Sese, Jun; Toyoda, Atsushi; Hattori, Masahira; Morishita, Shinichi; Ito, Takashi

2006-01-01

We performed a large-scale cDNA analysis to explore the transcriptome of the budding yeast Saccharomyces cerevisiae. We sequenced two cDNA libraries, one from the cells exponentially growing in a minimal medium and the other from meiotic cells. Both libraries were generated by using a vector-capping method that allows the accurate mapping of transcription start sites (TSSs). Consequently, we identified 11,575 TSSs associated with 3,638 annotated genomic features, including 3,599 ORFs, to suggest that most yeast genes have two or more TSSs. In addition, we identified 45 previously undescribed introns, including those affecting current ORF annotations and those spliced alternatively. Furthermore, the analysis revealed 667 transcription units in the intergenic regions and transcripts derived from antisense strands of 367 known features. We also found that 348 ORFs carry TSSs in their 3′-halves to generate sense transcripts starting from inside the ORFs. These results indicate that the budding yeast transcriptome is considerably more complex than previously thought, and it shares many recently revealed characteristics with the transcriptomes of mammals and other higher eukaryotes. Thus, the genome-wide active transcription that generates novel classes of transcripts appears to be an intrinsic feature of the eukaryotic cells. The budding yeast will serve as a versatile model for the studies on these aspects of transcriptome, and the full-length cDNA clones can function as an invaluable resource in such studies. PMID:17101987
BLIND ordering of large-scale transcriptomic developmental timecourses.

PubMed

Anavy, Leon; Levin, Michal; Khair, Sally; Nakanishi, Nagayasu; Fernandez-Valverde, Selene L; Degnan, Bernard M; Yanai, Itai

2014-03-01

RNA-Seq enables the efficient transcriptome sequencing of many samples from small amounts of material, but the analysis of these data remains challenging. In particular, in developmental studies, RNA-Seq is challenged by the morphological staging of samples, such as embryos, since these often lack clear markers at any particular stage. In such cases, the automatic identification of the stage of a sample would enable previously infeasible experimental designs. Here we present the 'basic linear index determination of transcriptomes' (BLIND) method for ordering samples comprising different developmental stages. The method is an implementation of a traveling salesman algorithm to order the transcriptomes according to their inter-relationships as defined by principal components analysis. To establish the direction of the ordered samples, we show that an appropriate indicator is the entropy of transcriptomic gene expression levels, which increases over developmental time. Using BLIND, we correctly recover the annotated order of previously published embryonic transcriptomic timecourses for frog, mosquito, fly and zebrafish. We further demonstrate the efficacy of BLIND by collecting 59 embryos of the sponge Amphimedon queenslandica and ordering their transcriptomes according to developmental stage. BLIND is thus useful in establishing the temporal order of samples within large datasets and is of particular relevance to the study of organisms with asynchronous development and when morphological staging is difficult.
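The ordering idea in the abstract above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, the use of two principal components is an assumption, an exhaustive search over paths stands in for whichever traveling salesman solver the method actually uses (so it is only practical for a handful of samples), and the entropy is taken here as the Shannon entropy of each sample's normalized expression values.

```python
import itertools
import numpy as np

def pca_scores(expr, n_components=2):
    """Project samples (rows of expr) onto the leading principal components."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def shortest_open_path(points):
    """Exhaustively find the shortest path that visits every point once.

    Assumption: brute force replaces a real TSP heuristic; use only for
    small sample counts (roughly fewer than ten).
    """
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    best_order, best_len = None, np.inf
    for perm in itertools.permutations(range(len(points))):
        if perm[0] > perm[-1]:  # a path and its mirror image have equal length
            continue
        length = sum(dist[a, b] for a, b in zip(perm, perm[1:]))
        if length < best_len:
            best_order, best_len = list(perm), length
    return best_order

def expression_entropy(profile):
    """Shannon entropy (bits) of one sample's normalized expression values."""
    p = profile / profile.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def order_samples(expr):
    """Return sample indices ordered so that entropy rises from first to last."""
    order = shortest_open_path(pca_scores(expr))
    if expression_entropy(expr[order[0]]) > expression_entropy(expr[order[-1]]):
        order.reverse()
    return order
```

Calling `order_samples` on a samples-by-genes matrix returns row indices from the putatively earliest to the latest stage. The direction step relies solely on the abstract's statement that entropy increases over developmental time.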
A pipeline for the de novo assembly of the Themira biloba (Sepsidae: Diptera) transcriptome using a multiple k-mer length approach.

PubMed

Melicher, Dacotah; Torson, Alex S; Dworkin, Ian; Bowsher, Julia H

2014-03-12

The Sepsidae family of flies is a model for investigating how sexual selection shapes courtship and sexual dimorphism in a comparative framework. However, like many non-model systems, there are few molecular resources available. Large-scale sequencing and assembly have not been performed in any sepsid, and the lack of a closely related genome makes investigation of gene expression challenging. Our goal was to develop an automated pipeline for de novo transcriptome assembly, and to use that pipeline to assemble and analyze the transcriptome of the sepsid Themira biloba. Our bioinformatics pipeline uses cloud computing services to assemble and analyze the transcriptome with off-site data management, processing, and backup. It uses a multiple k-mer length approach combined with a second meta-assembly to extend transcripts and recover more bases of transcript sequences than standard single k-mer assembly. We used 454 sequencing to generate 1.48 million reads from cDNA generated from embryo, larva, and pupae of T. biloba and assembled a transcriptome consisting of 24,495 contigs. Annotation identified 16,705 transcripts, including those involved in embryogenesis and limb patterning. We assembled transcriptomes from an additional three non-model organisms to demonstrate that our pipeline assembled a higher-quality transcriptome than single k-mer approaches across multiple species. The pipeline we have developed for assembly and analysis increases contig length, recovers unique transcripts, and assembles more base pairs than other methods through the use of a meta-assembly. The T. biloba transcriptome is a critical resource for performing large-scale RNA-Seq investigations of gene expression patterns, and is the first transcriptome sequenced in this Dipteran family.
A Normalization-Free and Nonparametric Method Sharpens Large-Scale Transcriptome Analysis and Reveals Common Gene Alteration Patterns in Cancers.

PubMed

Li, Qi-Gang; He, Yong-Han; Wu, Huan; Yang, Cui-Ping; Pu, Shao-Yan; Fan, Song-Qing; Jiang, Li-Ping; Shen, Qiu-Shuo; Wang, Xiao-Xiong; Chen, Xiao-Qiong; Yu, Qin; Li, Ying; Sun, Chang; Wang, Xiangting; Zhou, Jumin; Li, Hai-Peng; Chen, Yong-Bin; Kong, Qing-Peng

2017-01-01

Heterogeneity in transcriptional data hampers the identification of differentially expressed genes (DEGs) and understanding of cancer, essentially because current methods rely on cross-sample normalization and/or distribution assumptions, both of which are sensitive to heterogeneous values. Here, we developed a new method, Cross-Value Association Analysis (CVAA), which overcomes the limitation and is more robust to heterogeneous data than the other methods. Applying CVAA to a more complex pan-cancer dataset containing 5,540 transcriptomes discovered numerous new DEGs and many previously rarely explored pathways/processes; some of them were validated, both in vitro and in vivo, to be crucial in tumorigenesis, e.g., alcohol metabolism (ADH1B), chromosome remodeling (NCAPH) and complement system (Adipsin). Together, we present a sharper tool to navigate large-scale expression data and gain new mechanistic insights into tumorigenesis.
Enabling large-scale next-generation sequence assembly with Blacklight

PubMed Central

Couger, M. Brian; Pipes, Lenore; Squina, Fabio; Prade, Rolf; Siepel, Adam; Palermo, Robert; Katze, Michael G.; Mason, Christopher E.; Blood, Philip D.

2014-01-01

Summary A variety of extremely challenging biological sequence analyses were conducted on the XSEDE large shared memory resource Blacklight, using current bioinformatics tools and encompassing a wide range of scientific applications. These include genomic sequence assembly, very large metagenomic sequence assembly, transcriptome assembly, and sequencing error correction. The data sets used in these analyses included uncategorized fungal species, reference microbial data, very large soil and human gut microbiome sequence data, and primate transcriptomes, composed of both short-read and long-read sequence data. A new parallel command execution program was developed on the Blacklight resource to handle some of these analyses. These results, initially reported previously at XSEDE13 and expanded here, represent significant advances for their respective scientific communities. The breadth and depth of the results achieved demonstrate the ease of use, versatility, and unique capabilities of the Blacklight XSEDE resource for scientific analysis of genomic and transcriptomic sequence data, and the power of these resources, together with XSEDE support, in meeting the most challenging scientific problems. PMID:25294974
CGDV: a webtool for circular visualization of genomics and transcriptomics data.

PubMed

Jha, Vineet; Singh, Gulzar; Kumar, Shiva; Sonawane, Amol; Jere, Abhay; Anamika, Krishanpal

2017-10-24

Interpretation of large-scale data is very challenging, and currently there is a scarcity of web tools that support automated visualization of a variety of high-throughput genomics and transcriptomics data for a wide variety of model organisms along with user-defined karyotypes. A circular plot provides holistic visualization of high-throughput large-scale data, but it is complex and challenging to generate because most of the available tools need informatics expertise to install and run. We have developed CGDV (Circos for Genomics and Transcriptomics Data Visualization), a webtool based on Circos, for seamless and automated visualization of a variety of large-scale genomics and transcriptomics data. CGDV takes the output of analyzed genomics or transcriptomics data in different formats, such as vcf, bed, xls, tab-delimited matrix text file, CNVnator raw output and gene fusion raw output, to plot a circular view of the sample data. CGDV takes care of generating the intermediate files required for Circos. CGDV is freely available at https://cgdv-upload.persistent.co.in/cgdv/. The circular plot for each data type is tailored to gain the best biological insights into the data. The inter-relationship between data points, homologous sequences, genes involved in fusion events, differential expression pattern, sequencing depth, types and size of variations and enrichment of DNA binding proteins can be seen using CGDV. CGDV thus helps biologists and bioinformaticians to visualize a variety of genomics and transcriptomics data seamlessly.
Digital Marine Bioprospecting: Mining New Neurotoxin Drug Candidates from the Transcriptomes of Cold-Water Sea Anemones

PubMed Central

Urbarova, Ilona; Karlsen, Bård Ove; Okkenhaug, Siri; Seternes, Ole Morten; Johansen, Steinar D.; Emblem, Åse

2012-01-01

Marine bioprospecting is the search for new marine bioactive compounds and large-scale screening in extracts represents the traditional approach. Here, we report an alternative complementary protocol, called digital marine bioprospecting, based on deep sequencing of transcriptomes. We sequenced the transcriptomes from the adult polyp stage of two cold-water sea anemones, Bolocera tuediae and Hormathia digitata. We generated approximately 1.1 million quality-filtered sequencing reads by 454 pyrosequencing, which were assembled into approximately 120,000 contigs and 220,000 single reads. Based on annotation and gene ontology analysis we profiled the expressed mRNA transcripts according to known biological processes. As a proof-of-concept we identified polypeptide toxins with a potential blocking activity on sodium and potassium voltage-gated channels from digital transcriptome libraries. PMID:23170083
Large-scale atlas of microarray data reveals biological landscape of gene expression in Arabidopsis

USDA-ARS's Scientific Manuscript database

Transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metad...
De novo sequencing and characterization of floral transcriptome in two species of buckwheat (Fagopyrum)

PubMed Central

2011-01-01

Background Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations Fagopyrum species have not been the subject of large-scale sequencing projects. Results Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 thousand (F. esculentum) and 229 thousand (F. tataricum) reads with an average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousand contigs for each species, with 7.5-8.2× coverage. Comparative analysis of the two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationship of the Caryophyllales and asterids has now gained high support from nuclear gene sequences. Conclusions 454 transcriptome sequencing and de novo assembly were performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141
Draft De Novo Transcriptome of the Rat Kangaroo Potorous tridactylus as a Tool for Cell Biology

PubMed Central

Udy, Dylan B.; Voorhies, Mark; Chan, Patricia P.; Lowe, Todd M.; Dumont, Sophie

2015-01-01

The rat kangaroo (long-nosed potoroo, Potorous tridactylus) is a marsupial native to Australia. Cultured rat kangaroo kidney epithelial cells (PtK) are commonly used to study cell biological processes. These mammalian cells are large, adherent, and flat, and contain large and few chromosomes—and are thus ideal for imaging intra-cellular dynamics such as those of mitosis. Despite this, neither the rat kangaroo genome nor transcriptome have been sequenced, creating a challenge for probing the molecular basis of these cellular dynamics. Here, we present the sequencing, assembly and annotation of the draft rat kangaroo de novo transcriptome. We sequenced 679 million reads that mapped to 347,323 Trinity transcripts and 20,079 Unigenes. We present statistics emerging from transcriptome-wide analyses, and analyses suggesting that the transcriptome covers full-length sequences of most genes, many with multiple isoforms. We also validate our findings with a proof-of-concept gene knockdown experiment. We expect that this high quality transcriptome will make rat kangaroo cells a more tractable system for linking molecular-scale function and cellular-scale dynamics. PMID:26252667
De novo sequencing, assembly and analysis of eight different transcriptomes from the Malayan pangolin

PubMed Central

Mohamed Yusoff, Aini; Tan, Tze King; Hari, Ranjeev; Koepfli, Klaus-Peter; Wee, Wei Yee; Antunes, Agostinho; Sitam, Frankie Thomas; Rovie-Ryan, Jeffrine Japning; Karuppannan, Kayal Vizi; Wong, Guat Jah; Lipovich, Leonard; Warren, Wesley C.; O’Brien, Stephen J.; Choo, Siew Woh

2016-01-01

Pangolins are scale-covered mammals comprising eight endangered species. Maintaining pangolins in captivity is a significant challenge, in part because little is known about their genetics. Here we provide the first large-scale sequencing of the critically endangered Manis javanica transcriptomes from eight different organs using Illumina HiSeq technology, yielding ~75 gigabases and 89,754 unigenes. We found some unigenes involved in the insect hormone biosynthesis pathway and also 747 lipid metabolism-related unigenes that may offer insight into the lipid metabolism system in pangolins. Comparative analysis between M. javanica and other mammals revealed many pangolin-specific genes significantly over-represented in stress-related processes, cell proliferation and external stimulus, probably reflecting the traits and adaptations of the analyzed pregnant female M. javanica. Our study provides an invaluable resource for future functional work that may be highly relevant for the conservation of pangolins. PMID:27618997
Transcriptomic analysis of grain amaranth (Amaranthus hypochondriacus) using 454 pyrosequencing: comparison with A. tuberculatus, expression profiling in stems and in response to biotic and abiotic stress

PubMed Central

2011-01-01

Background Amaranthus hypochondriacus, a grain amaranth, is a C4 plant noted for its ability to tolerate stressful conditions and produce highly nutritious seeds. These possess an optimal amino acid balance and constitute a rich source of health-promoting peptides. Although several recent studies, mostly involving subtractive hybridization strategies, have helped to increase the relatively low number of grain amaranth expressed sequence tags (ESTs), transcriptomic information on this species remains limited, particularly regarding tissue-specific and biotic stress-related genes. Thus, a large-scale transcriptome analysis was performed to generate stem- and (a)biotic stress-responsive gene expression profiles in grain amaranth. Results A total of 2,700,168 raw reads were obtained from six 454 pyrosequencing runs, which were assembled into 21,207 high quality sequences (20,408 isotigs + 799 contigs). The average sequence length was 1,064 bp and 930 bp for isotigs and contigs, respectively. Only 5,113 singletons were recovered after quality control. Contigs/isotigs were further incorporated into 15,667 isogroups. All unique sequences were queried against the nr, TAIR, UniRef100, UniRef50 and Amaranthaceae EST databases for annotation. Functional GO annotation was performed with all contigs/isotigs that produced significant hits with the TAIR database. Only 8,260 sequences were found to be homologous when the transcriptomes of A. tuberculatus and A. hypochondriacus were compared, most of which were associated with basic house-keeping processes. Digital expression analysis identified 1,971 differentially expressed genes in response to at least one of four stress treatments tested. These included several multiple-stress-inducible genes that could represent potential candidates for use in the engineering of stress-resistant plants.
The transcriptomic data generated from pigmented stems shared similarity with findings reported in developing stems of Arabidopsis and black cottonwood (Populus trichocarpa). Conclusions This study represents the first large-scale transcriptomic analysis of A. hypochondriacus, considered to be a highly nutritious and stress-tolerant crop. Numerous genes were found to be induced in response to (a)biotic stress, many of which could further the understanding of the mechanisms that contribute to multiple stress-resistance in plants, a trait that has potential biotechnological applications in agriculture. PMID:21752295
Transcriptome analysis of tube foot and large scale marker discovery in sea cucumber, Apostichopus japonicus.

PubMed

Zhou, Xiaoxu; Wang, Hongdi; Cui, Jun; Qiu, Xuemei; Chang, Yaqing; Wang, Xiuli

2016-12-01

Tube feet, one of the ambulacral appendage types in Aspidochirote holothurioids, are known for their functions in locomotion, feeding, chemoreception, light sensitivity and respiration. In this study, we explored the characteristics of the transcriptome in the tube foot of the sea cucumber (Apostichopus japonicus). Our results showed that among 390 unigenes specifically expressed in the tube foot, 190 were annotated. Based on the assembled transcriptome, we found 219,860 SNPs from 34,749 unigenes, of which 97,683, 53,624, 27,767 and 40,786 were located in CDSs, 5'-UTRs, 3'-UTRs and non-CDS regions, respectively. Furthermore, 12,114 SSRs were detected from 7,394 unigenes. Target genes of four miRNAs specifically expressed in the tube foot (miR-29a, miR-29b, miR-278-3p and miR-2005) were also predicted based on the transcriptome; these include immune-related factors (MBL, VLRA, AjC3, MyD88, CFB), a skin pigmentation factor (MITF), a candidate regeneration factor (TRP) and a holothurian autolysis-related factor (CL). These results provide a relatively large number of molecular markers and transcriptome resources, and lay a foundation for further analyses of the function and molecular mechanisms underlying the A. japonicus tube foot.
Transcriptome profiles link environmental variation and physiological response of Mytilus californianus between Pacific tides

PubMed Central

Place, Sean P.; Menge, Bruce A.; Hofmann, Gretchen E.

2011-01-01

Summary The marine intertidal zone is characterized by large variation in temperature, pH, dissolved oxygen and the supply of nutrients and food on seasonal and daily time scales. These oceanic fluctuations drive ecological processes such as recruitment, competition and consumer-prey interactions largely via physiological mechanisms. Thus, to understand coastal ecosystem dynamics and responses to climate change, it is crucial to understand these mechanisms. Here we utilize transcriptome analysis of the physiological response of the mussel Mytilus californianus at different spatial scales to gain insight into these mechanisms. We used mussels inhabiting different vertical locations within Strawberry Hill on Cape Perpetua, OR and Boiler Bay on Cape Foulweather, OR to study inter- and intra-site variation of gene expression. The results highlight two distinct gene expression signatures related to the cycling of metabolic activity and perturbations to cellular homeostasis. Intermediate spatial scales show a strong influence of oceanographic differences in food and stress environments between sites separated by ~65 km. Together, these new insights into environmental control of gene expression may allow understanding of important physiological drivers within and across populations. PMID:22563136
Large-scale transcriptome analysis reveals Arabidopsis metabolic pathways are frequently influenced by different pathogens.

PubMed

Jiang, Zhenhong; He, Fei; Zhang, Ziding

2017-07-01

Through large-scale transcriptional data analyses, we highlighted the importance of plant metabolism in plant immunity and identified 26 metabolic pathways that were frequently influenced by the infection of 14 different pathogens. Reprogramming of plant metabolism is a common phenomenon in plant defense responses. Currently, a large number of transcriptional profiles of infected tissues in Arabidopsis (Arabidopsis thaliana) have been deposited in public databases, which provides a great opportunity to understand the expression patterns of metabolic pathways during plant defense responses at the systems level. Here, we performed a large-scale transcriptome analysis based on 135 previously published expression samples, including 14 different pathogens, to explore the expression pattern of Arabidopsis metabolic pathways. Overall, metabolic genes are significantly changed in expression during plant defense responses. Upregulated metabolic genes are enriched in defense responses, and downregulated genes are enriched in photosynthesis, fatty acid and lipid metabolic processes. Gene set enrichment analysis (GSEA) identifies 26 frequently differentially expressed metabolic pathways (FreDE_Paths) that are differentially expressed in more than 60% of infected samples. These pathways are involved in the generation of energy, fatty acid and lipid metabolism as well as secondary metabolite biosynthesis. Clustering analysis based on the expression levels of these 26 metabolic pathways clearly distinguishes infected and control samples, further suggesting the importance of these metabolic pathways in plant defense responses. By comparing with FreDE_Paths from abiotic stresses, we find that the expression patterns of 26 FreDE_Paths from biotic stresses are more consistent across different infected samples. By investigating the expression correlation between transcriptional factors (TFs) and FreDE_Paths, we identify several notable relationships.
Collectively, the current study will deepen our understanding of plant metabolism in plant immunity and provide new insights into disease-resistant crop improvement.
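The selection rule described in the abstract above (pathways differentially expressed in more than 60% of infected samples) amounts to a simple frequency filter. The following Python sketch illustrates only that filter; the function name, the boolean input matrix and the pathway labels are hypothetical, and the per-sample calls themselves would come from an enrichment method such as GSEA, which is not reproduced here.

```python
import numpy as np

def frequent_pathways(is_de, pathway_names, min_fraction=0.6):
    """Return pathways called differentially expressed in more than
    min_fraction of the samples.

    is_de: boolean array of shape (n_pathways, n_samples); True where a
    pathway was called differentially expressed in that infected sample.
    Assumption: these calls were made upstream by a separate analysis.
    """
    is_de = np.asarray(is_de, dtype=bool)
    fraction = is_de.mean(axis=1)  # share of samples in which each pathway is called
    return [name for name, f in zip(pathway_names, fraction) if f > min_fraction]
```

For example, a pathway called in 4 of 5 samples (80%) passes the default threshold, while one called in 2 of 5 (40%) does not.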
Transcriptional profiling of CD31(+) cells isolated from murine embryonic stem cells.

PubMed

Mariappan, Devi; Winkler, Johannes; Chen, Shuhua; Schulz, Herbert; Hescheler, Jürgen; Sachinidis, Agapios

2009-02-01

Identification of genes involved in endothelial differentiation is of great interest for the understanding of the cellular and molecular mechanisms involved in the development of new blood vessels. Mouse embryonic stem (mES) cells serve as a potential source of endothelial cells for transcriptomic analysis. We isolated endothelial cells from 8-day-old embryoid bodies by immuno-magnetic separation using platelet endothelial cell adhesion molecule-1 (also known as CD31) expressed on both early and mature endothelial cells. CD31(+) cells exhibit endothelial-like behavior by being able to incorporate DiI-labeled acetylated low-density lipoprotein as well as form tubular structures on matrigel. Quantitative and semi-quantitative PCR analysis further demonstrated the increased expression of endothelial transcripts. To ascertain the specific transcriptomic identity of the CD31(+) cells, large-scale microarray analysis was carried out. Comparative bioinformatic analysis reveals an enrichment of the gene ontology categories angiogenesis, blood vessel morphogenesis, vasculogenesis and blood coagulation in the CD31(+) cell population. Based on the transcriptomic signatures of the CD31(+) cells, we conclude that this ES cell-derived population contains endothelial-like cells that express the mesodermal marker BMP2 and possess angiogenic potential. The transcriptomic characterization of CD31(+) cells enables an in vitro functional genomic model to identify genes required for angiogenesis.
A Systems Biology Methodology Combining Transcriptome and Interactome Datasets to Assess the Implications of Cytokinin Signaling for Plant Immune Networks.

PubMed

Kunz, Meik; Dandekar, Thomas; Naseem, Muhammad

2017-01-01

Cytokinins (CKs) play an important role in plant growth and development. Also, several studies highlight the modulatory implications of CKs for plant-pathogen interaction. However, the underlying mechanisms of CK mediating immune networks in plants are still not fully understood. A detailed analysis of high-throughput transcriptome (RNA-Seq and microarrays) datasets under modulated conditions of plant CKs and its mergence with cellular interactome (large-scale protein-protein interaction data) has the potential to unlock the contribution of CKs to plant defense. Here, we specifically describe a detailed systems biology methodology pertinent to the acquisition and analysis of various omics datasets that delineate the role of plant CKs in impacting immune pathways in Arabidopsis.
Identification of Putative Precursor Genes for the Biosynthesis of Cannabinoid-Like Compound in Radula marginata

PubMed Central

Hussain, Tajammul; Plunkett, Blue; Ejaz, Mahwish; Espley, Richard V.; Kayser, Oliver

2018-01-01

The liverwort Radula marginata belongs to the bryophyte division of land plants and is a prospective alternate source of cannabinoid-like compounds. However, mechanistic insights into the molecular pathways directing the synthesis of these cannabinoid-like compounds have been hindered due to the lack of genetic information. This prompted us to do deep sequencing, de novo assembly and annotation of R. marginata transcriptome, which resulted in the identification and validation of the genes for cannabinoid biosynthetic pathway. In total, we have identified 11,421 putative genes encoding 1,554 enzymes from 145 biosynthetic pathways. Interestingly, we have identified all the upstream genes of the central precursor of cannabinoid biosynthesis, cannabigerolic acid (CBGA), including its two first intermediates, stilbene acid (SA) and geranyl diphosphate (GPP). Expression of all these genes was validated using quantitative real-time PCR. We have characterized the protein structure of stilbene synthase (STS), which is considered as a homolog of olivetolic acid in R. marginata. Moreover, the metabolomics approach enabled us to identify CBGA-analogous compounds using electrospray ionization mass spectrometry (ESI-MS/MS) and gas chromatography mass spectrometry (GC-MS). Transcriptomic analysis revealed 1085 transcription factors (TF) from 39 families. Comparative analysis showed that six TF families have been uniquely predicted in R. marginata. In addition, the bioinformatics analysis predicted a large number of simple sequence repeats (SSRs) and non-coding RNAs (ncRNAs). Our results collectively provide mechanistic insights into the putative precursor genes for the biosynthesis of cannabinoid-like compounds and a novel transcriptomic resource for R. marginata. The large-scale transcriptomic resource generated in this study would further serve as a reference transcriptome to explore the Radulaceae family.

How to normalize metatranscriptomic count data for differential expression analysis.

PubMed

Klingenberg, Heiner; Meinicke, Peter

2017-01-01

Differential expression analysis on the basis of RNA-Seq count data has become a standard tool in transcriptomics. Several studies have shown that prior normalization of the data is crucial for a reliable detection of transcriptional differences. Until now it has not been clear whether and how the transcriptomic approach can be used for differential expression analysis in metatranscriptomics. We propose a model for differential expression in metatranscriptomics that explicitly accounts for variations in the taxonomic composition of transcripts across different samples. As a main consequence the correct normalization of metatranscriptomic count data under this model requires the taxonomic separation of the data into organism-specific bins. Then the taxon-specific scaling of organism profiles yields a valid normalization and allows us to recombine the scaled profiles into a metatranscriptomic count matrix. This matrix can then be analyzed with statistical tools for transcriptomic count data. For taxon-specific scaling and recombination of scaled counts we provide a simple R script. When applying transcriptomic tools for differential expression analysis directly to metatranscriptomic data with an organism-independent (global) scaling of counts the resulting differences may be difficult to interpret. The differences may correspond to changing functional profiles of the contributing organisms but may also result from a variation of taxonomic abundances. Taxon-specific scaling eliminates this variation and therefore the resulting differences actually reflect a different behavior of organisms under changing conditions. In simulation studies we show that the divergence between results from global and taxon-specific scaling can be drastic. In particular, the variation of organism abundances can imply a considerable increase of significant differences with global scaling. 
Also, on real metatranscriptomic data, the predictions from taxon-specific and global scaling can differ widely. Our studies indicate that in real data applications performed with global scaling it might be impossible to distinguish between differential expression in terms of transcriptomic changes and differential composition in terms of changing taxonomic proportions. As in transcriptomics, a proper normalization of count data is also essential for differential expression analysis in metatranscriptomics. Our model implies a taxon-specific scaling of counts for normalization of the data. The application of taxon-specific scaling consequently removes taxonomic composition variations from functional profiles and therefore provides a clear interpretation of the observed functional differences.
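The scale-then-recombine idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' R script. The array layout (taxa x features x samples), the function names, and the use of simple library-size scaling as the per-taxon normalization are all assumptions made for illustration; the published method may use a different scaling estimator.

```python
import numpy as np

def taxon_specific_scaling(counts):
    """Scale each taxon's profile separately, then recombine.

    counts: array of shape (taxa, features, samples) holding raw counts
    that have already been binned by organism (assumed layout).
    """
    counts = np.asarray(counts, dtype=float)
    # Library size of every taxon in every sample: shape (taxa, samples).
    lib = counts.sum(axis=1)
    # Assumed target: each taxon's mean library size across samples.
    target = lib.mean(axis=1, keepdims=True)
    # Per-taxon, per-sample scale factor; taxa absent from a sample get 0.
    factors = np.divide(target, lib, out=np.zeros_like(lib), where=lib > 0)
    scaled = counts * factors[:, None, :]
    # Recombine the scaled organism profiles into one count matrix.
    return scaled.sum(axis=0)

def global_scaling(counts):
    """Pool all taxa first, then scale by total library size per sample."""
    pooled = np.asarray(counts, dtype=float).sum(axis=0)
    lib = pooled.sum(axis=0)
    return pooled * (lib.mean() / lib)

# Toy data (invented): taxon A doubles in abundance in sample 2 but keeps
# the same functional profile; taxon B is unchanged.
taxon_a = [[10, 20], [10, 20]]   # rows: features, columns: samples
taxon_b = [[30, 30], [10, 10]]
toy = np.array([taxon_a, taxon_b])

by_taxon = taxon_specific_scaling(toy)
by_global = global_scaling(toy)
```

In this toy case the taxon-specific result is identical in both samples, whereas global scaling leaves an apparent functional difference that is caused only by the change in taxonomic composition.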
Transcriptome characterisation of Pinus tabuliformis and evolution of genes in the Pinus phylogeny

PubMed Central

2013-01-01

Background The Chinese pine (Pinus tabuliformis) is an indigenous conifer species in northern China but is relatively underdeveloped as a genomic resource, which limits gene discovery and breeding. Large-scale transcriptome data were obtained using a next-generation sequencing platform to compensate for the lack of P. tabuliformis genomic information. Results The increasing amount of transcriptome data on Pinus provides an excellent resource for multi-gene phylogenetic analysis and studies on how conserved genes and functions are maintained in the face of species divergence. The first P. tabuliformis transcriptome from a normalised cDNA library of multiple tissues and individuals was sequenced in a full 454 GS-FLX run, producing 911,302 sequencing reads. The high quality overlapping expressed sequence tags (ESTs) were assembled into 46,584 putative transcripts, and more than 700 SSRs and 92,000 SNPs/InDels were characterised. Comparative analysis of the transcriptome of six conifer species yielded 191 orthologues, from which we inferred a phylogenetic tree and evolutionary patterns and calculated rates of gene diversion. We also identified 938 fast evolving sequences that may be useful for identifying genes that evolved in response to positive selection and might be responsible for speciation in the Pinus lineage. Conclusions A large collection of high-quality ESTs was obtained, de novo assembled and characterised, which represents a dramatic expansion of the current transcript catalogues of P. tabuliformis and which will gradually be applied in breeding programs of P. tabuliformis. Furthermore, these data will facilitate future studies of the comparative genomics of P. tabuliformis and other related species. PMID:23597112
A divide-and-conquer algorithm for large-scale de novo transcriptome assembly through combining small assemblies from existing algorithms.

PubMed

Sze, Sing-Hoi; Parrott, Jonathan J; Tarone, Aaron M

2017-12-06

While the continued development of high-throughput sequencing has facilitated studies of entire transcriptomes in non-model organisms, the incorporation of an increasing number of RNA-Seq libraries has made de novo transcriptome assembly difficult. Although algorithms that can assemble a large amount of RNA-Seq data are available, they are generally very memory-intensive and can only be used to construct small assemblies. We develop a divide-and-conquer strategy that allows these algorithms to be utilized, by subdividing a large RNA-Seq data set into small libraries. Each individual library is assembled independently by an existing algorithm, and a merging algorithm is developed to combine these assemblies by picking a subset of high quality transcripts to form a large transcriptome. When compared to existing algorithms that return a single assembly directly, this strategy achieves accuracy comparable to or higher than that of memory-efficient algorithms that can be used to process a large amount of RNA-Seq data, and accuracy comparable to or lower than that of memory-intensive algorithms that can only be used to construct small assemblies. Our divide-and-conquer strategy allows memory-intensive de novo transcriptome assembly algorithms to be utilized to construct large assemblies.
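The overall control flow of such a strategy (split, assemble each part independently, merge) can be sketched in Python as follows. This shows only the structure. The abstract does not describe the merging criterion, so the containment-based filter below is a stand-in assumption, and `toy_assemble` is a hypothetical placeholder for a real assembler.

```python
def chunk(libraries, size):
    """Divide the list of libraries into groups of at most `size`."""
    return [libraries[i:i + size] for i in range(0, len(libraries), size)]

def merge_assemblies(assemblies):
    """Stand-in merge step (assumption, not the published algorithm):
    keep unique transcripts, longest first, and drop any transcript that
    is fully contained in one already kept."""
    candidates = sorted({t for a in assemblies for t in a},
                        key=lambda t: (-len(t), t))
    kept = []
    for transcript in candidates:
        if not any(transcript in longer for longer in kept):
            kept.append(transcript)
    return kept

def divide_and_conquer(libraries, assemble, size):
    """Assemble each group independently, then merge the small assemblies."""
    return merge_assemblies([assemble(group) for group in chunk(libraries, size)])

def toy_assemble(group):
    """Hypothetical assembler: simply returns the reads it was given."""
    return [read for library in group for read in library]

# Invented toy reads, three libraries processed two at a time.
toy_libraries = [["ACGTAC", "GTAC"], ["TTGCA"], ["ACGTAC", "GCA"]]
merged = divide_and_conquer(toy_libraries, toy_assemble, size=2)
```

Because each group is assembled on its own, peak memory depends on the group size rather than on the full data set, which is the property the abstract relies on.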
Microfluidic single-cell whole-transcriptome sequencing.

PubMed

Streets, Aaron M; Zhang, Xiannian; Cao, Chen; Pang, Yuhong; Wu, Xinglong; Xiong, Liang; Yang, Lu; Fu, Yusi; Zhao, Liang; Tang, Fuchou; Huang, Yanyi

2014-05-13

Single-cell whole-transcriptome analysis is a powerful tool for quantifying gene expression heterogeneity in populations of cells. Many techniques have, thus, been recently developed to perform transcriptome sequencing (RNA-Seq) on individual cells. To probe subtle biological variation between samples with limiting amounts of RNA, more precise and sensitive methods are still required. We adapted a previously developed strategy for single-cell RNA-Seq that has shown promise for superior sensitivity and implemented the chemistry in a microfluidic platform for single-cell whole-transcriptome analysis. In this approach, single cells are captured and lysed in a microfluidic device, where mRNAs with poly(A) tails are reverse-transcribed into cDNA. Double-stranded cDNA is then collected and sequenced using a next generation sequencing platform. We prepared 94 libraries consisting of single mouse embryonic cells and technical replicates of extracted RNA and thoroughly characterized the performance of this technology. Microfluidic implementation increased mRNA detection sensitivity as well as improved measurement precision compared with tube-based protocols. With 0.2 M reads per cell, we were able to reconstruct a majority of the bulk transcriptome with 10 single cells. We also quantified variation between and within different types of mouse embryonic cells and found that enhanced measurement precision, detection sensitivity, and experimental throughput aided the distinction between biological variability and technical noise. With this work, we validated the advantages of an early approach to single-cell RNA-Seq and showed that the benefits of combining microfluidic technology with high-throughput sequencing will be valuable for large-scale efforts in single-cell transcriptome analysis.
RNA sequencing: current and prospective uses in metabolic research.

PubMed

Vikman, Petter; Fadista, Joao; Oskolkov, Nikolay

2014-10-01

Previous global RNA analysis was restricted to known transcripts in species with a defined transcriptome. Next generation sequencing has transformed transcriptomics by making it possible to analyse expressed genes with an exon level resolution from any tissue in any species without any a priori knowledge of which genes are being expressed, their splice patterns or their nucleotide sequence. In addition, RNA sequencing is a more sensitive technique than microarrays, with a larger dynamic range, and it also allows for investigation of imprinting and allele-specific expression. This can be done for a cost that is able to compete with that of a microarray, making RNA sequencing a technique available to most researchers. Therefore RNA sequencing has recently become the state of the art with regard to large-scale RNA investigations and has to a large extent replaced microarrays. The only drawback is the large amount of data produced, which together with the complexity of the data can make a researcher spend far more time on analysis than on performing the actual experiment. © 2014 Society for Endocrinology.
The Human Blood Metabolome-Transcriptome Interface

PubMed Central

Schramm, Katharina; Adamski, Jerzy; Gieger, Christian; Herder, Christian; Carstensen, Maren; Peters, Annette; Rathmann, Wolfgang; Roden, Michael; Strauch, Konstantin; Suhre, Karsten; Kastenmüller, Gabi; Prokisch, Holger; Theis, Fabian J.

2015-01-01

Biological systems consist of multiple organizational levels all densely interacting with each other to ensure function and flexibility of the system. Simultaneous analysis of cross-sectional multi-omics data from large population studies is a powerful tool to comprehensively characterize the underlying molecular mechanisms on a physiological scale. In this study, we systematically analyzed the relationship between fasting serum metabolomics and whole blood transcriptomics data from 712 individuals of the German KORA F4 cohort. Correlation-based analysis identified 1,109 significant associations between 522 transcripts and 114 metabolites summarized in an integrated network, the ‘human blood metabolome-transcriptome interface’ (BMTI). Bidirectional causality analysis using Mendelian randomization did not yield any statistically significant causal associations between transcripts and metabolites. A knowledge-based interpretation and integration with a genome-scale human metabolic reconstruction revealed systematic signatures of signaling, transport and metabolic processes, i.e. metabolic reactions mainly belonging to lipid, energy and amino acid metabolism. Moreover, the construction of a network based on functional categories illustrated the cross-talk between the biological layers at a pathway level. Using a transcription factor binding site enrichment analysis, this pathway cross-talk was further confirmed at a regulatory level. Finally, we demonstrated how the constructed networks can be used to gain novel insights into molecular mechanisms associated to intermediate clinical traits. Overall, our results demonstrate the utility of a multi-omics integrative approach to understand the molecular mechanisms underlying both normal physiology and disease. PMID:26086077
Transcriptome sequencing and annotation of the halophytic microalga Dunaliella salina

PubMed Central

Hong, Ling; Liu, Jun-li; Midoun, Samira Z.; Miller, Philip C.

2017-01-01

The unicellular green alga Dunaliella salina is well adapted to salt stress and contains compounds (including β-carotene and vitamins) with potential commercial value. A large transcriptome database of D. salina during the adjustment, exponential and stationary growth phases was generated using a high throughput sequencing platform. We characterized the metabolic processes in D. salina with a focus on valuable metabolites, with the aim of manipulating D. salina to achieve greater economic value in large-scale production through a bioengineering strategy. Gene expression profiles under salt stress verified using quantitative polymerase chain reaction (qPCR) implied that salt can regulate the expression of key genes. This study generated a substantial fraction of D. salina transcriptional sequences for the entire growth cycle, providing a basis for the discovery of novel genes. This first full-scale transcriptome study of D. salina establishes a foundation for further comparative genomic studies. PMID:28990374
Large-scale transcriptome sequencing and gene analyses in the crab-eating macaque (Macaca fascicularis) for biomedical research

PubMed Central

2012-01-01

Background As a substitute for human subjects, the crab-eating macaque (Macaca fascicularis) is an invaluable non-human primate model for biomedical research, but the lack of genetic information on this primate has represented a significant obstacle for its broader use. Results Here, we sequenced the transcriptome of 16 tissues originating from two individuals of crab-eating macaque (male and female), and identified genes to resolve the main obstacles for understanding the biological response of the crab-eating macaque. From 4 million reads with 1.4 billion base sequences, 31,786 isotigs containing genes similar to those of humans, 12,672 novel isotigs, and 348,160 singletons were identified using the GS FLX sequencing method. Approximately 86% of human genes were represented among the genes sequenced in this study. Additionally, 175 tissue-specific transcripts were identified, 81 of which were experimentally validated. In total, 4,314 alternative splicing (AS) events were identified and analyzed. Intriguingly, 10.4% of AS events were associated with transposable element (TE) insertions. Finally, investigation of TE exonization events and evolutionary analysis were conducted, revealing interesting phenomena of human-specific amplified trends in TE exonization events. Conclusions This report represents the first large-scale transcriptome sequencing and genetic analyses of M. fascicularis and could contribute to its utility for biomedical research and basic biology. PMID:22554259
The Transcriptome Analysis and Comparison Explorer--T-ACE: a platform-independent, graphical tool to process large RNAseq datasets of non-model organisms.

PubMed

Philipp, E E R; Kraemer, L; Mountfort, D; Schilhabel, M; Schreiber, S; Rosenstiel, P

2012-03-15

Next generation sequencing (NGS) technologies allow a rapid and cost-effective compilation of large RNA sequence datasets in model and non-model organisms. However, the storage and analysis of transcriptome information from different NGS platforms is still a significant bottleneck, leading to a delay in data dissemination and subsequent biological understanding. In particular, database interfaces with transcriptome analysis modules that go beyond mere read counts are lacking. Here, we present the Transcriptome Analysis and Comparison Explorer (T-ACE), a tool designed for the organization and analysis of large sequence datasets, and especially suited for transcriptome projects of non-model organisms with little or no a priori sequence information. T-ACE offers a TCL-based interface, which accesses a PostgreSQL database via a PHP script. Within T-ACE, information belonging to single sequences or contigs, such as annotation or read coverage, is linked to the respective sequence and immediately accessible. Sequences and assigned information can be searched via keyword or BLAST search. Additionally, T-ACE provides within- and between-transcriptome analysis modules on the level of expression, GO terms, KEGG pathways and protein domains. Results are visualized and can be easily exported for external analysis. We developed T-ACE for laboratory environments that have only limited bioinformatics support, and for collaborative projects in which different partners work on the same dataset from different locations or platforms (Windows/Linux/MacOS). For laboratories with some experience in bioinformatics and programming, the low complexity of the database structure and open-source code provides a framework that can be customized according to the different needs of the user and transcriptome project.
Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies

PubMed Central

Zhao, Shanrong; Prenger, Kurt; Smith, Lance

2013-01-01

RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost effective, and open-source based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of the box to process Illumina RNA-Seq datasets. PMID:25937948
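The figures quoted in this abstract imply the following totals for the 178-sample test. This is a back-of-the-envelope Python calculation using only the reported numbers. It assumes that every sample incurs the average cost, and the variable names are illustrative.

```python
# Figures reported in the abstract.
SAMPLES = 178
COST_PER_SAMPLE_USD = 3.50
HOURS_PER_SAMPLE = (6, 8)  # reported range for a 100-million-read sample

# Total spend if every sample costs the reported average (assumption).
total_cost_usd = SAMPLES * COST_PER_SAMPLE_USD

# Cumulative compute time summed over all samples, as a (low, high) range.
total_sample_hours = tuple(SAMPLES * h for h in HOURS_PER_SAMPLE)

print(f"Estimated total cost: ${total_cost_usd:.2f}")
print(f"Cumulative compute: {total_sample_hours[0]} to {total_sample_hours[1]} sample-hours")
```

If all samples could run fully in parallel, the wall-clock time would stay near the per-sample range of 6 to 8 hours, while the cumulative compute time is what the cost scales with.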
Comparative transcriptomics with self-organizing map reveals cryptic photosynthetic differences between two accessions of North American Lake cress.

PubMed

Nakayama, Hokuto; Sakamoto, Tomoaki; Okegawa, Yuki; Kaminoyama, Kaori; Fujie, Manabu; Ichihashi, Yasunori; Kurata, Tetsuya; Motohashi, Ken; Al-Shehbaz, Ihsan; Sinha, Neelima; Kimura, Seisuke

2018-02-19

Because natural variation in wild species is likely the result of local adaptation, it provides a valuable resource for understanding plant-environmental interactions. Rorippa aquatica (Brassicaceae) is a semi-aquatic North American plant with morphological differences between several accessions, but little information available on any physiological differences. Here, we surveyed the transcriptomes of two R. aquatica accessions and identified cryptic physiological differences between them. We first reconstructed a Rorippa phylogeny to confirm relationships between the accessions. We performed large-scale RNA-seq and de novo assembly; the resulting 87,754 unigenes were then annotated via comparisons to different databases. Between-accession physiological variation was identified with transcriptomes from both accessions. Transcriptome data were analyzed with principal component analysis and self-organizing map. Results of analyses suggested that photosynthetic capability differs between the accessions. Indeed, physiological experiments revealed between-accession variation in electron transport rate and the redox state of the plastoquinone pool. These results indicated that one accession may have adapted to differences in temperature or length of the growing season.
Comprehensive evaluation of AmpliSeq transcriptome, a novel targeted whole transcriptome RNA sequencing methodology for global gene expression analysis.

PubMed

Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich

2015-12-16

Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate AmpliSeq as a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variation. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
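For readers unfamiliar with the concordance measures named here, the sketch below computes Pearson's r, RMSD and the Matthews correlation coefficient in plain Python using their standard textbook definitions. The log2 fold-change values and the cutoff of 1 for calling a gene differentially expressed are invented for illustration and are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rmsd(x, y):
    """Root-mean-square deviation between paired values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def matthews_cc(truth, pred):
    """Matthews correlation coefficient for paired boolean calls."""
    tp = sum(t and p for t, p in zip(truth, pred))
    tn = sum((not t) and (not p) for t, p in zip(truth, pred))
    fp = sum((not t) and p for t, p in zip(truth, pred))
    fn = sum(t and (not p) for t, p in zip(truth, pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Invented log2 fold changes for four genes on two platforms.
reference_lfc = [2.0, -1.0, 0.5, 3.0]
targeted_lfc = [1.8, -1.2, 0.4, 2.9]

CUTOFF = 1.0  # assumed threshold for a differential-expression call
reference_calls = [abs(v) >= CUTOFF for v in reference_lfc]
targeted_calls = [abs(v) >= CUTOFF for v in targeted_lfc]

r = pearson_r(reference_lfc, targeted_lfc)
deviation = rmsd(reference_lfc, targeted_lfc)
mcc = matthews_cc(reference_calls, targeted_calls)
```

Pearson's r and RMSD compare the fold changes directly, while the Matthews coefficient compares the binary differential-expression calls derived from them, so the three measures capture different aspects of agreement.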
In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development.

PubMed

Ozerov, Ivan V; Lezhnina, Ksenia V; Izumchenko, Evgeny; Artemov, Artem V; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N; Labat, Ivan; West, Michael D; Buzdin, Anton; Cantor, Charles R; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

2016-11-16

Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy.
In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development

PubMed Central

Ozerov, Ivan V.; Lezhnina, Ksenia V.; Izumchenko, Evgeny; Artemov, Artem V.; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N.; Labat, Ivan; West, Michael D.; Buzdin, Anton; Cantor, Charles R.; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

2016-01-01

Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy. PMID:27848968
RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome.

PubMed

Wenger, Yvan; Galliot, Brigitte

2013-03-25

Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.
RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome

PubMed Central

2013-01-01

Background Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. Results To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48’909 unique sequences including splice variants, representing approximately 24’450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10’597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11’270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. Conclusions We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871
Understanding and Controlling Sialylation in a CHO Fc-Fusion Process

PubMed Central

Lewis, Amanda M.; Croughan, William D.; Aranibar, Nelly; Lee, Alison G.; Warrack, Bethanne; Abu-Absi, Nicholas R.; Patel, Rutva; Drew, Barry; Borys, Michael C.; Reily, Michael D.; Li, Zheng Jian

2016-01-01

A Chinese hamster ovary (CHO) bioprocess, where the product is a sialylated Fc-fusion protein, was operated at pilot and manufacturing scale and significant variation of sialylation level was observed. In order to more tightly control glycosylation profiles, we sought to identify the cause of variability. Untargeted metabolomics and transcriptomics methods were applied to select samples from the large scale runs. Lower sialylation was correlated with elevated mannose levels, a shift in glucose metabolism, and increased oxidative stress response. Using a 5-L scale model operated with a reduced dissolved oxygen set point, we were able to reproduce the phenotypic profiles observed at manufacturing scale including lower sialylation, higher lactate and lower ammonia levels. Targeted transcriptomics and metabolomics confirmed that reduced oxygen levels resulted in increased mannose levels, a shift towards glycolysis, and increased oxidative stress response similar to the manufacturing scale. Finally, we propose a biological mechanism linking large scale operation and sialylation variation. Oxidative stress results from gas transfer limitations at large scale and the presence of oxygen dead-zones inducing upregulation of glycolysis and mannose biosynthesis, and downregulation of hexosamine biosynthesis and acetyl-CoA formation. The lower flux through the hexosamine pathway and reduced intracellular pools of acetyl-CoA led to reduced formation of N-acetylglucosamine and N-acetylneuraminic acid, both key building blocks of N-glycan structures. This study reports for the first time a link between oxidative stress and mammalian protein sialylation. In this study, process, analytical, metabolomic, and transcriptomic data at manufacturing, pilot, and laboratory scales were taken together to develop a systems level understanding of the process and identify oxygen limitation as the root cause of glycosylation variability. PMID:27310468
Fish-T1K (Transcriptomes of 1,000 Fishes) Project: large-scale transcriptome data for fish evolution studies.

PubMed

Sun, Ying; Huang, Yu; Li, Xiaofeng; Baldwin, Carole C; Zhou, Zhuocheng; Yan, Zhixiang; Crandall, Keith A; Zhang, Yong; Zhao, Xiaomeng; Wang, Min; Wong, Alex; Fang, Chao; Zhang, Xinhui; Huang, Hai; Lopez, Jose V; Kilfoyle, Kirk; Zhang, Yong; Ortí, Guillermo; Venkatesh, Byrappa; Shi, Qiong

2016-01-01

Ray-finned fishes (Actinopterygii) represent more than 50% of extant vertebrates and are of great evolutionary, ecological and economic significance, but they are relatively underrepresented in 'omics studies. Increased availability of transcriptome data for these species will allow researchers to better understand changes in gene expression, and to carry out functional analyses. An international project known as the "Transcriptomes of 1,000 Fishes" (Fish-T1K) project has been established to generate RNA-seq transcriptome sequences for 1,000 diverse species of ray-finned fishes. The first phase of this project has produced transcriptomes from more than 180 ray-finned fishes, representing 142 species and covering 51 orders and 109 families. Here we provide an overview of the goals of this project and the work done so far.
An Approach to Function Annotation for Proteins of Unknown Function (PUFs) in the Transcriptome of Indian Mulberry.

PubMed

Dhanyalakshmi, K H; Naika, Mahantesha B N; Sajeevan, R S; Mathew, Oommen K; Shafi, K Mohamed; Sowdhamini, Ramanathan; N Nataraja, Karaba

2016-01-01

Modern sequencing technologies are generating large volumes of information at the transcriptome and genome level. Translating this information into biological meaning lags far behind, so a significant portion of the proteins discovered remain proteins of unknown function (PUFs). Attempts to uncover the functional significance of PUFs are limited by the lack of easy, high-throughput functional annotation tools. Here, we report an approach to assign putative functions to PUFs identified in the transcriptome of mulberry, a perennial tree commonly cultivated as the host of the silkworm. We utilized the mulberry PUFs generated from leaf tissues exposed to drought stress at the whole-plant level. A sequence- and structure-based computational analysis predicted the probable function of the PUFs. For rapid and easy annotation of PUFs, we developed an automated pipeline integrating diverse bioinformatics tools, designated the PUFs Annotation Server (PUFAS), which also provides a web service API (Application Programming Interface) for large-scale analysis up to genome scale. The expression analysis of three selected PUFs annotated by the pipeline revealed abiotic stress responsiveness of the genes, and hence their potential role in stress acclimation pathways. The automated pipeline developed here could be extended to assign functions to PUFs from any organism. The PUFAS web server is available at http://caps.ncbs.res.in/pufas/ and the web service is accessible at http://capservices.ncbs.res.in/help/pufas.

Next-Generation Sequencing of the Chrysanthemum nankingense (Asteraceae) Transcriptome Permits Large-Scale Unigene Assembly and SSR Marker Discovery

PubMed Central

Wang, Haibin; Jiang, Jiafu; Chen, Sumei; Qi, Xiangyu; Peng, Hui; Li, Pirui; Song, Aiping; Guan, Zhiyong; Fang, Weimin; Liao, Yuan; Chen, Fadi

2013-01-01

Background Simple sequence repeats (SSRs) are ubiquitous in eukaryotic genomes. Chrysanthemum is one of the largest genera in the Asteraceae family. Only a few Chrysanthemum expressed sequence tag (EST) sequences have been acquired to date, so the number of available EST-SSR markers is very low. Methodology/Principal Findings Illumina paired-end sequencing technology produced over 53 million sequencing reads from C. nankingense mRNA. The subsequent de novo assembly yielded 70,895 unigenes, of which 45,789 (64.59%) showed similarity to sequences in the NCBI database. Out of these 45,789 sequences, 107 have hits to the Chrysanthemum Nr protein database; 679 and 277 sequences have hits to the databases of Helianthus and Lactuca species, respectively. MISA software identified a large number of putative EST-SSRs, allowing 1,788 primer pairs to be designed from the de novo transcriptome sequence and a further 363 from archival EST sequence. Among 100 randomly chosen primer pairs, 81 markers produced amplicons and 20 were polymorphic in genotype analysis in Chrysanthemum. The results showed that most (but not all) of the assays were transferable across species and that they exposed a significant amount of allelic diversity. Conclusions/Significance SSR markers acquired by transcriptome sequencing are potentially useful for marker-assisted breeding and genetic analysis in the genus Chrysanthemum and its related genera. PMID:23626799
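The SSR discovery step described above can be illustrated with a small regular-expression scan. This is only a sketch of the general idea of finding perfect tandem repeats; it is not MISA's code, and the function name find_ssrs and the minimum repeat thresholds are illustrative assumptions, not the parameters used in the study.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        # Hypothetical thresholds: motif length -> minimum number of repeats.
        min_repeats = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT").
            if any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

example = "TTGC" + "AG" * 7 + "CC" + "CTT" * 5 + "GA"
print(find_ssrs(example))
```

Each reported position could then be padded with flanking sequence and passed to a primer design tool, which is the step that yields primer pairs in a workflow like the one in the abstract.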
Transcriptome difference and potential crosstalk between liver and mammary tissue in mid-lactation primiparous dairy cows.

PubMed

Bu, Dengpan; Bionaz, Massimo; Wang, Mengzhi; Nan, Xuemei; Ma, Lu; Wang, Jiaqi

2017-01-01

Liver and mammary gland are among the most important organs during lactation in dairy cows. With the purpose of understanding both the different and the complementary roles and the crosstalk of those two organs during lactation, a transcriptome analysis was performed on liver and mammary tissues of 10 primiparous dairy cows in mid-lactation. The analysis was performed using a 4×44K Bovine Agilent microarray chip. The transcriptome difference between the two tissues was analyzed using SAS JMP Genomics using ANOVA with a false discovery rate correction (FDR). The analysis uncovered >9,000 genes differentially expressed (DEG) between the two tissues with a FDR<0.001. The functional analysis of the DEG uncovered a larger metabolic (especially related to lipid) and inflammatory response capacity in liver compared with mammary tissue while the mammary tissue had a larger protein synthesis and secretion, proliferation/differentiation, signaling, and innate immune system capacity compared with the liver. A plethora of endogenous compounds, cytokines, and transcription factors were estimated to control the DEG between the two tissues. Compared with mammary tissue, the liver transcriptome appeared to be under control of a large array of ligand-dependent nuclear receptors and, among endogenous chemical, fatty acids and bacteria-derived compounds. Compared with liver, the transcriptome of the mammary tissue was potentially under control of a large number of growth factors and miRNA. The in silico crosstalk analysis between the two tissues revealed an overall large communication with a reciprocal control of lipid metabolism, innate immune system adaptation, and proliferation/differentiation. 
In summary the transcriptome analysis confirmed prior known differences between liver and mammary tissue, especially considering the indication of a larger metabolic activity in liver compared with the mammary tissue and the larger protein synthesis, communication, and proliferative capacity in mammary tissue compared with the liver. Relatively novel is the indication by the data that the transcriptome of the liver is highly regulated by dietary and bacteria-related compounds while the mammary transcriptome is more under control of hormones, growth factors, and miRNA. A large crosstalk between the two tissues with a reciprocal control of metabolism and innate immune-adaptation was indicated by the network analysis that allowed uncovering previously unknown crosstalk between liver and mammary tissue for several signaling molecules.
Transcriptome difference and potential crosstalk between liver and mammary tissue in mid-lactation primiparous dairy cows

PubMed Central

Bu, Dengpan; Bionaz, Massimo; Wang, Mengzhi; Nan, Xuemei; Ma, Lu; Wang, Jiaqi

2017-01-01

Liver and mammary gland are among the most important organs during lactation in dairy cows. With the purpose of understanding both the different and the complementary roles and the crosstalk of those two organs during lactation, a transcriptome analysis was performed on liver and mammary tissues of 10 primiparous dairy cows in mid-lactation. The analysis was performed using a 4×44K Bovine Agilent microarray chip. The transcriptome difference between the two tissues was analyzed using SAS JMP Genomics using ANOVA with a false discovery rate correction (FDR). The analysis uncovered >9,000 genes differentially expressed (DEG) between the two tissues with a FDR<0.001. The functional analysis of the DEG uncovered a larger metabolic (especially related to lipid) and inflammatory response capacity in liver compared with mammary tissue while the mammary tissue had a larger protein synthesis and secretion, proliferation/differentiation, signaling, and innate immune system capacity compared with the liver. A plethora of endogenous compounds, cytokines, and transcription factors were estimated to control the DEG between the two tissues. Compared with mammary tissue, the liver transcriptome appeared to be under control of a large array of ligand-dependent nuclear receptors and, among endogenous chemical, fatty acids and bacteria-derived compounds. Compared with liver, the transcriptome of the mammary tissue was potentially under control of a large number of growth factors and miRNA. The in silico crosstalk analysis between the two tissues revealed an overall large communication with a reciprocal control of lipid metabolism, innate immune system adaptation, and proliferation/differentiation. 
In summary the transcriptome analysis confirmed prior known differences between liver and mammary tissue, especially considering the indication of a larger metabolic activity in liver compared with the mammary tissue and the larger protein synthesis, communication, and proliferative capacity in mammary tissue compared with the liver. Relatively novel is the indication by the data that the transcriptome of the liver is highly regulated by dietary and bacteria-related compounds while the mammary transcriptome is more under control of hormones, growth factors, and miRNA. A large crosstalk between the two tissues with a reciprocal control of metabolism and innate immune-adaptation was indicated by the network analysis that allowed uncovering previously unknown crosstalk between liver and mammary tissue for several signaling molecules. PMID:28291785
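The FDR threshold quoted above (FDR<0.001) depends on a multiple-testing adjustment of the per-gene p-values. The abstract does not say which procedure JMP Genomics applied, so the sketch below assumes the widely used Benjamini-Hochberg step-up method; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
# Genes whose adjusted p-value falls below the chosen FDR would be called DEG.
print([q < 0.03 for q in qvals])
```

With tens of thousands of probes on a microarray, an adjustment of this kind is what keeps the expected share of false positives among the reported DEG below the stated threshold.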
Transcriptome profiling analysis of cultivar-specific apple fruit ripening and texture attributes

USDA-ARS's Scientific Manuscript database

Molecular events regulating cultivar-specific apple fruit ripening and sensory quality are largely unknown. Such knowledge is essential for genomic-assisted apple breeding and postharvest quality management. In this study, transcriptome profile analysis, scanning electron microscopic examination an...
Analysis of transcriptome in hickory (Carya cathayensis), and uncover the dynamics in the hormonal signaling pathway during graft process.

PubMed

Qiu, Lingling; Jiang, Bo; Fang, Jia; Shen, Yike; Fang, Zhongxiang; Rm, Saravana Kumar; Yi, Keke; Shen, Chenjia; Yan, Daoliang; Zheng, Bingsong

2016-11-17

Hickory (Carya cathayensis), a woody plant with high nutritional and economic value, is widely planted in China. Due to its long juvenile phase, grafting is a useful technique for large-scale cultivation of hickory. To reveal the molecular mechanism during the graft process, we sequenced the transcriptomes of graft union in hickory. In our study, six RNA-seq libraries yielded a total of 83,676,860 clean short reads comprising 4.19 Gb of sequence data. A large number of differentially expressed genes (DEGs) at three time points during the graft process were identified. In detail, 777 DEGs in the 7 d vs 0 d (day after grafting) comparison were classified into 11 enriched Gene Ontology (GO) categories, and 262 DEGs in the 14 d vs 0 d comparison were classified into 15 enriched GO categories. Furthermore, an overview of the PPI network was constructed from these DEGs. In addition, 20 genes related to the auxin- and cytokinin-signaling pathways were identified, and some were validated by qRT-PCR analysis. Our comprehensive analysis provides basic information on the candidate genes and hormone signaling pathways involved in the graft process in hickory and other woody plants.
bigSCale: an analytical framework for big-scale single-cell data.

PubMed

Iacono, Giovanni; Mereu, Elisabetta; Guillaumet-Adkins, Amy; Corominas, Roser; Cuscó, Ivon; Rodríguez-Esteban, Gustavo; Gut, Marta; Pérez-Jurado, Luis Alberto; Gut, Ivo; Heyn, Holger

2018-06-01

Single-cell RNA sequencing (scRNA-seq) has significantly deepened our insights into complex tissues, with the latest techniques capable of processing tens of thousands of cells simultaneously. Analyzing increasing numbers of cells, however, generates extremely large data sets, extending processing time and challenging computing resources. Current scRNA-seq analysis tools are not designed to interrogate large data sets and often lack sensitivity to identify marker genes. With bigSCale, we provide a scalable analytical framework to analyze millions of cells, which addresses the challenges associated with large data sets. To handle the noise and sparsity of scRNA-seq data, bigSCale uses large sample sizes to estimate an accurate numerical model of noise. The framework further includes modules for differential expression analysis, cell clustering, and marker identification. A directed convolution strategy allows processing of extremely large data sets, while preserving transcript information from individual cells. We evaluated the performance of bigSCale using both a biological model of aberrant gene expression in patient-derived neuronal progenitor cells and simulated data sets, which underlines the speed and accuracy in differential expression analysis. To test its applicability for large data sets, we applied bigSCale to assess 1.3 million cells from the mouse developing forebrain. Its directed down-sampling strategy accumulates information from single cells into index cell transcriptomes, thereby defining cellular clusters with improved resolution. Accordingly, index cell clusters identified rare populations, such as reelin (Reln)-positive Cajal-Retzius neurons, for which we report previously unrecognized heterogeneity associated with distinct differentiation stages, spatial organization, and cellular function. Together, bigSCale presents a solution to address future challenges of large single-cell data sets.
© 2018 Iacono et al.; Published by Cold Spring Harbor Laboratory Press.
Transcriptomic Analysis of Paeonia delavayi Wild Population Flowers to Identify Differentially Expressed Genes Involved in Purple-Red and Yellow Petal Pigmentation

PubMed Central

Wang, Yan; Li, Kui; Zheng, Baoqiang; Miao, Kun

2015-01-01

Tree peony (Paeonia suffruticosa Andrews) is a very famous traditional ornamental plant in China. P. delavayi is a species endemic to Southwest China that has aroused great interest from researchers as a precious genetic resource for flower color breeding. However, the current understanding of the molecular mechanisms of flower pigmentation in this plant is limited, hindering the genetic engineering of novel flower color in tree peonies. In this study, we conducted a large-scale transcriptome analysis based on Illumina HiSeq sequencing of cDNA libraries generated from yellow and purple-red P. delavayi petals. A total of 90,202 unigenes were obtained by de novo assembly, with an average length of 721 nt. Using Blastx, 44,811 unigenes (49.68%) were found to have significant similarity to accessions in the NR, NT, and Swiss-Prot databases. We also examined COG, GO and KEGG annotations to better understand the functions of these unigenes. Further analysis of the two digital transcriptomes revealed that 6,855 unigenes were differentially expressed between yellow and purple-red flower petals, with 3,430 up-regulated and 3,425 down-regulated. According to the RNA-Seq data and qRT-PCR analysis, we proposed that four up-regulated key structural genes, including F3H, DFR, ANS and 3GT, might play an important role in purple-red petal pigmentation, while high co-expression of THC2'GT, CHI and FNS II ensures the accumulation of pigments contributing to the yellow color. We also found 50 differentially expressed transcription factors that might be involved in flavonoid biosynthesis. This study is the first to report genetic information for P. delavayi. The large number of gene sequences produced by transcriptome sequencing and the candidate genes identified using pathway mapping and expression profiles will provide a valuable resource for future association studies aimed at better understanding the molecular mechanisms underlying flower pigmentation in tree peonies. PMID:26267644
RNA-seq analysis of Rubus idaeus cv. Nova: transcriptome sequencing and de novo assembly for subsequent functional genomics approaches.

PubMed

Hyun, Tae Kyung; Lee, Sarah; Kumar, Dhinesh; Rim, Yeonggil; Kumar, Ritesh; Lee, Sang Yeol; Lee, Choong Hwan; Kim, Jae-Yean

2014-10-01

Using Illumina sequencing technology, we have generated large-scale transcriptome sequencing data containing abundant information on genes involved in the metabolic pathways in R. idaeus cv. Nova fruits. Rubus idaeus (red raspberry) is an economically important crop that possesses numerous nutrients, micronutrients and phytochemicals with essential health benefits to humans. The molecular mechanism underlying the ripening process and phytochemical biosynthesis in red raspberry is attributed to changes in gene expression, but very limited transcriptomic and genomic information is available in public databases. To address this issue, we generated more than 51 million sequencing reads from R. idaeus cv. Nova fruit using Illumina RNA-Seq technology. After de novo assembly, we obtained 42,604 unigenes with an average length of 812 bp. At the protein level, the Nova fruit transcriptome showed 77 and 68 % sequence similarities with Rubus coreanus and Fragaria vesca, respectively, indicating the evolutionary relationship between them. In addition, 69 % of assembled unigenes were annotated using public databases including the NCBI non-redundant, Cluster of Orthologous Groups and Gene Ontology databases, suggesting that our transcriptome dataset provides a valuable resource for investigating metabolic processes in red raspberry. To analyze the relationship between several novel transcripts and the amounts of metabolites such as γ-aminobutyric acid and anthocyanins, real-time PCR and target metabolite analysis were performed on two different ripening stages of Nova. This is the first attempt to use the Illumina sequencing platform for RNA sequencing and de novo assembly of Nova fruit without a reference genome. Our data provide the most comprehensive transcriptome resource available for Rubus fruits, and will be useful for understanding the ripening process and for breeding R. idaeus cultivars with improved fruit quality.
Cross-disease transcriptomics: Unique IL-17A signaling in psoriasis lesions and an autoimmune PBMC signature

PubMed Central

Sarkar, Mrinal K.; Liang, Yun; Xing, Xianying; Gudjonsson, Johann E.

2016-01-01

Transcriptome studies of psoriasis have identified robust changes in mRNA expression through large-scale analysis of patient cohorts. These studies, however, have analyzed all mRNA changes in aggregate, without distinguishing between disease-specific and non-specific differentially expressed genes (DEGs). In this study, RNA-seq meta-analysis was used to identify (1) psoriasis-specific DEGs altered in few diseases besides psoriasis and (2) non-specific DEGs similarly altered in many other skin conditions. We show that few cutaneous DEGs are psoriasis-specific and that the two DEG classes differ in their cell type and cytokine associations. Psoriasis-specific DEGs are expressed by keratinocytes and induced by IL-17A, whereas non-specific DEGs are expressed by inflammatory cells and induced by IFN-gamma and TNF. PBMC-derived DEGs were more psoriasis-specific than cutaneous DEGs. Nonetheless, PBMC DEGs associated with MHC class I and NK cells were commonly downregulated in psoriasis and other autoimmune diseases (e.g., multiple sclerosis, sarcoidosis and juvenile rheumatoid arthritis). These findings demonstrate “cross-disease” transcriptomics as an approach to gain insights into the cutaneous and non-cutaneous psoriasis transcriptomes. This highlighted unique contributions of IL-17A to the cytokine network and uncovered a blood-based gene signature that links psoriasis to other diseases of autoimmunity. PMID:27206706
BABAR: an R package to simplify the normalisation of common reference design microarray-based transcriptomic datasets

PubMed Central

2010-01-01

Background The development of DNA microarrays has facilitated the generation of hundreds of thousands of transcriptomic datasets. The use of a common reference microarray design allows existing transcriptomic data to be readily compared and re-analysed in the light of new data, and the combination of this design with large datasets is ideal for 'systems'-level analyses. One issue is that these datasets are typically collected over many years and may be heterogeneous in nature, containing different microarray file formats and gene array layouts, dye-swaps, and showing varying scales of log2- ratios of expression between microarrays. Excellent software exists for the normalisation and analysis of microarray data but many data have yet to be analysed as existing methods struggle with heterogeneous datasets; options include normalising microarrays on an individual or experimental group basis. Our solution was to develop the Batch Anti-Banana Algorithm in R (BABAR) algorithm and software package which uses cyclic loess to normalise across the complete dataset. We have already used BABAR to analyse the function of Salmonella genes involved in the process of infection of mammalian cells. Results The only input required by BABAR is unprocessed GenePix or BlueFuse microarray data files. BABAR provides a combination of 'within' and 'between' microarray normalisation steps and diagnostic boxplots. When applied to a real heterogeneous dataset, BABAR normalised the dataset to produce a comparable scaling between the microarrays, with the microarray data in excellent agreement with RT-PCR analysis. When applied to a real non-heterogeneous dataset and a simulated dataset, BABAR's performance in identifying differentially expressed genes showed some benefits over standard techniques. Conclusions BABAR is an easy-to-use software tool, simplifying the simultaneous normalisation of heterogeneous two-colour common reference design cDNA microarray-based transcriptomic datasets. 
We show BABAR transforms real and simulated datasets to allow for the correct interpretation of these data, and is the ideal tool to facilitate the identification of differentially expressed genes or network inference analysis from transcriptomic datasets. PMID:20128918
A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome

PubMed Central

Hanriot, Lucie; Keime, Céline; Gay, Nadine; Faure, Claudine; Dossat, Carole; Wincker, Patrick; Scoté-Blachon, Céline; Peyron, Christelle; Gandrillon, Olivier

2008-01-01

Background "Open" transcriptome analysis methods allow the study of gene expression without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results In order to make a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome that is as reliable as a LongSAGE library sequenced by the Sanger method. PMID:18796152
The developmental transcriptome atlas of the spoon worm Urechis unicinctus (Echiurida: Annelida).

PubMed

Park, Chungoo; Han, Yong-Hee; Lee, Sung-Gwon; Ry, Kyoung-Bin; Oh, Jooseong; Kern, Elizabeth M A; Park, Joong-Ki; Cho, Sung-Jin

2018-03-01

Echiurida is one of the most intriguing major subgroups of Annelida because, unlike most other annelids, echiurids lack metameric body segmentation as adults. For this reason, transcriptome analyses from various developmental stages of echiurid species can be of substantial value for understanding precise expression levels and the complex regulatory networks during early and larval development. A total of 914 million raw RNA-Seq reads were produced from 14 developmental stages of Urechis unicinctus and were de novo assembled into contigs spanning 63,928,225 bp with an N50 length of 2700 bp. The resulting comprehensive transcriptome database of the early developmental stages of U. unicinctus consists of 20,305 representative functional protein-coding transcripts. Approximately 66% of unigenes were assigned to superphylum-level taxa, including Lophotrochozoa (40%). The completeness of the transcriptome assembly was assessed using benchmarking universal single-copy orthologs; 75.7% of the single-copy orthologs were present in our transcriptome database. We observed 3 distinct patterns of global transcriptome profiles from 14 developmental stages and identified 12,705 genes that showed dynamic regulation patterns during the differentiation and maturation of U. unicinctus cells. We present the first large-scale developmental transcriptome dataset of U. unicinctus and provide a general overview of the dynamics of global gene expression changes during its early developmental stages. The analysis of time-course gene expression data is a first step toward understanding the complex developmental gene regulatory networks in U. unicinctus and will furnish a valuable resource for analyzing the functions of gene repertoires in various developmental phases.
Global Analysis of Transcriptome Responses and Gene Expression Profiles to Cold Stress of Jatropha curcas L.

PubMed Central

Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming

2013-01-01

Background Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. Results In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. Conclusions This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress. 
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas. PMID:24349370
Global analysis of transcriptome responses and gene expression profiles to cold stress of Jatropha curcas L.

PubMed

Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming

2013-01-01

Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress. 
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas.
paraGSEA: a scalable approach for large-scale gene expression profiling

PubMed Central

Peng, Shaoliang; Yang, Shunyun

2017-01-01

More studies have been conducted using gene expression similarity to identify functional connections among genes, diseases and drugs. Gene Set Enrichment Analysis (GSEA) is a powerful analytical method for interpreting gene expression data. However, due to its enormous computational overhead in the significance-level estimation step and the multiple hypothesis testing step, its computational scalability and efficiency are poor on large-scale datasets. We proposed paraGSEA for efficient large-scale transcriptome data analysis. By optimization, the overall time complexity of paraGSEA is reduced from O(mn) to O(m+n), where m is the length of the gene sets and n is the length of the gene expression profiles, which yields a more than 100-fold increase in performance compared with other popular GSEA implementations such as GSEA-P, SAM-GS and GSEA2. By further parallelization, a near-linear speed-up is gained on both workstations and clusters, with high scalability and performance on large-scale datasets. The analysis time of the whole LINCS phase I dataset (GSE92742) was reduced to nearly half an hour on a 1000-node cluster on Tianhe-2, or within 120 hours on a 96-core workstation. The source code of paraGSEA is licensed under the GPLv3 and available at http://github.com/ysycloud/paraGSEA. PMID:28973463
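The O(mn) versus O(m+n) contrast above can be made concrete with a single enrichment-score evaluation. The sketch below uses the classic unweighted running-sum statistic and a hash set for membership tests, so one pass over a ranked profile costs O(m+n) rather than the O(mn) of scanning the gene set for every ranked gene. Whether this is the optimization paraGSEA actually uses is not stated in the abstract; the function name and the toy gene identifiers are assumptions.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style running sum; returns the largest deviation from zero."""
    members = set(gene_set)  # built once in O(m); lookups are O(1) on average
    n = len(ranked_genes)
    hits = sum(1 for gene in ranked_genes if gene in members)
    if hits == 0 or hits == n:
        return 0.0  # score is undefined without both hits and misses
    step_hit = 1.0 / hits
    step_miss = 1.0 / (n - hits)
    running = 0.0
    best = 0.0
    for gene in ranked_genes:  # single O(n) pass over the ranked profile
        running += step_hit if gene in members else -step_miss
        if abs(running) > abs(best):
            best = running
    return best

ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
print(enrichment_score(ranked, {"g1", "g2"}))  # gene set at the top of the ranking
print(enrichment_score(ranked, {"g5", "g6"}))  # gene set at the bottom of the ranking
```

Significance estimation repeats this evaluation over many permutations and many profile pairs, which is why the per-evaluation cost and parallelization dominate run time on datasets of LINCS size.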
De novo transcriptome assembly analysis of weed Apera spica-venti from seven tissues and growth stages.

PubMed

Babineau, Marielle; Mahmood, Khalid; Mathiassen, Solvejg K; Kudsk, Per; Kristensen, Michael

2017-02-06

Loose silky bentgrass (Apera spica-venti) is an important weed in Europe with a recent increase in herbicide resistance cases. The lack of genetic information about this noxious weed limits its biological understanding such as growth, reproduction, genetic variation, molecular ecology and metabolic herbicide resistance. This study produced a reference transcriptome for A. spica-venti from different tissues (leaf, root, stem) and various growth stages (seed at phenological stages 05, 07, 08, 09). The de novo assembly was performed on individual and combined dataset followed by functional annotations. Individual transcripts and gene families involved in metabolic based herbicide resistance were identified. Eight separate transcriptome assemblies were performed and compared. The combined transcriptome assembly consists of 83,349 contigs with an N50 and average contig length of 762 and 658 bp, respectively. This dataset contains 74,724 transcripts consisting of total 54,846,111 bp. Among them 94% had a homologue to UniProtKB, 73% retrieved a GO mapping, and 50% were functionally annotated. Compared with other grass species, A. spica-venti has 26% proteins in common to Brachypodium distachyon, and 41% to Lolium spp. Glycosyltransferases had the highest number of transcripts in each tissue followed by the cytochrome P450s. The GSTF1 and CYP89A2 transcripts were recovered from the majority of tissues and aligned at a maximum of 66 and 30% to proven herbicide resistant allele from Alopecurus myosuroides and Lolium rigidum, respectively. De novo transcriptome assembly enabled the generation of the first reference transcriptome of A. spica-venti. This can serve as stepping stone for understanding the metabolic herbicide resistance as well as the general biology of this problematic weed. Furthermore, this large-scale sequence data is a valuable scientific resource for comparative transcriptome analysis for Poaceae grasses.
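The N50 statistic reported above is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch of the calculation follows; the function name and the toy contig lengths are illustrative and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    # Accumulate from the longest contig down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")

toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
print(sum(toy_contigs) / len(toy_contigs))  # mean contig length, for comparison
```

Because N50 is weighted toward long contigs, it usually exceeds the mean contig length, which matches the pattern reported for the combined assembly (N50 of 762 bp against an average of 658 bp).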
Integrated analysis of transcriptome and lipid profiling reveals the co-influences of inositol-choline and Snf1 in controlling lipid biosynthesis in yeast.

PubMed

Chumnanpuen, Pramote; Zhang, Jie; Nookaew, Intawat; Nielsen, Jens

2012-07-01

In the yeast Saccharomyces cerevisiae many genes involved in lipid biosynthesis are transcriptionally controlled by inositol-choline and the protein kinase Snf1. Here we undertook a global study on how inositol-choline and Snf1 interact in controlling lipid metabolism in yeast. Using both a reference strain (CEN.PK113-7D) and a snf1Δ strain cultured at different nutrient limitations (carbon and nitrogen), at a fixed specific growth rate of 0.1 h(-1), and at different inositol choline concentrations, we quantified the expression of genes involved in lipid biosynthesis and the fluxes towards the different lipid components. Through integrated analysis of the transcriptome, the lipid profiling and the fluxome, it was possible to obtain a high quality, large-scale dataset that could be used to identify correlations and associations between the different components. At the transcription level, Snf1 and inositol-choline interact either directly through the main phospholipid-involving transcription factors (i.e. Ino2, Ino4, and Opi1) or through other transcription factors e.g. Gis1, Mga2, and Hac1. However, there seems to be flux regulation at the enzyme levels of several lipid involving enzymes. The analysis showed the strength of using both transcriptome and lipid profiling analysis for mapping the co-influence of inositol-choline and Snf1 on phospholipid metabolism.
Transcriptome of intraperitoneal organs of starry flounder Platichthys stellatus challenged by Edwardsiella ictaluri JCM1680

NASA Astrophysics Data System (ADS)

Tong, Yanli; Sun, Xiuqin; Wang, Bo; Wang, Ling; Li, Yan; Tian, Jinhu; Zheng, Fengrong; Zheng, Minggang

2015-01-01

Platichthys stellatus is an economically important marine bony fish species that is cultured in China on a large scale. However, very little is known about its immune-related genes. In this study, the transcriptome of the immune organs of P. stellatus that were intraperitoneally challenged with the pathogen Edwardsiella ictaluri JCM1680 is analyzed. Total RNA from four tissues (spleen, kidney, liver, and intestine) was mixed equally and then sequenced on an Illumina HiSeq 2000 platform. Overall, 28 465 813 quality reads were generated and assembled into 43 061 unigenes. Similarity searches against public protein sequence databases were used to annotate 28 291 unigenes (65.7% of the total), 368 of which were associated with immunoregulation, including 188 related to immunity response. Additionally, the transcript levels of immunity response unigenes annotated as related to tumor necrosis factor (TNF), TNF receptor, chemokine, major histocompatibility complex, and interleukin-6 were investigated in the different tissues of normal and infected P. stellatus by real-time quantitative PCR. The results confirmed that the unigenes identified in the transcriptome database were indeed expressed and up-regulated in infected P. stellatus. To our knowledge, this is the first report of the sequencing and analysis of the transcriptome of P. stellatus. These findings provide insights into the transcriptomics and immunogenetics of bony fish.
Large-Scale Cognitive GWAS Meta-Analysis Reveals Tissue-Specific Neural Expression and Potential Nootropic Drug Targets.

PubMed

Lam, Max; Trampush, Joey W; Yu, Jin; Knowles, Emma; Davies, Gail; Liewald, David C; Starr, John M; Djurovic, Srdjan; Melle, Ingrid; Sundet, Kjetil; Christoforou, Andrea; Reinvang, Ivar; DeRosse, Pamela; Lundervold, Astri J; Steen, Vidar M; Espeseth, Thomas; Räikkönen, Katri; Widen, Elisabeth; Palotie, Aarno; Eriksson, Johan G; Giegling, Ina; Konte, Bettina; Roussos, Panos; Giakoumaki, Stella; Burdick, Katherine E; Payton, Antony; Ollier, William; Chiba-Falek, Ornit; Attix, Deborah K; Need, Anna C; Cirulli, Elizabeth T; Voineskos, Aristotle N; Stefanis, Nikos C; Avramopoulos, Dimitrios; Hatzimanolis, Alex; Arking, Dan E; Smyrnis, Nikolaos; Bilder, Robert M; Freimer, Nelson A; Cannon, Tyrone D; London, Edythe; Poldrack, Russell A; Sabb, Fred W; Congdon, Eliza; Conley, Emily Drabant; Scult, Matthew A; Dickinson, Dwight; Straub, Richard E; Donohoe, Gary; Morris, Derek; Corvin, Aiden; Gill, Michael; Hariri, Ahmad R; Weinberger, Daniel R; Pendleton, Neil; Bitsios, Panos; Rujescu, Dan; Lahti, Jari; Le Hellard, Stephanie; Keller, Matthew C; Andreassen, Ole A; Deary, Ian J; Glahn, David C; Malhotra, Anil K; Lencz, Todd

2017-11-28

Here, we present a large (n = 107,207) genome-wide association study (GWAS) of general cognitive ability ("g"), further enhanced by combining results with a large-scale GWAS of educational attainment. We identified 70 independent genomic loci associated with general cognitive ability. Results showed significant enrichment for genes causing Mendelian disorders with an intellectual disability phenotype. Competitive pathway analysis implicated the biological processes of neurogenesis and synaptic regulation, as well as the gene targets of two pharmacologic agents: cinnarizine, a T-type calcium channel blocker, and LY97241, a potassium channel inhibitor. Transcriptome-wide and epigenome-wide analysis revealed that the implicated loci were enriched for genes expressed across all brain regions (most strongly in the cerebellum). Enrichment was exclusive to genes expressed in neurons but not oligodendrocytes or astrocytes. Finally, we report genetic correlations between cognitive ability and disparate phenotypes including psychiatric disorders, several autoimmune disorders, longevity, and maternal age at first birth.
The aquatic animals' transcriptome resource for comparative functional analysis.

PubMed

Chou, Chih-Hung; Huang, Hsi-Yuan; Huang, Wei-Chih; Hsu, Sheng-Da; Hsiao, Chung-Der; Liu, Chia-Yu; Chen, Yu-Hung; Liu, Yu-Chen; Huang, Wei-Yun; Lee, Meng-Lin; Chen, Yi-Chang; Huang, Hsien-Da

2018-05-09

Aquatic animals have great economic and ecological importance. Among them, non-model organisms have been studied regarding eco-toxicity, stress biology, and environmental adaptation. Due to recent advances in next-generation sequencing techniques, large amounts of RNA-seq data for aquatic animals are publicly available. However, there is currently no comprehensive resource for the analysis, unification, and integration of these datasets. This study utilizes computational approaches to build a new resource of transcriptomic maps for aquatic animals. This aquatic animal transcriptome map database, dbATM, provides de novo transcriptome assembly, gene annotation and comparative analysis for more than twenty aquatic organisms without a draft genome. To improve the assembly quality, three computational tools (Trinity, Oases and SOAPdenovo-Trans) were employed to enhance individual transcriptome assembly, and CAP3 and CD-HIT-EST software were then used to merge these three assembled transcriptomes. In addition, functional annotation analysis provides valuable clues to gene characteristics, including full-length transcript coding regions, conserved domains, gene ontology and KEGG pathways. Furthermore, all aquatic animal genes are essential for comparative genomics tasks such as constructing homologous gene groups and BLAST databases and phylogenetic analysis. In conclusion, we establish a resource for non-model aquatic animals, which are of great economic and ecological importance, and provide transcriptomic information including functional annotation and comparative transcriptome analysis. The database is now publicly accessible through the URL http://dbATM.mbc.nctu.edu.tw/ .

Systems perspectives on erythromycin biosynthesis by comparative genomic and transcriptomic analyses of S. erythraea E3 and NRRL23338 strains

PubMed Central

2013-01-01

Background S. erythraea is a Gram-positive filamentous bacterium used for the industrial-scale production of erythromycin A, which is of high clinical importance. In this work, we sequenced the whole genome of a high-producing strain (E3) obtained by random mutagenesis and screening from the wild-type strain NRRL23338, and examined time-series expression profiles of both E3 and NRRL23338. Based on the genomic and transcriptomic data of these two strains, we carried out a comparative analysis of the high-producing strain and the wild-type strain at both the genomic level and the transcriptomic level. Results We observed a large number of genetic variants including 60 insertions, 46 deletions and 584 single nucleotide variations (SNV) in E3 in comparison with NRRL23338, and the analysis of time-series transcriptomic data indicated that the genes involved in erythromycin biosynthesis and feeder pathways were significantly up-regulated during the 60-hour time course. According to our data, BldD, a previously identified ery cluster regulator, did not show any positive correlation with the expression of the ery cluster, suggesting the existence of alternative regulation mechanisms of erythromycin synthesis in S. erythraea. Several potential regulators were then proposed by integrated analysis of genomic and transcriptomic data. Conclusion This is a demonstration of functional comparative genomics between an industrial S. erythraea strain and the wild-type strain. These findings help to understand the global regulation mechanisms of erythromycin biosynthesis in S. erythraea, providing useful clues for genetic and metabolic engineering in the future. PMID:23902230
A survey of the sorghum transcriptome using single-molecule long reads

DOE PAGES

Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; ...

2016-06-24

Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. Lastly, these results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism.
A survey of the sorghum transcriptome using single-molecule long reads

PubMed Central

Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; Ngam, Peter; Devitt, Nicholas; Schilkey, Faye; Ben-Hur, Asa; Reddy, Anireddy S. N.

2016-01-01

Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. These results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism. PMID:27339290
Identification of candidate genes involved in the sugar metabolism and accumulation during pear fruit post-harvest ripening of 'Red Clapp's Favorite' (Pyrus communis L.) by transcriptome analysis.

PubMed

Wang, Long; Chen, Yun; Wang, Suke; Xue, Huabai; Su, Yanli; Yang, Jian; Li, Xiugen

2018-01-01

Pear (Pyrus spp.) is a popular fruit that is commercially cultivated in most temperate regions. In fruits, sugar metabolism and accumulation are important factors for fruit organoleptic quality. Post-harvest ripening is a special feature of 'Red Clapp's Favorite'. In this study, transcriptome sequencing based on the Illumina platform generated 23.8 - 35.8 million unigenes of nine cDNA libraries constructed using RNAs from the 'Red Clapp's Favorite' pear variety with different treatments, in which 2629 new genes were discovered, and 2121 of them were annotated. A total of 2146 DEGs, 3650 DEGs, 1830 DEGs from each comparison were assembled. Moreover, the gene expression patterns of 8 unigenes related to sugar metabolism were revealed by qPCR. The main constituents of soluble sugars were fructose and glucose after pear fruit post-harvest ripening, and five unigenes involved in sugar metabolism were discovered. Our study not only provides a large-scale assessment of transcriptome resources of 'Red Clapp's Favorite' but also lays the foundation for further research into genes correlated with sugar metabolism.
Genome-Scale Transcriptome Analysis in Response to Nitric Oxide in Birch Cells: Implications of the Triterpene Biosynthetic Pathway

PubMed Central

Zeng, Fansuo; Sun, Fengkun; Li, Leilei; Liu, Kun; Zhan, Yaguang

2014-01-01

Evidence supporting nitric oxide (NO) as a mediator of plant biochemistry continues to grow, but its functions at the molecular level remain poorly understood and, in some cases, controversial. To study the role of NO at the transcriptional level in Betula platyphylla cells, we conducted a genome-scale transcriptome analysis of these cells. The transcriptomes of untreated birch cells and those treated with sodium nitroprusside (SNP) were analyzed using Solexa sequencing. Data were collected by sequencing cDNA libraries of birch cells, which had a long period to adapt to the suspension culture conditions before SNP-treated cells and untreated cells were sampled. Among the 34,100 UniGenes detected, BLASTX search revealed that 20,631 genes showed significant (E-value ≤ 10^-5) sequence similarity with proteins from the NR database. Numerous expressed sequence tags (i.e., 1374) were identified as differentially expressed between the 12 h SNP-treated cell and control cell samples: 403 up-regulated and 971 down-regulated. From this, we specifically examined a core set of NO-related transcripts. The altered expression levels of several transcripts, as determined by transcriptome analysis, were confirmed by qRT-PCR. The results of transcriptome analysis, gene expression quantification, the content of triterpenoid and activities of defensive enzymes indicated that NO has a significant effect on many processes including triterpenoid production, carbohydrate metabolism and cell wall biosynthesis. PMID:25551661
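The abstract above counts a UniGene as annotated when BLASTX finds a hit at or below an E-value cutoff of 10^-5. A minimal sketch of that filtering step is shown here, assuming (this is not stated in the abstract) that the search results were saved in BLAST's standard 12-column tabular layout, in which column 1 is the query ID and column 11 is the E-value. The function name and the toy rows are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # threshold quoted in the abstract


def annotated_queries(tabular_hits, cutoff=EVALUE_CUTOFF):
    """Return the set of query IDs with at least one hit at or below the cutoff.

    Assumes tab-separated rows in BLAST's 12-column tabular layout:
    query ID in column 1, E-value in column 11.
    """
    keep = set()
    for row in csv.reader(tabular_hits, delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        if float(row[10]) <= cutoff:
            keep.add(row[0])
    return keep


# Hypothetical toy hits, not data from the birch study.
toy_hits = io.StringIO(
    "unigene_1\tprot_A\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t200\n"
    "unigene_2\tprot_B\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t20\n"
    "unigene_1\tprot_C\t35.0\t60\t39\t0\t1\t60\t1\t60\t2.0\t15\n"
)
print(sorted(annotated_queries(toy_hits)))
```

Counting distinct query IDs rather than rows matters because a single UniGene usually has several hits; the 20,631 figure in the abstract is a count of genes, not of alignments.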
Transcriptome Analysis of Leaves, Flowers and Fruits Perisperm of Coffea arabica L. Reveals the Differential Expression of Genes Involved in Raffinose Biosynthesis

PubMed Central

dos Santos, Tiago Benedito; de Oliveira, Fernanda Freitas; Pot, David; Leroy, Thierry; Vieira, Luiz Gonzaga Esteves; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante Guimarães

2017-01-01

Coffea arabica L. is an important crop in several developing countries. Despite its economic importance, minimal transcriptome data are available for fruit tissues, especially during fruit development where several compounds related to coffee quality are produced. To understand the molecular aspects related to coffee fruit and grain development, we report a large-scale transcriptome analysis of leaf, flower and perisperm fruit tissue development. Illumina sequencing yielded 41,881,572 high-quality filtered reads. De novo assembly generated 65,364 unigenes with an average length of 1,264 bp. A total of 24,548 unigenes were annotated as protein coding genes, including 12,560 full-length sequences. In the annotation process, we identified nine candidate genes related to the biosynthesis of raffinose family oligosaccharides (RFOs). These sugars confer osmoprotection and are accumulated during initial fruit development. Four genes from this pathway had their transcriptional pattern validated by quantitative reverse transcription polymerase chain reaction (qRT-PCR). Furthermore, we identified ~24,000 putative target sites for microRNAs (miRNAs) and 134 putative transcriptionally active transposable elements (TE) sequences in our dataset. This C. arabica transcriptomic atlas provides an important step for identifying candidate genes related to several coffee metabolic pathways, especially those related to fruit chemical composition and therefore beverage quality. Our results are the starting point for enhancing our knowledge about the coffee genes that are transcribed during the flowering and initial fruit development stages. PMID:28068432
Transcriptome Analysis of Leaves, Flowers and Fruits Perisperm of Coffea arabica L. Reveals the Differential Expression of Genes Involved in Raffinose Biosynthesis.

PubMed

Ivamoto, Suzana Tiemi; Reis, Osvaldo; Domingues, Douglas Silva; Dos Santos, Tiago Benedito; de Oliveira, Fernanda Freitas; Pot, David; Leroy, Thierry; Vieira, Luiz Gonzaga Esteves; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante Guimarães; Pereira, Luiz Filipe Protasio

2017-01-01

Coffea arabica L. is an important crop in several developing countries. Despite its economic importance, minimal transcriptome data are available for fruit tissues, especially during fruit development where several compounds related to coffee quality are produced. To understand the molecular aspects related to coffee fruit and grain development, we report a large-scale transcriptome analysis of leaf, flower and perisperm fruit tissue development. Illumina sequencing yielded 41,881,572 high-quality filtered reads. De novo assembly generated 65,364 unigenes with an average length of 1,264 bp. A total of 24,548 unigenes were annotated as protein coding genes, including 12,560 full-length sequences. In the annotation process, we identified nine candidate genes related to the biosynthesis of raffinose family oligosaccharides (RFOs). These sugars confer osmoprotection and are accumulated during initial fruit development. Four genes from this pathway had their transcriptional pattern validated by quantitative reverse transcription polymerase chain reaction (qRT-PCR). Furthermore, we identified ~24,000 putative target sites for microRNAs (miRNAs) and 134 putative transcriptionally active transposable elements (TE) sequences in our dataset. This C. arabica transcriptomic atlas provides an important step for identifying candidate genes related to several coffee metabolic pathways, especially those related to fruit chemical composition and therefore beverage quality. Our results are the starting point for enhancing our knowledge about the coffee genes that are transcribed during the flowering and initial fruit development stages.
Transcriptome analysis of carbohydrate metabolism during bulblet formation and development in Lilium davidii var. unicolor.

PubMed

Li, XueYan; Wang, ChunXia; Cheng, JinYun; Zhang, Jing; da Silva, Jaime A Teixeira; Liu, XiaoYu; Duan, Xin; Li, TianLai; Sun, HongMei

2014-12-19

The formation and development of bulblets are crucial to the Lilium genus since these processes are closely related to carbohydrate metabolism, especially to starch and sucrose metabolism. However, little is known about the transcriptional regulation of both processes. To gain insight into carbohydrate-related genes involved in bulblet formation and development, we conducted comparative transcriptome profiling of Lilium davidii var. unicolor bulblets at 0 d, 15 d (bulblets emerged) and 35 d (bulblets formed a basic shape with three or four scales) after scale propagation. Analysis of the transcriptome revealed that a total of 52,901 unigenes with an average sequence size of 630 bp were generated. Based on Clusters of Orthologous Groups (COG) analysis, 8% of the sequences were attributed to carbohydrate transport and metabolism. The results of KEGG pathway enrichment analysis showed that starch and sucrose metabolism constituted the predominant pathway among the three library pairs. The starch content in mother scales and bulblets decreased and increased, respectively, with almost the same trend as sucrose content. Gene expression analysis of the key enzymes in starch and sucrose metabolism suggested that sucrose synthase (SuSy) and invertase (INV), mainly hydrolyzing sucrose, presented higher gene expression in mother scales and bulblets at stages of bulblet appearance and enlargement, while sucrose phosphate synthase (SPS) showed higher expression in bulblets at morphogenesis. The enzymes involved in the starch synthetic direction such as ADPG pyrophosphorylase (AGPase), soluble starch synthase (SSS), starch branching enzyme (SBE) and granule-bound starch synthase (GBSS) showed a decreasing trend in mother scales and higher gene expression in bulblets at bulblet appearance and enlargement stages while the enzyme in the cleavage direction, starch de-branching enzyme (SDBE), showed higher gene expression in mother scales than in bulblets. 
An extensive transcriptome analysis of three bulblet development stages contributes considerable novel information to our understanding of carbohydrate metabolism-related genes in Lilium at the transcriptional level, and demonstrates the fundamentality of carbohydrate metabolism in bulblet emergence and development at the molecular level. This could facilitate further investigation into the molecular mechanisms underlying these processes in lily and other related species.
Transcriptome dynamics through alternative polyadenylation in developmental and environmental responses in plants revealed by deep sequencing

PubMed Central

Shen, Yingjia; Venu, R.C.; Nobuta, Kan; Wu, Xiaohui; Notibala, Varun; Demirci, Caghan; Meyers, Blake C.; Wang, Guo-Liang; Ji, Guoli; Li, Qingshun Q.

2011-01-01

Polyadenylation sites mark the ends of mRNA transcripts. Alternative polyadenylation (APA) may alter sequence elements and/or the coding capacity of transcripts, a mechanism that has been demonstrated to regulate gene expression and transcriptome diversity. To study the role of APA in transcriptome dynamics, we analyzed a large-scale data set of RNA “tags” that signify poly(A) sites and expression levels of mRNA. These tags were derived from a wide range of tissues and developmental stages that were mutated or exposed to environmental treatments, and generated using digital gene expression (DGE)–based protocols of the massively parallel signature sequencing (MPSS-DGE) and the Illumina sequencing-by-synthesis (SBS-DGE) sequencing platforms. The data offer a global view of APA and how it contributes to transcriptome dynamics. Upon analysis of these data, we found that ∼60% of Arabidopsis genes have multiple poly(A) sites. Likewise, ∼47% and 82% of rice genes use APA, supported by MPSS-DGE and SBS-DGE tags, respectively. In both species, ∼49%–66% of APA events were mapped upstream of annotated stop codons. Interestingly, 10% of the transcriptomes are made up of APA transcripts that are differentially distributed among developmental stages and in tissues responding to environmental stresses, providing an additional level of transcriptome dynamics. Examples of pollen-specific APA switching and salicylic acid treatment-specific APA clearly demonstrated such dynamics. The significance of these APAs is more evident in the 3034 genes that have conserved APA events between rice and Arabidopsis. PMID:21813626
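The abstract above reports the share of genes that have more than one poly(A) site. As a rough sketch of how such a proportion can be tallied once tags have been assigned to genes, the following may be useful; the input layout of (gene ID, poly(A) position) pairs, the function name and the toy records are assumptions for illustration and do not reproduce the authors' pipeline.

```python
from collections import defaultdict


def fraction_with_apa(site_records):
    """Return the fraction of genes with more than one distinct poly(A) site.

    site_records: iterable of (gene_id, poly_a_position) pairs.
    Repeated observations of the same position count as one site.
    """
    sites = defaultdict(set)
    for gene_id, position in site_records:
        sites[gene_id].add(position)
    if not sites:
        return 0.0
    multi = sum(1 for positions in sites.values() if len(positions) > 1)
    return multi / len(sites)


# Hypothetical toy records, not data from the Arabidopsis or rice libraries.
toy_records = [
    ("gene_1", 100),
    ("gene_1", 250),  # second distinct site: gene_1 uses APA
    ("gene_2", 400),
    ("gene_3", 10),
    ("gene_3", 10),   # same site seen twice: still a single site
]
print(fraction_with_apa(toy_records))
```

This sketch treats every distinct coordinate as a separate site. Real analyses generally group nearby cleavage positions into one site before counting, so the threshold for calling a gene an APA gene depends on that clustering choice.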
Transcriptome analysis of Pinus halepensis under drought stress and during recovery

PubMed Central

Fox, Hagar; Doron-Faigenboim, Adi; Kelly, Gilor; Bourstein, Ronny; Attia, Ziv; Zhou, Jing; Moshe, Yosef; Moshelion, Menachem; David-Schwartz, Rakefet

2018-01-01

Forest trees use various strategies to cope with drought stress and these strategies involve complex molecular mechanisms. Pinus halepensis Miller (Aleppo pine) is found throughout the Mediterranean basin and is one of the most drought-tolerant pine species. In order to decipher the molecular mechanisms that P. halepensis uses to withstand drought, we performed large-scale physiological and transcriptome analyses. We selected a mature tree from a semi-arid area with suboptimal growth conditions for clonal propagation through cuttings. We then used a high-throughput experimental system to continuously monitor whole-plant transpiration rates, stomatal conductance and the vapor pressure deficit. The transcriptomes of plants were examined at six physiological stages: pre-stomatal response, partial stomatal closure, minimum transpiration, post-irrigation, partial recovery and full recovery. At each stage, data from plants exposed to the drought treatment were compared with data collected from well-irrigated control plants. A drought-stressed P. halepensis transcriptome was created using paired-end RNA-seq. In total, ~6000 differentially expressed, non-redundant transcripts were identified between drought-treated and control trees. Cluster analysis has revealed stress-induced down-regulation of transcripts related to photosynthesis, reactive oxygen species (ROS)-scavenging through the ascorbic acid (AsA)-glutathione cycle, fatty acid and cell wall biosynthesis, stomatal activity, and the biosynthesis of flavonoids and terpenoids. Up-regulated processes included chlorophyll degradation, ROS-scavenging through AsA-independent thiol-mediated pathways, abscisic acid response and accumulation of heat shock proteins, thaumatin and exordium. Recovery from drought induced strong transcription of retrotransposons, especially the retrovirus-related transposon Tnt1-94.
The drought-related transcriptome illustrates this species’ dynamic response to drought and recovery and unravels novel mechanisms. PMID:29177514
Transcriptome analysis of Pinus halepensis under drought stress and during recovery.

PubMed

Fox, Hagar; Doron-Faigenboim, Adi; Kelly, Gilor; Bourstein, Ronny; Attia, Ziv; Zhou, Jing; Moshe, Yosef; Moshelion, Menachem; David-Schwartz, Rakefet

2018-03-01

Forest trees use various strategies to cope with drought stress and these strategies involve complex molecular mechanisms. Pinus halepensis Miller (Aleppo pine) is found throughout the Mediterranean basin and is one of the most drought-tolerant pine species. In order to decipher the molecular mechanisms that P. halepensis uses to withstand drought, we performed large-scale physiological and transcriptome analyses. We selected a mature tree from a semi-arid area with suboptimal growth conditions for clonal propagation through cuttings. We then used a high-throughput experimental system to continuously monitor whole-plant transpiration rates, stomatal conductance and the vapor pressure deficit. The transcriptomes of plants were examined at six physiological stages: pre-stomatal response, partial stomatal closure, minimum transpiration, post-irrigation, partial recovery and full recovery. At each stage, data from plants exposed to the drought treatment were compared with data collected from well-irrigated control plants. A drought-stressed P. halepensis transcriptome was created using paired-end RNA-seq. In total, ~6000 differentially expressed, non-redundant transcripts were identified between drought-treated and control trees. Cluster analysis has revealed stress-induced down-regulation of transcripts related to photosynthesis, reactive oxygen species (ROS)-scavenging through the ascorbic acid (AsA)-glutathione cycle, fatty acid and cell wall biosynthesis, stomatal activity, and the biosynthesis of flavonoids and terpenoids. Up-regulated processes included chlorophyll degradation, ROS-scavenging through AsA-independent thiol-mediated pathways, abscisic acid response and accumulation of heat shock proteins, thaumatin and exordium. Recovery from drought induced strong transcription of retrotransposons, especially the retrovirus-related transposon Tnt1-94. 
The drought-related transcriptome illustrates this species' dynamic response to drought and recovery and unravels novel mechanisms.
Transcriptomic and proteomic responses of Serratia marcescens to spaceflight conditions involve large-scale changes in metabolic pathways

NASA Astrophysics Data System (ADS)

Wang, Yajuan; Yuan, Yanting; Liu, Jinwen; Su, Longxiang; Chang, De; Guo, Yinghua; Chen, Zhenhong; Fang, Xiangqun; Wang, Junfeng; Li, Tianzhi; Zhou, Lisha; Fang, Chengxiang; Yang, Ruifu; Liu, Changting

2014-04-01

The microgravity environment of spaceflight expeditions has been associated with altered microbial responses. This study characterizes Serratia marcescens grown in a spaceflight environment at the phenotypic, transcriptomic and proteomic levels. From November 1, 2011 to November 17, 2011, a strain of S. marcescens was sent into space for 398 h on the Shenzhou VIII spacecraft, and ground simulation was performed as a control (LCT-SM213). After the flight, two mutant strains (LCT-SM166 and LCT-SM262) were selected for further analysis. Although no changes in the morphology, post-culture growth kinetics, hemolysis or antibiotic sensitivity were observed, the two mutant strains exhibited significant changes in their metabolic profiles after exposure to spaceflight. Enrichment analysis of the transcriptome showed that the differentially expressed genes of the two spaceflight strains and the ground control strain mainly included those involved in metabolism and degradation. The proteome revealed that changes at the protein level were also associated with metabolic functions, such as glycolysis/gluconeogenesis, pyruvate metabolism, arginine and proline metabolism and the degradation of valine, leucine and isoleucine. In summary, S. marcescens showed alterations primarily in genes and proteins that were associated with metabolism under spaceflight conditions, which gave us valuable clues for future research.
Integrating Transcriptomics with Metabolic Modeling Predicts Biomarkers and Drug Targets for Alzheimer's Disease

PubMed Central

Stempler, Shiri; Yizhak, Keren; Ruppin, Eytan

2014-01-01

Accumulating evidence links numerous abnormalities in cerebral metabolism with the progression of Alzheimer's disease (AD), beginning in its early stages. Here, we integrate transcriptomic data from AD patients with a genome-scale computational human metabolic model to characterize the altered metabolism in AD, and employ state-of-the-art metabolic modelling methods to predict metabolic biomarkers and drug targets in AD. The metabolic descriptions derived are first tested and validated on a large scale versus existing AD proteomics and metabolomics data. Our analysis shows a significant decrease in the activity of several key metabolic pathways, including the carnitine shuttle, folate metabolism and mitochondrial transport. We predict several metabolic biomarkers of AD progression in the blood and the CSF, including succinate and prostaglandin D2. Vitamin D and steroid metabolism pathways are enriched with predicted drug targets that could mitigate the metabolic alterations observed. Taken together, this study provides the first network wide view of the metabolic alterations associated with AD progression. Most importantly, it offers a cohort of new metabolic leads for the diagnosis of AD and its treatment. PMID:25127241
Evidence for Adaptation to the Tibetan Plateau Inferred from Tibetan Loach Transcriptomes

PubMed Central

Wang, Ying; Yang, Liandong; Zhou, Kun; Zhang, Yanping; Song, Zhaobin; He, Shunping

2015-01-01

Abstract Triplophysa fishes are the primary component of the fish fauna on the Tibetan Plateau and are well adapted to the high-altitude environment. Despite the importance of Triplophysa fishes on the plateau, the genetic mechanisms of the adaptations of these fishes to this high-altitude environment remain poorly understood. In this study, we generated the transcriptome sequences for three Triplophysa fishes, that is, Triplophysa siluroides, Triplophysa scleroptera, and Triplophysa dalaica, and used these and the previously available transcriptome and genome sequences from fishes living at low altitudes to identify potential genetic mechanisms for the high-altitude adaptations in Triplophysa fishes. An analysis of 2,269 orthologous genes among cave fish (Astyanax mexicanus), zebrafish (Danio rerio), large-scale loach (Paramisgurnus dabryanus), and Triplophysa fishes revealed that each of the terminal branches of the Triplophysa fishes had a significantly higher ratio of nonsynonymous to synonymous substitutions than that of the branches of the fishes from low altitudes, which provided consistent evidence for genome-wide rapid evolution in the Triplophysa genus. Many of the GO (Gene Ontology) categories associated with energy metabolism and hypoxia response exhibited accelerated evolution in the Triplophysa fishes compared with the large-scale loach. The genes that exhibited signs of positive selection and rapid evolution in the Triplophysa fishes were also significantly enriched in energy metabolism and hypoxia response categories. Our analysis identified widespread Triplophysa-specific nonsynonymous mutations in the fast evolving genes and positively selected genes. Moreover, we detected significant evidence of positive selection in the HIF (hypoxia-inducible factor)-1A and HIF-2B genes in Triplophysa fishes and found that the Triplophysa-specific nonsynonymous mutations in the HIF-1A and HIF-2B genes were associated with functional changes. 
Overall, our study provides new insights into the adaptations and evolution of fishes in the high-altitude environment of the Tibetan Plateau and complements previous findings on the adaptations of mammals and birds to high altitudes. PMID:26454018
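The enrichment statement in the abstract above (positively selected genes being over-represented in energy metabolism and hypoxia response categories) can be illustrated with a one-sided hypergeometric test. This is a minimal sketch only: the abstract does not say which test the authors used, the function name is hypothetical, and every count in the usage note except the 2,269 orthologous genes is made up for illustration.

```python
from math import comb

def hypergeom_enrichment_p(background, in_category, selected, selected_in_category):
    """Upper-tail hypergeometric p-value.

    background            total genes tested
    in_category           genes in the GO category among the background
    selected              genes flagged (e.g. positively selected)
    selected_in_category  flagged genes that fall in the category
    Returns P(X >= selected_in_category) when `selected` genes are drawn
    at random from the background without replacement.
    """
    upper = min(in_category, selected)
    total = comb(background, selected)
    tail = sum(
        comb(in_category, k) * comb(background - in_category, selected - k)
        for k in range(selected_in_category, upper + 1)
    )
    return tail / total

if __name__ == "__main__":
    # 2,269 orthologs comes from the abstract; the other counts are hypothetical.
    p = hypergeom_enrichment_p(2269, 120, 200, 25)
    print(f"enrichment p-value: {p:.3g}")
```

Exact integer arithmetic via math.comb avoids a SciPy dependency. In a real analysis the p-values would also need correction for the number of GO categories tested.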
A regulation probability model-based meta-analysis of multiple transcriptomics data sets for cancer biomarker identification.

PubMed

Xie, Xin-Ping; Xie, Yu-Feng; Wang, Hong-Qiang

2017-08-23

Large-scale accumulation of omics data poses a pressing challenge of integrative analysis of multiple data sets in bioinformatics. An open question of such integrative analysis is how to pinpoint consistent but subtle gene activity patterns across studies. Study heterogeneity needs to be addressed carefully for this goal. This paper proposes a regulation probability model-based meta-analysis, jGRP, for identifying differentially expressed genes (DEGs). The method integrates multiple transcriptomics data sets in a gene regulatory space instead of in a gene expression space, which makes it easy to capture and manage data heterogeneity across studies from different laboratories or platforms. Specifically, we transform gene expression profiles into a united gene regulation profile across studies by mathematically defining two gene regulation events between two conditions and estimating their occurrence probabilities in a sample. Finally, a novel differential expression statistic is established based on the gene regulation profiles, realizing accurate and flexible identification of DEGs in gene regulation space. We evaluated the proposed method on simulation data and real-world cancer datasets and showed the effectiveness and efficiency of jGRP in identifying DEGs in the context of meta-analysis. Data heterogeneity largely influences the performance of meta-analysis for DEG identification. Existing meta-analysis methods were found to exhibit very different degrees of sensitivity to study heterogeneity. The proposed method, jGRP, can be a standalone tool due to its united framework and controllable way to deal with study heterogeneity.
Construction of an Ostrea edulis database from genomic and expressed sequence tags (ESTs) obtained from Bonamia ostreae infected haemocytes: Development of an immune-enriched oligo-microarray.

PubMed

Pardo, Belén G; Álvarez-Dios, José Antonio; Cao, Asunción; Ramilo, Andrea; Gómez-Tato, Antonio; Planas, Josep V; Villalba, Antonio; Martínez, Paulino

2016-12-01

The flat oyster, Ostrea edulis, is one of the main farmed oysters, not only in Europe but also in the United States and Canada. Bonamiosis due to the parasite Bonamia ostreae has been associated with high mortality episodes in this species. This parasite is an intracellular protozoan that infects haemocytes, the main cells involved in oyster defence. Due to the economic and ecological importance of flat oyster, genomic data are badly needed for genetic improvement of the species, but they are still very scarce. The objective of this study is to develop a sequence database, OedulisDB, with new genomic and transcriptomic resources, providing new data and convenient tools to improve our knowledge of the oyster's immune mechanisms. Transcriptomic and genomic sequences were obtained using 454 pyrosequencing and compiled into an O. edulis database, OedulisDB, consisting of two sets of 10,318 and 7159 unique sequences that represent the oyster's genome (WG) and de novo haemocyte transcriptome (HT), respectively. The flat oyster transcriptome was obtained from two strains (naïve and tolerant) challenged with B. ostreae, and from their corresponding non-challenged controls. Approximately 78.5% of 5619 HT unique sequences were successfully annotated by Blast search using public databases. A total of 984 sequences were identified as being related to immune response and several key immune genes were identified for the first time in flat oyster. Additionally, transcriptome information was used to design and validate the first oligo-microarray in flat oyster enriched with immune sequences from haemocytes. Our transcriptomic and genomic sequencing and subsequent annotation have largely increased the scarce resources available for this economically important species and have enabled us to develop an OedulisDB database and accompanying tools for gene expression analysis. This study represents the first attempt to characterize in depth the O. edulis haemocyte transcriptome in response to B. ostreae through massive sequencing and has helped to improve our knowledge of the immune mechanisms of flat oyster. The validated oligo-microarray and the establishment of a reference transcriptome will be useful for large-scale gene expression studies in this species. Copyright © 2016 Elsevier Ltd. All rights reserved.
Systems biology of embryonic development: Prospects for a complete understanding of the Caenorhabditis elegans embryo.

PubMed

Murray, John Isaac

2018-05-01

The convergence of developmental biology and modern genomics tools brings the potential for a comprehensive understanding of developmental systems. This is especially true for the Caenorhabditis elegans embryo because its small size, invariant developmental lineage, and powerful genetic and genomic tools provide the prospect of a cellular resolution understanding of messenger RNA (mRNA) expression and regulation across the organism. We describe here how a systems biology framework might allow large-scale determination of the embryonic regulatory relationships encoded in the C. elegans genome. This framework consists of two broad steps: (a) defining the "parts list"-all genes expressed in all cells at each time during development and (b) iterative steps of computational modeling and refinement of these models by experimental perturbation. Substantial progress has been made towards defining the parts list through imaging methods such as large-scale green fluorescent protein (GFP) reporter analysis. Imaging results are now being augmented by high-resolution transcriptome methods such as single-cell RNA sequencing, and it is likely the complete expression patterns of all genes across the embryo will be known within the next few years. In contrast, the modeling and perturbation experiments performed so far have focused largely on individual cell types or genes, and improved methods will be needed to expand them to the full genome and organism. This emerging comprehensive map of embryonic expression and regulatory function will provide a powerful resource for developmental biologists, and would also allow scientists to ask questions not accessible without a comprehensive picture. This article is categorized under: Invertebrate Organogenesis > Worms Technologies > Analysis of the Transcriptome Gene Expression and Transcriptional Hierarchies > Gene Networks and Genomics. © 2018 Wiley Periodicals, Inc.
Targeted exploration and analysis of large cross-platform human transcriptomic compendia

PubMed Central

Zhu, Qian; Wong, Aaron K; Krishnan, Arjun; Aure, Miriam R; Tadych, Alicja; Zhang, Ran; Corney, David C; Greene, Casey S; Bongo, Lars A; Kristensen, Vessela N; Charikar, Moses; Li, Kai; Troyanskaya, Olga G.

2016-01-01

We present SEEK (http://seek.princeton.edu), a query-based search engine across very large transcriptomic data collections, including thousands of human data sets from almost 50 microarray and next-generation sequencing platforms. SEEK uses a novel query-level cross-validation-based algorithm to automatically prioritize data sets relevant to the query and a robust search approach to identify query-coregulated genes, pathways, and processes. SEEK provides cross-platform handling, multi-gene query search, iterative metadata-based search refinement, and extensive visualization-based analysis options. PMID:25581801
Sugarcane giant borer transcriptome analysis and identification of genes related to digestion.

PubMed

Fonseca, Fernando Campos de Assis; Firmino, Alexandre Augusto Pereira; de Macedo, Leonardo Lima Pepino; Coelho, Roberta Ramos; de Souza Júnior, José Dijair Antonino; de Sousa Júnior, José Dijair Antonino; Silva-Junior, Orzenil Bonfim; Togawa, Roberto Coiti; Pappas, Georgios Joannis; de Góis, Luiz Avelar Brandão; da Silva, Maria Cristina Mattar; Grossi-de-Sá, Maria Fátima

2015-01-01

Sugarcane is a widely cultivated plant that serves primarily as a source of sugar and ethanol. Its annual yield can be significantly reduced by the action of several insect pests including the sugarcane giant borer (Telchin licus licus), a lepidopteran that has a long life cycle and that has proved difficult to control with pesticides. Despite its economic relevance, only a few DNA sequences are available for this species in GenBank. Pyrosequencing technology was used to investigate the transcriptome of several developmental stages of the insect. To maximize transcript diversity, a pool of total RNA was extracted from whole body insects and used to construct a normalized cDNA database. Sequencing produced over 650,000 reads, which were de novo assembled to generate a reference library of 23,824 contigs. After quality score and annotation, 43% of the contigs had at least one BLAST hit against the NCBI non-redundant database, and 40% showed similarities with the lepidopteran Bombyx mori. In a further analysis, we conducted a comparison with Manduca sexta midgut sequences to identify transcripts of genes involved in digestion. Of these transcripts, many presented an expansion or depletion in gene number, compared to the B. mori genome. From the sugarcane giant borer (SGB) transcriptome, a number of aminopeptidase N (APN) cDNAs were characterized based on homology to those reported as Cry toxin receptors. This is the first report that provides a large-scale EST database for the species. Transcriptome analysis will certainly be useful to identify novel developmental genes, to better understand the insect's biology and to guide the development of new strategies for insect-pest control.
Transcriptome profiling to discover putative genes associated with paraquat resistance in goosegrass (Eleusine indica L.).

PubMed

An, Jing; Shen, Xuefeng; Ma, Qibin; Yang, Cunyi; Liu, Simin; Chen, Yong

2014-01-01

Goosegrass (Eleusine indica L.), a serious annual weed in the world, has evolved resistance to several herbicides including paraquat, a non-selective herbicide. The mechanism of paraquat resistance in weeds is only partially understood. To further study the molecular mechanism underlying paraquat resistance in goosegrass, we performed transcriptome analysis of susceptible and resistant biotypes of goosegrass with or without paraquat treatment. The RNA-seq libraries generated 194,716,560 valid reads with an average length of 91.29 bp. De novo assembly analysis produced 158,461 transcripts with an average length of 1153.74 bp and 100,742 unigenes with an average length of 712.79 bp. Among these, 25,926 unigenes were assigned to 65 GO terms that contained three main categories. A total of 13,809 unigenes with 1,208 enzyme commission numbers were assigned to 314 predicted KEGG metabolic pathways, and 12,719 unigenes were categorized into 25 KOG classifications. Furthermore, our results revealed that 53 genes related to reactive oxygen species scavenging, 10 genes related to polyamines and 18 genes related to transport were differentially expressed in paraquat treatment experiments. The genes related to polyamines and transport are likely potential candidate genes that could be further investigated to confirm their roles in paraquat resistance of goosegrass. This is the first large-scale transcriptome sequencing of E. indica using the Illumina platform. Potential genes involved in paraquat resistance were identified from the assembled sequences. The transcriptome data may serve as a reference for further analysis of gene expression and functional genomics studies, and will facilitate the study of paraquat resistance at the molecular level in goosegrass.

Transcriptome Profiling to Discover Putative Genes Associated with Paraquat Resistance in Goosegrass (Eleusine indica L.)

PubMed Central

An, Jing; Shen, Xuefeng; Ma, Qibin; Yang, Cunyi; Liu, Simin; Chen, Yong

2014-01-01

Background Goosegrass (Eleusine indica L.), a serious annual weed in the world, has evolved resistance to several herbicides including paraquat, a non-selective herbicide. The mechanism of paraquat resistance in weeds is only partially understood. To further study the molecular mechanism underlying paraquat resistance in goosegrass, we performed transcriptome analysis of susceptible and resistant biotypes of goosegrass with or without paraquat treatment. Results The RNA-seq libraries generated 194,716,560 valid reads with an average length of 91.29 bp. De novo assembly analysis produced 158,461 transcripts with an average length of 1153.74 bp and 100,742 unigenes with an average length of 712.79 bp. Among these, 25,926 unigenes were assigned to 65 GO terms that contained three main categories. A total of 13,809 unigenes with 1,208 enzyme commission numbers were assigned to 314 predicted KEGG metabolic pathways, and 12,719 unigenes were categorized into 25 KOG classifications. Furthermore, our results revealed that 53 genes related to reactive oxygen species scavenging, 10 genes related to polyamines and 18 genes related to transport were differentially expressed in paraquat treatment experiments. The genes related to polyamines and transport are likely potential candidate genes that could be further investigated to confirm their roles in paraquat resistance of goosegrass. Conclusion This is the first large-scale transcriptome sequencing of E. indica using the Illumina platform. Potential genes involved in paraquat resistance were identified from the assembled sequences. The transcriptome data may serve as a reference for further analysis of gene expression and functional genomics studies, and will facilitate the study of paraquat resistance at the molecular level in goosegrass. PMID:24927422
Sugarcane Giant Borer Transcriptome Analysis and Identification of Genes Related to Digestion

PubMed Central

de Assis Fonseca, Fernando Campos; Firmino, Alexandre Augusto Pereira; de Macedo, Leonardo Lima Pepino; Coelho, Roberta Ramos; de Sousa Júnior, José Dijair Antonino; Silva-Junior, Orzenil Bonfim; Togawa, Roberto Coiti; Pappas, Georgios Joannis; de Góis, Luiz Avelar Brandão; da Silva, Maria Cristina Mattar; Grossi-de-Sá, Maria Fátima

2015-01-01

Sugarcane is a widely cultivated plant that serves primarily as a source of sugar and ethanol. Its annual yield can be significantly reduced by the action of several insect pests including the sugarcane giant borer (Telchin licus licus), a lepidopteran that has a long life cycle and that has proved difficult to control with pesticides. Despite its economic relevance, only a few DNA sequences are available for this species in GenBank. Pyrosequencing technology was used to investigate the transcriptome of several developmental stages of the insect. To maximize transcript diversity, a pool of total RNA was extracted from whole body insects and used to construct a normalized cDNA database. Sequencing produced over 650,000 reads, which were de novo assembled to generate a reference library of 23,824 contigs. After quality score and annotation, 43% of the contigs had at least one BLAST hit against the NCBI non-redundant database, and 40% showed similarities with the lepidopteran Bombyx mori. In a further analysis, we conducted a comparison with Manduca sexta midgut sequences to identify transcripts of genes involved in digestion. Of these transcripts, many presented an expansion or depletion in gene number, compared to the B. mori genome. From the sugarcane giant borer (SGB) transcriptome, a number of aminopeptidase N (APN) cDNAs were characterized based on homology to those reported as Cry toxin receptors. This is the first report that provides a large-scale EST database for the species. Transcriptome analysis will certainly be useful to identify novel developmental genes, to better understand the insect’s biology and to guide the development of new strategies for insect-pest control. PMID:25706301
Autotoxicity mechanism of Oryza sativa: transcriptome response in rice roots exposed to ferulic acid

PubMed Central

2013-01-01

Background Autotoxicity plays an important role in regulating crop yield and quality. To help characterize the autotoxicity mechanism of rice, we performed a large-scale, transcriptomic analysis of the rice root response to ferulic acid, an autotoxin from rice straw. Results Root growth rate was decreased and reactive oxygen species, calcium content and lipoxygenase activity were increased with increasing ferulic acid concentration in roots. Transcriptome analysis revealed more transcripts responsive to short ferulic-acid exposure (1- and 3-h treatments, 1,204 genes) than long exposure (24 h, 176 genes). Induced genes were involved in cell wall formation, chemical detoxification, secondary metabolism, signal transduction, and abiotic stress response. Genes associated with signaling and biosynthesis for ethylene and jasmonic acid were upregulated with ferulic acid. Ferulic acid upregulated ATP-binding cassette and amino acid/auxin permease transporters as well as genes encoding signaling components such as leucine-rich repeat VIII and receptor-like cytoplasmic kinases VII protein kinases, APETALA2/ethylene response factor, WRKY, MYB and Zinc-finger protein expressed in inflorescence meristem transcription factors. Conclusions The results of a transcriptome analysis suggest the molecular mechanisms of plants in response to FA, including toxicity, detoxification and signaling machinery. FA may have a significant effect on inhibiting rice root elongation through modulating ET and JA hormone homeostasis. FA-induced gene expression of AAAP transporters may contribute to detoxification of the autotoxin. Moreover, the WRKY and Myb TFs and LRR-VIII and SD-2b kinases might regulate downstream genes under FA stress but not general allelochemical stress. This comprehensive description of gene expression information could greatly facilitate our understanding of the mechanisms of autotoxicity in plants. PMID:23705659
A reference guide for tree analysis and visualization

PubMed Central

2010-01-01

The quantities of data obtained by the new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and the large-scale OMICS-approaches, such as genomics, proteomics and transcriptomics, are becoming vast. Sequencing technologies become cheaper and easier to use and, thus, large-scale evolutionary studies towards the origins of life for all species and their evolution becomes more and more challenging. Databases holding information about how data are related and how they are hierarchically organized expand rapidly. Clustering analysis is becoming more and more difficult to be applied on very large amounts of data since the results of these algorithms cannot be efficiently visualized. Most of the available visualization tools that are able to represent such hierarchies, project data in 2D and are lacking often the necessary user friendliness and interactivity. For example, the current phylogenetic tree visualization tools are not able to display easy to understand large scale trees with more than a few thousand nodes. In this study, we review tools that are currently available for the visualization of biological trees and analysis, mainly developed during the last decade. We describe the uniform and standard computer readable formats to represent tree hierarchies and we comment on the functionality and the limitations of these tools. We also discuss on how these tools can be developed further and should become integrated with various data sources. Here we focus on freely available software that offers to the users various tree-representation methodologies for biological data analysis. PMID:20175922
De novo Assembly of the Indo-Pacific Humpback Dolphin Leucocyte Transcriptome to Identify Putative Genes Involved in the Aquatic Adaptation and Immune Response

PubMed Central

Xia, Jia; Yang, Lili; Chen, Jialin; Wu, Yuping; Yi, Meisheng

2013-01-01

Background The Indo-Pacific humpback dolphin (Sousa chinensis), a marine mammal species inhabiting the waters of Southeast Asia, South Africa and Australia, has attracted much attention because of the dramatic decline in population size in the past decades, which raises the concern of extinction. So far, this species is poorly characterized at the molecular level due to little sequence information available in public databases. Recent advances in large-scale RNA sequencing provide an efficient approach to generate abundant sequences for functional genomic analyses in species with un-sequenced genomes. Principal Findings We performed a de novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome by Illumina sequencing. 108,751 high quality sequences from 47,840,388 paired-end reads were generated, and 48,868 and 46,587 unigenes were functionally annotated by BLAST search against the NCBI non-redundant and Swiss-Prot protein databases (E-value < 10^-5), respectively. In total, 16,467 unigenes were clustered into 25 functional categories by searching against the COG database, and BLAST2GO search assigned 37,976 unigenes to 61 GO terms. In addition, 36,345 unigenes were grouped into 258 KEGG pathways. We also identified 9,906 simple sequence repeats and 3,681 putative single nucleotide polymorphisms as potential molecular markers in our assembled sequences. A large number of unigenes were predicted to be involved in immune response, and many genes were predicted to be relevant to adaptive evolution and cetacean-specific traits. Conclusion This study represented the first transcriptome analysis of the Indo-Pacific humpback dolphin, an endangered species. The de novo transcriptome analysis of the unique transcripts will provide valuable sequence information for discovery of new genes, characterization of gene expression, investigation of various pathways and adaptive evolution, as well as identification of genetic markers. PMID:24015242
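The simple sequence repeats (SSRs) mentioned in the record above are short motifs repeated in tandem. How such repeats might be located in assembled sequences can be sketched with regular expressions. This is an illustration under stated assumptions: the abstract does not name the tool or thresholds used, the function name find_ssrs is hypothetical, and the minimum repeat counts are assumed defaults rather than values from the study.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the abstract).
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

if __name__ == "__main__":
    demo = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "T" + "CAG" * 5 + "G"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

Only perfect repeats are reported. Compound or interrupted microsatellites would need additional logic.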
De novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome to identify putative genes involved in the aquatic adaptation and immune response.

PubMed

Gui, Duan; Jia, Kuntong; Xia, Jia; Yang, Lili; Chen, Jialin; Wu, Yuping; Yi, Meisheng

2013-01-01

The Indo-Pacific humpback dolphin (Sousa chinensis), a marine mammal species inhabiting the waters of Southeast Asia, South Africa and Australia, has attracted much attention because of the dramatic decline in population size in the past decades, which raises the concern of extinction. So far, this species is poorly characterized at the molecular level due to little sequence information available in public databases. Recent advances in large-scale RNA sequencing provide an efficient approach to generate abundant sequences for functional genomic analyses in species with un-sequenced genomes. We performed a de novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome by Illumina sequencing. 108,751 high quality sequences from 47,840,388 paired-end reads were generated, and 48,868 and 46,587 unigenes were functionally annotated by BLAST search against the NCBI non-redundant and Swiss-Prot protein databases (E-value < 10^-5), respectively. In total, 16,467 unigenes were clustered into 25 functional categories by searching against the COG database, and BLAST2GO search assigned 37,976 unigenes to 61 GO terms. In addition, 36,345 unigenes were grouped into 258 KEGG pathways. We also identified 9,906 simple sequence repeats and 3,681 putative single nucleotide polymorphisms as potential molecular markers in our assembled sequences. A large number of unigenes were predicted to be involved in immune response, and many genes were predicted to be relevant to adaptive evolution and cetacean-specific traits. This study represented the first transcriptome analysis of the Indo-Pacific humpback dolphin, an endangered species. The de novo transcriptome analysis of the unique transcripts will provide valuable sequence information for discovery of new genes, characterization of gene expression, investigation of various pathways and adaptive evolution, as well as identification of genetic markers.
Gene discovery using massively parallel pyrosequencing to develop ESTs for the flesh fly Sarcophaga crassipalpis

PubMed Central

Hahn, Daniel A; Ragland, Gregory J; Shoemaker, D DeWayne; Denlinger, David L

2009-01-01

Background Flesh flies in the genus Sarcophaga are important models for investigating endocrinology, diapause, cold hardiness, reproduction, and immunity. Despite the prominence of Sarcophaga flesh flies as models for insect physiology and biochemistry, and in forensic studies, little genomic or transcriptomic data are available for members of this genus. We used massively parallel pyrosequencing on the Roche 454-FLX platform to produce a substantial EST dataset for the flesh fly Sarcophaga crassipalpis. To maximize sequence diversity, we pooled RNA extracted from whole bodies of all life stages and normalized the cDNA pool after reverse transcription. Results We obtained 207,110 ESTs with an average read length of 241 bp. These reads assembled into 20,995 contigs and 31,056 singletons. Using BLAST searches of the NR and NT databases we were able to identify 11,757 unique gene elements (E<0.0001) representing approximately 9,000 independent transcripts. Comparison of the distribution of S. crassipalpis unigenes among GO Biological Process functional groups with that of the Drosophila melanogaster transcriptome suggests that our ESTs are broadly representative of the flesh fly transcriptome. Insertion and deletion errors in 454 sequencing present a serious hurdle to comparative transcriptome analysis. Aided by a new approach to correcting for these errors, we performed a comparative analysis of genetic divergence across GO categories among S. crassipalpis, D. melanogaster, and Anopheles gambiae. The results suggest that non-synonymous substitutions occur at similar rates across categories, although genes related to response to stimuli may evolve slightly faster. In addition, we identified over 500 potential microsatellite loci and more than 12,000 SNPs among our ESTs. Conclusion Our data provides the first large-scale EST-project for flesh flies, a much-needed resource for exploring this model species. 
In addition, we identified a large number of potential microsatellite and SNP markers that could be used in population and systematic studies of S. crassipalpis and other flesh flies. PMID:19454017
Single-cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas

PubMed Central

Hou, Yu; Guo, Huahu; Cao, Chen; Li, Xianlong; Hu, Boqiang; Zhu, Ping; Wu, Xinglong; Wen, Lu; Tang, Fuchou; Huang, Yanyi; Peng, Jirun

2016-01-01

Single-cell genome, DNA methylome, and transcriptome sequencing methods have been separately developed. However, to accurately analyze the mechanism by which transcriptome, genome and DNA methylome regulate each other, these omic methods need to be performed in the same single cell. Here we demonstrate a single-cell triple omics sequencing technique, scTrio-seq, that can be used to simultaneously analyze the genomic copy-number variations (CNVs), DNA methylome, and transcriptome of an individual mammalian cell. We show that large-scale CNVs cause proportional changes in RNA expression of genes within the gained or lost genomic regions, whereas these CNVs generally do not affect DNA methylation in these regions. Furthermore, we applied scTrio-seq to 25 single cancer cells derived from a human hepatocellular carcinoma tissue sample. We identified two subpopulations within these cells based on CNVs, DNA methylome, or transcriptome of individual cells. Our work offers a new avenue of dissecting the complex contribution of genomic and epigenomic heterogeneities to the transcriptomic heterogeneity within a population of cells. PMID:26902283
Transcriptional analysis of product-concentration driven changes in cellular programs of recombinant Clostridium acetobutylicum strains.

PubMed

Tummala, Seshu B; Junne, Stefan G; Paredes, Carlos J; Papoutsakis, Eleftherios T

2003-12-30

Antisense RNA (asRNA) downregulation alters protein expression without changing the regulation of gene expression. Downregulation of primary metabolic enzymes possibly combined with overexpression of other metabolic enzymes may result in profound changes in product formation, and this may alter the large-scale transcriptional program of the cells. DNA-array based large-scale transcriptional analysis has the potential to elucidate factors that control cellular fluxes even in the absence of proteome data. These themes are explored in the study of large-scale transcriptional analysis programs and the in vivo primary-metabolism fluxes of several related recombinant C. acetobutylicum strains: C. acetobutylicum ATCC 824(pSOS95del) (plasmid control; produces high levels of butanol and acetone), 824(pCTFB1AS) (expresses antisense RNA against CoA transferase (ctfb1-asRNA); produces very low levels of butanol and acetone), and 824(pAADB1) (expresses ctfb1-asRNA and the alcohol-aldehyde dehydrogenase gene (aad); produces high alcohol and low acetone levels). DNA-array based transcriptional analysis revealed that the large changes in product concentrations (and notably butanol concentration) due to ctfb1-asRNA expression alone and in combination with aad overexpression resulted in dramatic changes of the cellular transcriptome. Cluster analysis and gene expression patterns of established and putative operons involved in stress response, motility, sporulation, and fatty-acid biosynthesis indicate that these simple genetic changes dramatically alter the cellular programs of C. acetobutylicum. Comparison of gene expression and flux analysis data may point to possible flux-controlling steps and suggest unknown regulatory mechanisms. Copyright 2003 Wiley Periodicals, Inc.
Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shi, CY; Yang, H; Wei, CL

Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly (A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated.
Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis.
Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

PubMed Central

2011-01-01

Background Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Results Using high-throughput Illumina RNA-seq, the transcriptome from poly (A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated.
Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). Conclusions An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis. PMID:21356090
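The record above reports both an average unigene length (355 bp) and an N50 (506 bp). As a reminder of how the two statistics differ, here is a minimal sketch of the usual N50 definition: the length L such that sequences of length L or longer contain at least half of all assembled bases. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not data from the abstract).
toy_lengths = [100, 200, 300, 400, 500]
print("mean length:", sum(toy_lengths) / len(toy_lengths))
print("N50:", n50(toy_lengths))
```

Because N50 is weighted by bases rather than by sequence count, it typically exceeds the mean when an assembly contains many short singletons, which is consistent with the gap between the two figures quoted in the abstract.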
Identification and characterization of large DNA deletions affecting oil quality traits in soybean seeds through transcriptome sequencing analysis

USDA-ARS's Scientific Manuscript database

Understanding the molecular and genetic mechanisms underlying variation in seed composition and contents among different genotypes is important for soybean oil quality improvement. We designed a bioinformatics approach to compare seed transcriptomes of 9 soybean genotypes varying in oil composition ...
Probing the Xenopus laevis inner ear transcriptome for biological function

PubMed Central

2012-01-01

Background The senses of hearing and balance depend upon mechanoreception, a process that originates in the inner ear and shares features across species. Amphibians have been widely used for physiological studies of mechanotransduction by sensory hair cells. In contrast, much less is known of the genetic basis of auditory and vestibular function in this class of animals. Among amphibians, the genus Xenopus is a well-characterized genetic and developmental model that offers unique opportunities for inner ear research because of the amphibian capacity for tissue and organ regeneration. For these reasons, we implemented a functional genomics approach as a means to undertake a large-scale analysis of the Xenopus laevis inner ear transcriptome through microarray analysis. Results Microarray analysis uncovered genes within the X. laevis inner ear transcriptome associated with inner ear function and impairment in other organisms, thereby supporting the inclusion of Xenopus in cross-species genetic studies of the inner ear. The use of gene categories (inner ear tissue; deafness; ion channels; ion transporters; transcription factors) facilitated the assignment of functional significance to probe set identifiers. We enhanced the biological relevance of our microarray data by using a variety of curation approaches to increase the annotation of the Affymetrix GeneChip® Xenopus laevis Genome array. In addition, annotation analysis revealed the prevalence of inner ear transcripts represented by probe set identifiers that lack functional characterization. Conclusions We identified an abundance of targets for genetic analysis of auditory and vestibular function. The orthologues to human genes with known inner ear function and the highly expressed transcripts that lack annotation are particularly interesting candidates for future analyses. 
We used informatics approaches to impart biologically relevant information to the Xenopus inner ear transcriptome, thereby addressing the impediment imposed by insufficient gene annotation. These findings heighten the relevance of Xenopus as a model organism for genetic investigations of inner ear organogenesis, morphogenesis, and regeneration. PMID:22676585
Defining the transcriptome assembly and its use for genome dynamics and transcriptome profiling studies in pigeonpea (Cajanus cajan L.).

PubMed

Dubey, Anuja; Farmer, Andrew; Schlueter, Jessica; Cannon, Steven B; Abernathy, Brian; Tuteja, Reetu; Woodward, Jimmy; Shah, Trushar; Mulasmanovic, Benjamin; Kudapa, Himabindu; Raju, Nikku L; Gothalwal, Ragini; Pande, Suresh; Xiao, Yongli; Town, Chris D; Singh, Nagendra K; May, Gregory D; Jackson, Scott; Varshney, Rajeev K

2011-06-01

This study reports generation of large-scale genomic resources for pigeonpea, a so-called 'orphan crop species' of the semi-arid tropic regions. FLX/454 sequencing carried out on a normalized cDNA pool prepared from 31 tissues produced 494 353 short transcript reads (STRs). Cluster analysis of these STRs, together with 10 817 Sanger ESTs, resulted in a pigeonpea transcriptome assembly (CcTA) comprising 127 754 tentative unique sequences (TUSs). Functional analysis of these TUSs highlights several active pathways and processes in the sampled tissues. Comparison of the CcTA with the soybean genome showed similarity to 10 857 and 16 367 soybean gene models (depending on alignment methods). Additionally, Illumina 1G sequencing was performed on Fusarium wilt (FW)- and sterility mosaic disease (SMD)-challenged root tissues of 10 resistant and susceptible genotypes. More than 160 million sequence tags were used to identify FW- and SMD-responsive genes. Sequence analysis of CcTA and the Illumina tags identified a large new set of markers for use in genetics and breeding, including 8137 simple sequence repeats, 12 141 single-nucleotide polymorphisms and 5845 intron-spanning regions. Genomic resources developed in this study should be useful for basic and applied research, not only for pigeonpea improvement but also for other related, agronomically important legumes.
Novel transcriptome assembly and improved annotation of the whiteleg shrimp (Litopenaeus vannamei), a dominant crustacean in global seafood mariculture.

PubMed

Ghaffari, Noushin; Sanchez-Flores, Alejandro; Doan, Ryan; Garcia-Orozco, Karina D; Chen, Patricia L; Ochoa-Leyva, Adrian; Lopez-Zavala, Alonso A; Carrasco, J Salvador; Hong, Chris; Brieba, Luis G; Rudiño-Piñera, Enrique; Blood, Philip D; Sawyer, Jason E; Johnson, Charles D; Dindot, Scott V; Sotelo-Mundo, Rogerio R; Criscitiello, Michael F

2014-11-25

We present a new transcriptome assembly of the Pacific whiteleg shrimp (Litopenaeus vannamei), the species most farmed for human consumption. Its functional annotation, a substantial improvement over previous ones, is provided freely. RNA-Seq with Illumina HiSeq technology was used to analyze samples extracted from shrimp abdominal muscle, hepatopancreas, gills and pleopods. We used the Trinity and Trinotate software suites for transcriptome assembly and annotation, respectively. The quality of this assembly and the affiliated targeted homology searches greatly enrich the curated transcripts currently available in public databases for this species. Comparison with the model arthropod Daphnia allows some insights into defining characteristics of decapod crustaceans. This large-scale gene discovery gives the broadest depth yet to the annotated transcriptome of this important species and should be of value to ongoing genomics and immunogenetic resistance studies in this shrimp of paramount global economic importance.
Integrated network analysis identifies fight-club nodes as a class of hubs encompassing key putative switch genes that induce major transcriptome reprogramming during grapevine development.

PubMed

Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola

2014-12-01

We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named "fight-club hubs" characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named "switch genes" was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. © 2014 American Society of Plant Biologists. All rights reserved.
Integrated Network Analysis Identifies Fight-Club Nodes as a Class of Hubs Encompassing Key Putative Switch Genes That Induce Major Transcriptome Reprogramming during Grapevine Development

PubMed Central

Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola

2014-01-01

We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named “fight-club hubs” characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named “switch genes” was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. PMID:25490918
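The abstract above characterizes "fight-club hubs" by a marked negative correlation with the expression profiles of their network neighbors. A minimal sketch of that idea follows, assuming the score is the mean Pearson correlation between a hub and its neighbors and that a negative mean flags a candidate. The function names, the zero cutoff, and the toy expression profiles are all illustrative assumptions; the abstract does not give the authors' exact statistic or thresholds.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mean_neighbor_correlation(hub, neighbors, expression):
    """Average correlation between a hub and each of its network neighbors."""
    values = [pearson(expression[hub], expression[gene]) for gene in neighbors]
    return sum(values) / len(values)


# Hypothetical expression profiles across four samples.
expression = {
    "hubA": [1, 2, 3, 4],
    "g1": [4, 3, 2, 1],
    "g2": [8, 6, 4, 2],
    "hubB": [1, 2, 3, 4],
    "g3": [2, 4, 6, 8],
}
# Hypothetical network adjacency (hub -> neighbors).
neighbors = {"hubA": ["g1", "g2"], "hubB": ["g3"]}

for hub, genes in neighbors.items():
    score = mean_neighbor_correlation(hub, genes, expression)
    label = "candidate fight-club hub" if score < 0 else "positively correlated hub"
    print(hub, round(score, 3), label)
```

In this toy network `hubA` is anticorrelated with both neighbors and would be flagged, while `hubB` would not. A real analysis would also need a rule for deciding which gene pairs are connected in the first place, which this sketch takes as given.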
Design of a 9K Illumina BeadChip for polar bears (Ursus maritimus) from RAD and transcriptome sequencing.

PubMed

Malenfant, René M; Coltman, David W; Davis, Corey S

2015-05-01

Single-nucleotide polymorphisms (SNPs) offer numerous advantages over anonymous markers such as microsatellites, including improved estimation of population parameters, finer-scale resolution of population structure and more precise genomic dissection of quantitative traits. However, many SNPs are needed to equal the resolution of a single microsatellite, and reliable large-scale genotyping of SNPs remains a challenge in nonmodel species. Here, we document the creation of a 9K Illumina Infinium BeadChip for polar bears (Ursus maritimus), which will be used to investigate: (i) the fine-scale population structure among Canadian polar bears and (ii) the genomic architecture of phenotypic traits in the Western Hudson Bay subpopulation. To this end, we used restriction-site associated DNA (RAD) sequencing from 38 bears across their circumpolar range, as well as blood/fat transcriptome sequencing of 10 individuals from Western Hudson Bay. Six-thousand RAD SNPs and 3000 transcriptomic SNPs were selected for the chip, based primarily on genomic spacing and gene function respectively. Of the 9000 SNPs ordered from Illumina, 8042 were successfully printed, and - after genotyping 1450 polar bears - 5441 of these SNPs were found to be well clustered and polymorphic. Using this array, we show rapid linkage disequilibrium decay among polar bears, we demonstrate that in a subsample of 78 individuals, our SNPs detect known genetic structure more clearly than 24 microsatellites genotyped for the same individuals and that these results are not driven by the SNP ascertainment scheme. Here, we present one of the first large-scale genotyping resources designed for a threatened species. © 2014 John Wiley & Sons Ltd.
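The SNP counts quoted in the abstract above (9000 ordered, 8042 printed, 5441 well clustered and polymorphic) imply stage-by-stage yields that are easy to work out. The short sketch below does only that arithmetic; the helper name `stage_yields` is an illustrative assumption, and the percentages are derived here rather than reported by the authors.

```python
def stage_yields(counts):
    """Return (step_fraction, cumulative_fraction) for each stage after the first.

    counts: list of (label, count) pairs ordered from the first stage onward.
    """
    first = counts[0][1]
    results = []
    for (_, previous), (label, current) in zip(counts, counts[1:]):
        results.append((label, current / previous, current / first))
    return results


# Counts taken from the abstract.
chip_counts = [
    ("ordered", 9000),
    ("printed", 8042),
    ("well clustered and polymorphic", 5441),
]

for label, step, cumulative in stage_yields(chip_counts):
    print(f"{label}: {step:.1%} of previous stage, {cumulative:.1%} of ordered")
```

Run as written, this shows that roughly nine in ten ordered SNPs were printed and about six in ten of the ordered SNPs ended up usable after genotyping.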
Resources for Functional Genomics Studies in Drosophila melanogaster

PubMed Central

Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

2014-01-01

Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003
An Analysis of the Sensitivity of Proteogenomic Mapping of Somatic Mutations and Novel Splicing Events in Cancer.

PubMed

Ruggles, Kelly V; Tang, Zuojian; Wang, Xuya; Grover, Himanshu; Askenazi, Manor; Teubl, Jennifer; Cao, Song; McLellan, Michael D; Clauser, Karl R; Tabb, David L; Mertins, Philipp; Slebos, Robbert; Erdmann-Gilmore, Petra; Li, Shunqiang; Gunawardena, Harsha P; Xie, Ling; Liu, Tao; Zhou, Jian-Ying; Sun, Shisheng; Hoadley, Katherine A; Perou, Charles M; Chen, Xian; Davies, Sherri R; Maher, Christopher A; Kinsinger, Christopher R; Rodland, Karen D; Zhang, Hui; Zhang, Zhen; Ding, Li; Townsend, R Reid; Rodriguez, Henry; Chan, Daniel; Smith, Richard D; Liebler, Daniel C; Carr, Steven A; Payne, Samuel; Ellis, Matthew J; Fenyő, David

2016-03-01

Improvements in mass spectrometry (MS)-based peptide sequencing provide a new opportunity to determine whether polymorphisms, mutations, and splice variants identified in cancer cells are translated. Herein, we apply a proteogenomic data integration tool (QUILTS) to illustrate protein variant discovery using whole genome, whole transcriptome, and global proteome datasets generated from a pair of luminal and basal-like breast-cancer-patient-derived xenografts (PDX). The sensitivity of proteogenomic analysis for single nucleotide variant (SNV) expression and novel splice junction (NSJ) detection was probed using multiple MS/MS sample process replicates defined here as an independent tandem MS experiment using identical sample material. Despite analysis of over 30 sample process replicates, only about 10% of SNVs (somatic and germline) detected by both DNA and RNA sequencing were observed as peptides. An even smaller proportion of peptides corresponding to NSJ observed by RNA sequencing were detected (<0.1%). Peptides mapping to DNA-detected SNVs without a detectable mRNA transcript were also observed, suggesting that transcriptome coverage was incomplete (∼80%). In contrast to germline variants, somatic variants were less likely to be detected at the peptide level in the basal-like tumor than in the luminal tumor, raising the possibility of differential translation or protein degradation effects. In conclusion, this large-scale proteogenomic integration allowed us to determine the degree to which mutations are translated and identify gaps in sequence coverage, thereby benchmarking current technology and progress toward whole cancer proteome and transcriptome analysis. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

Analysis of the Citrullus colocynthis Transcriptome during Water Deficit Stress

PubMed Central

Wang, Zhuoyu; Hu, Hongtao; Goertzen, Leslie R.; McElroy, J. Scott; Dane, Fenny

2014-01-01

Citrullus colocynthis is a very drought tolerant species, closely related to watermelon (C. lanatus var. lanatus), an economically important cucurbit crop. Drought is a threat to plant growth and development, and the discovery of drought inducible genes with various functions is of great importance. We used high throughput mRNA Illumina sequencing technology and bioinformatic strategies to analyze the C. colocynthis leaf transcriptome under drought treatment. Leaf samples at four different time points (0, 24, 36, or 48 hours of withholding water) were used for RNA extraction and Illumina sequencing. qRT-PCR of several drought responsive genes was performed to confirm the accuracy of RNA sequencing. Leaf transcriptome analysis provided the first glimpse of the drought responsive transcriptome of this unique cucurbit species. A total of 5038 full-length cDNAs were detected, with 2545 genes showing significant changes during drought stress. Principal component analysis indicated that drought was the major contributing factor regulating transcriptome changes. Upregulation of many transcription factors, stress signaling factors, detoxification genes, and genes involved in phytohormone signaling and citrulline metabolism occurred under the water deficit conditions. The C. colocynthis transcriptome data highlight the activation of a large set of drought related genes in this species, thus providing a valuable resource for future functional analysis of candidate genes in defense of drought stress. PMID:25118696
Molecular phenotype of zebrafish ovarian follicle by serial analysis of gene expression and proteomic profiling, and comparison with the transcriptomes of other animals

PubMed Central

Knoll-Gellida, Anja; André, Michèle; Gattegno, Tamar; Forgue, Jean; Admon, Arie; Babin, Patrick J

2006-01-01

Background The ability of an oocyte to develop into a viable embryo depends on the accumulation of specific maternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) was carried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). The data obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published or available in public sequence databases. Results Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags with an occurrence superior to one. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34% of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta-actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressed sequence tags sequencing approach revealed highly expressed transcripts that were not previously known to be expressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A higher sensitivity for the detection of transcripts with a characterized maternal genetic contribution was also demonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shock protein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, together with 40 S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg. Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive value with respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grown zebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokine and nothepsin. 
Conclusion This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cells at the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarian expressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcripts or proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described provides groundwork for future experimental approaches aimed at identifying functionally important stored maternal transcripts and proteins involved in oogenesis and early stages of embryo development. PMID:16526958
A curated transcriptomic dataset collection relevant to embryonic development associated with in vitro fertilization in healthy individuals and patients with polycystic ovary syndrome.

PubMed

Mackeh, Rafah; Boughorbel, Sabri; Chaussabel, Damien; Kino, Tomoshige

2017-01-01

The collection of large-scale datasets available in public repositories is rapidly growing and providing opportunities to identify and fill gaps in different fields of biomedical research. However, users of these datasets should be able to selectively browse datasets related to their field of interest. Here we made available a collection of transcriptome datasets related to human follicular cells from normal individuals or patients with polycystic ovary syndrome, in the process of their development, during in vitro fertilization. After RNA-seq dataset exclusion and careful selection based on study description and sample information, 12 datasets, encompassing a total of 85 unique transcriptome profiles, were identified in NCBI Gene Expression Omnibus and uploaded to the Gene Expression Browser (GXB), a web application specifically designed for interactive query and visualization of integrated large-scale data. Once the datasets were annotated in GXB, multiple sample groupings were made in order to create rank lists to allow easy data interpretation and comparison. The GXB tool also allows users to browse a single gene across multiple projects to evaluate its expression profiles in multiple biological systems/conditions in web-based customized graphical views. The curated dataset is accessible at the following link: http://ivf.gxbsidra.org/dm3/landing.gsp.
Developmental Transcriptome of Aplysia californica

PubMed Central

HEYLAND, ANDREAS; VUE, ZER; VOOLSTRA, CHRISTIAN R.; MEDINA, MÓNICA; MOROZ, LEONID L.

2014-01-01

Genome-wide transcriptional changes in development provide important insight into mechanisms underlying growth, differentiation, and patterning. However, such large-scale developmental studies have been limited to a few representatives of Ecdysozoans and Chordates. Here, we characterize transcriptomes of embryonic, larval, and metamorphic development in the marine mollusc Aplysia californica and reveal novel molecular components associated with life history transitions. Specifically, we identify more than 20 signal peptides, putative hormones, and transcription factors in association with early development and metamorphic stages, many of which seem to be evolutionarily conserved elements of signal transduction pathways. We also characterize genes related to biomineralization, a critical process of molluscan development. In summary, our experiment provides the first large-scale survey of gene expression in mollusc development, and complements previous studies on the regulatory mechanisms underlying body plan patterning and the formation of larval and juvenile structures. This study serves as a resource for further functional annotation of transcripts and genes in Aplysia specifically and molluscs in general. A comparison of the Aplysia developmental transcriptome with similar studies in the zebrafish Danio rerio, the fruit fly Drosophila melanogaster, the nematode Caenorhabditis elegans, and other studies on molluscs suggests an overall highly divergent pattern of gene regulatory mechanisms that are likely a consequence of the different developmental modes of these organisms. PMID:21328528
A curated transcriptomic dataset collection relevant to embryonic development associated with in vitro fertilization in healthy individuals and patients with polycystic ovary syndrome

PubMed Central

Mackeh, Rafah; Boughorbel, Sabri; Chaussabel, Damien; Kino, Tomoshige

2017-01-01

The collection of large-scale datasets available in public repositories is rapidly growing and providing opportunities to identify and fill gaps in different fields of biomedical research. However, users of these datasets should be able to selectively browse datasets related to their field of interest. Here we made available a collection of transcriptome datasets related to human follicular cells from normal individuals or patients with polycystic ovary syndrome, in the process of their development, during in vitro fertilization. After RNA-seq dataset exclusion and careful selection based on study description and sample information, 12 datasets, encompassing a total of 85 unique transcriptome profiles, were identified in NCBI Gene Expression Omnibus and uploaded to the Gene Expression Browser (GXB), a web application specifically designed for interactive query and visualization of integrated large-scale data. Once the datasets were annotated in GXB, multiple sample groupings were made in order to create rank lists to allow easy data interpretation and comparison. The GXB tool also allows users to browse a single gene across multiple projects to evaluate its expression profiles in multiple biological systems/conditions in web-based customized graphical views. The curated dataset is accessible at the following link: http://ivf.gxbsidra.org/dm3/landing.gsp. PMID:28413616
Transcriptomic Analysis Reveals Mechanisms of Sterile and Fertile Flower Differentiation and Development in Viburnum macrocephalum f. keteleeri

PubMed Central

Lu, Zhaogeng; Xu, Jing; Li, Weixing; Zhang, Li; Cui, Jiawen; He, Qingsong; Wang, Li; Jin, Biao

2017-01-01

Sterile and fertile flowers are an important evolutionary developmental (evo-devo) phenotype in angiosperm flowers, playing important roles in pollinator attraction and sexual reproductive success. However, the gene regulatory mechanisms underlying fertile and sterile flower differentiation and development remain largely unknown. Viburnum macrocephalum f. keteleeri, which possesses fertile and sterile flowers in a single inflorescence, is a useful candidate species for investigating the regulatory networks in differentiation and development. We developed a de novo-assembled flower reference transcriptome. Using RNA sequencing (RNA-seq), we compared the expression patterns of fertile and sterile flowers isolated from the same inflorescence over its rapid developmental stages. The flower reference transcriptome consisted of 105,683 non-redundant transcripts, of which 5,675 transcripts showed significant differential expression between fertile and sterile flowers. Combined with morphological and cytological changes between fertile and sterile flowers, we identified expression changes of many genes potentially involved in reproductive processes, phytohormone signaling, and cell proliferation and expansion using RNA-seq and qRT-PCR. In particular, many transcription factors (TFs), including MADS-box family members and ABCDE-class genes, were identified, and expression changes in TFs involved in multiple functions were analyzed and highlighted to determine their roles in regulating fertile and sterile flower differentiation and development. Our large-scale transcriptional analysis of fertile and sterile flowers revealed the dynamics of transcriptional networks and potentially key components in regulating differentiation and development of fertile and sterile flowers in Viburnum macrocephalum f. keteleeri. 
Our data provide a useful resource for Viburnum transcriptional research and offer insights into gene regulation of differentiation of diverse evo-devo processes in flowers. PMID:28298915
N-of-1-pathways MixEnrich: advancing precision medicine via single-subject analysis in discovering dynamic changes of transcriptomes.

PubMed

Li, Qike; Schissler, A Grant; Gardeux, Vincent; Achour, Ikbel; Kenost, Colleen; Berghout, Joanne; Li, Haiquan; Zhang, Hao Helen; Lussier, Yves A

2017-05-24

Transcriptome analytic tools are commonly used across patient cohorts to develop drugs and predict clinical outcomes. However, as precision medicine pursues more accurate and individualized treatment decisions, these methods are not designed to address single-patient transcriptome analyses. We previously developed and validated the N-of-1-pathways framework using two methods, Wilcoxon and Mahalanobis Distance (MD), for personal transcriptome analysis derived from a pair of samples of a single patient. Although both methods uncover concordantly dysregulated pathways, they are not designed to detect dysregulated pathways with up- and down-regulated genes (bidirectional dysregulation) that are ubiquitous in biological systems. We developed N-of-1-pathways MixEnrich, a mixture model followed by a gene set enrichment test, to uncover bidirectional and concordantly dysregulated pathways one patient at a time. We assess its accuracy in a comprehensive simulation study and in an RNA-Seq data analysis of head and neck squamous cell carcinomas (HNSCCs). In the presence of bidirectionally dysregulated genes in the pathway or of high background noise, MixEnrich substantially outperforms previous single-subject transcriptome analysis methods, both in the simulation study and in the HNSCC data analysis (ROC curves; higher true positive rates; lower false positive rates). Bidirectional and concordant dysregulated pathways uncovered by MixEnrich in each patient largely overlapped with the quasi-gold standard, in comparison with other single-subject and cohort-based transcriptome analyses. The superior performance of MixEnrich presents an advantage over previous methods in meeting the promise of accurate personal transcriptome analysis to support precision medicine at the point of care.
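The abstract above describes a gene set enrichment test applied after a mixture model has flagged dysregulated genes, but does not say which test is used. As a minimal sketch, assuming a one-sided hypergeometric (Fisher-type) test, which is one common choice and not necessarily the authors' method, the enrichment step could look like this. The function name and arguments are hypothetical.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric p-value for pathway enrichment.

    k: dysregulated genes inside the pathway
    n: pathway size
    K: dysregulated genes in the whole transcriptome
    N: total genes measured
    Returns P(X >= k) when n genes are drawn without replacement.
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Toy numbers (assumed for illustration only): 20 genes, 5 flagged as
# dysregulated, and a 4-gene pathway containing 3 of the flagged genes.
p = enrichment_p_value(k=3, n=4, K=5, N=20)
```

Because the test only counts flagged genes, it is indifferent to whether they are up- or down-regulated, which is the property that makes an enrichment step of this kind suitable for bidirectional dysregulation.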
First Transcriptome and Digital Gene Expression Analysis in Neuroptera with an Emphasis on Chemoreception Genes in Chrysopa pallens (Rambur).

PubMed

Li, Zhao-Qun; Zhang, Shuai; Ma, Yan; Luo, Jun-Yu; Wang, Chun-Yi; Lv, Li-Min; Dong, Shuang-Lin; Cui, Jin-Jie

2013-01-01

Chrysopa pallens (Rambur) are the most important natural enemies and predators of various agricultural pests. Understanding the sophisticated olfactory system in insect antennae is crucial for studying the physiological bases of olfaction and could also lead to effective applications of C. pallens in integrated pest management. However, no transcriptome information is available for Neuroptera, and sequence data for C. pallens are scarce, so obtaining more sequence data is a priority for researchers working on this species. To facilitate the identification of sets of genes involved in olfaction, a normalized transcriptome of C. pallens was sequenced. A total of 104,603 contigs were obtained and assembled into 10,662 clusters and 39,734 singletons; 20,524 were annotated based on BLASTX analyses. A large number of candidate chemosensory genes were identified, including 14 odorant-binding proteins (OBPs), 22 chemosensory proteins (CSPs), 16 ionotropic receptors, 14 odorant receptors, and genes potentially involved in olfactory modulation. To better understand the OBPs, CSPs and cytochrome P450s, phylogenetic trees were constructed. In addition, 10 digital gene expression libraries of different tissues were constructed, and gene expression profiles were compared among different tissues in males and females. Our results provide a basis for exploring the mechanisms of chemoreception in C. pallens, as well as in other insects. The evolutionary analyses in our study provide new insights into the differentiation and evolution of insect OBPs and CSPs. Our study provides large-scale sequence information for further studies in C. pallens.
Transcriptome characterization and SSR discovery in large-scale loach Paramisgurnus dabryanus (Cobitidae, Cypriniformes).

PubMed

Li, Caijuan; Ling, Qufei; Ge, Chen; Ye, Zhuqing; Han, Xiaofei

2015-02-25

The large-scale loach (Paramisgurnus dabryanus, Cypriniformes) is a bottom-dwelling freshwater fish found mainly in eastern Asia. The natural germplasm resources of this important aquaculture species have recently been threatened by overfishing and artificial propagation. The objective of this study was to obtain the first functional genomic resource and candidate molecular markers for future conservation and breeding research. Illumina paired-end sequencing generated over one hundred million reads that resulted in 71,887 assembled transcripts, with an average length of 1465 bp. A total of 42,093 (58.56%) protein-coding sequences were predicted, and 43,837 transcripts had significant matches in the NCBI nonredundant protein (Nr) database. In total, 29,389 and 14,419 transcripts were assigned to gene ontology (GO) categories and Eukaryotic Orthologous Groups (KOG), respectively, and 22,102 (31.14%) transcripts were mapped to 302 KEGG pathways. In addition, 15,106 candidate SSR markers were identified, with 11,037 pairs of PCR primers designed. Of 400 randomly selected SSR primer pairs, 364 (91%) produced PCR products. A further test with 41 loci and 20 large-scale loach specimens collected from the four largest lakes in China showed that 36 (87.8%) loci were polymorphic. The transcriptomic profile and SSR repertoire obtained in this study will facilitate population genetic studies and selective breeding of large-scale loach in the future. Copyright © 2015. Published by Elsevier B.V.
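Candidate SSR (simple sequence repeat) markers such as those reported above are typically found by scanning transcripts for short motifs repeated in tandem. The sketch below is a minimal illustration of that idea using regular expressions. The abstract does not state which tool or thresholds were used, so the minimum repeat counts and the function name are assumptions, and only perfect repeats of 2 to 6 bases are considered.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect tandem repeat."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" or "ATAT", so each locus is reported once.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits

# Toy transcript with an (AT)7 and a (CAG)5 repeat.
example = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "T"
ssrs = find_ssrs(example)
```

The validation rates quoted in the abstract follow directly from the counts: 364 of 400 primer pairs is 91%, and 36 of 41 loci is 87.8%.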
High-confidence coding and noncoding transcriptome maps

PubMed Central

2017-01-01

The advent of high-throughput RNA sequencing (RNA-seq) has led to the discovery of unprecedentedly immense transcriptomes encoded by eukaryotic genomes. However, the transcriptome maps are still incomplete partly because they were mostly reconstructed based on RNA-seq reads that lack their orientations (known as unstranded reads) and certain boundary information. Methods to expand the usability of unstranded RNA-seq data by predetermining the orientation of the reads and precisely determining the boundaries of assembled transcripts could significantly benefit the quality of the resulting transcriptome maps. Here, we present a high-performing transcriptome assembly pipeline, called CAFE, that significantly improves the original assemblies, respectively assembled with stranded and/or unstranded RNA-seq data, by orienting unstranded reads using the maximum likelihood estimation and by integrating information about transcription start sites and cleavage and polyadenylation sites. Applying large-scale transcriptomic data comprising 230 billion RNA-seq reads from the ENCODE, Human BodyMap 2.0, The Cancer Genome Atlas, and GTEx projects, CAFE enabled us to predict the directions of about 220 billion unstranded reads, which led to the construction of more accurate transcriptome maps, comparable to the manually curated map, and a comprehensive lncRNA catalog that includes thousands of novel lncRNAs. Our pipeline should not only help to build comprehensive, precise transcriptome maps from complex genomes but also to expand the universe of noncoding genomes. PMID:28396519
Large-scale identification of wheat genes resistant to cereal cyst nematode Heterodera avenae using comparative transcriptomic analysis.

PubMed

Kong, Ling-An; Wu, Du-Qing; Huang, Wen-Kun; Peng, Huan; Wang, Gao-Feng; Cui, Jiang-Kuan; Liu, Shi-Ming; Li, Zhi-Gang; Yang, Jun; Peng, De-Liang

2015-10-16

Cereal cyst nematode Heterodera avenae, an important soil-borne pathogen of wheat, causes substantial annual yield losses worldwide, and the use of resistant cultivars is the best strategy for control. However, target genes are not readily available for breeding resistant cultivars. Therefore, comparative transcriptomic analyses were performed to identify more applicable resistance genes for cultivar breeding. The developing nematodes within roots were stained with acid fuchsin solution. Transcriptome assembly and redundancy filtering were performed with Trinity, the TGI Clustering Tool and BLASTN. Gene Ontology annotation was generated with the Blast2GO program, and metabolic pathways of transcripts were analyzed with Path_finder. ROS levels were determined by luminol-chemiluminescence assay. Transcriptional gene expression profiles were obtained by quantitative RT-PCR. RNA sequencing was performed using an incompatible wheat cultivar, VP1620, and a compatible control cultivar, WEN19, infected with H. avenae at 24 h, 3 d and 8 d. Infection assays showed that VP1620 failed to block penetration of H. avenae but disturbed the transition of developmental stages, leading to a significant reduction in cyst formation. Two types of expression profiles were established to predict candidate resistance genes after developing a novel strategy to generate clean RNA-seq data by removing the transcripts of H. avenae from the raw data before assembly. Using the uncoordinated expression profiles with transcript abundance as a standard, 424 candidate resistance genes were identified, including 302 overlapping genes and 122 VP1620-specific genes. Genes with similar expression patterns were further classified according to the scale of the changes in transcript abundance, and 182 genes were rescued as supplementary candidate resistance genes. Functional characterization revealed that diverse defense-related pathways were responsible for wheat resistance against H. avenae.
Moreover, phospholipase was involved in many defense-related pathways and localized in the connection position. Furthermore, strong bursts of reactive oxygen species (ROS) within VP1620 roots infected with H. avenae were induced at 24 h and 3 d, and eight ROS-producing genes were significantly upregulated, including three class III peroxidase and five lipoxygenase genes. Large-scale identification of wheat resistance genes was performed by comparative transcriptomic analysis. Functional characterization showed that phospholipases associated with ROS production played vital roles in early defense responses to H. avenae via involvement in diverse defense-related pathways as a hub switch. This study, the first to investigate the early defense responses of wheat against H. avenae, not only provides applicable candidate resistance genes for breeding novel wheat cultivars but also enables a better understanding of the defense mechanisms of wheat against H. avenae.
Major transcriptome re-organisation and abrupt changes in signalling, cell cycle and chromatin regulation at neural differentiation in vivo.

PubMed

Olivera-Martinez, Isabel; Schurch, Nick; Li, Roman A; Song, Junfang; Halley, Pamela A; Das, Raman M; Burt, Dave W; Barton, Geoffrey J; Storey, Kate G

2014-08-01

Here, we exploit the spatial separation of temporal events of neural differentiation in the elongating chick body axis to provide the first analysis of transcriptome change in progressively more differentiated neural cell populations in vivo. Microarray data, validated against direct RNA sequencing, identified: (1) a gene cohort characteristic of the multi-potent stem zone epiblast, which contains neuro-mesodermal progenitors that progressively generate the spinal cord; (2) a major transcriptome re-organisation as cells then adopt a neural fate; and (3) increasing diversity as neural patterning and neuron production begin. Focussing on the transition from multi-potent to neural state cells, we capture changes in major signalling pathways, uncover novel Wnt and Notch signalling dynamics, and implicate new pathways (mevalonate pathway/steroid biogenesis and TGFβ). This analysis further predicts changes in cellular processes, cell cycle, RNA-processing and protein turnover as cells acquire neural fate. We show that these changes are conserved across species and provide biological evidence for reduced proteasome efficiency and a novel lengthening of S phase. This latter step may provide time for epigenetic events to mediate large-scale transcriptome re-organisation; consistent with this, we uncover simultaneous downregulation of major chromatin modifiers as the neural programme is established. We further demonstrate that transcription of one such gene, HDAC1, is dependent on FGF signalling, making a novel link between signals that control neural differentiation and transcription of a core regulator of chromatin organisation. Our work implicates new signalling pathways and dynamics, cellular processes and epigenetic modifiers in neural differentiation in vivo, identifying multiple new potential cellular and molecular mechanisms that direct differentiation. © 2014. Published by The Company of Biologists Ltd.
Transcriptome analysis of Ruditapes philippinarum hepatopancreas provides insights into immune signaling pathways under Vibrio anguillarum infection.

PubMed

Ren, Yipeng; Xue, Junli; Yang, Huanhuan; Pan, Baoping; Bu, Wenjun

2017-05-01

The Manila clam, Ruditapes philippinarum, is one of the most economically important aquatic clams harvested on a large scale by the mariculture industry in China. However, increasing reports of bacterial pathogenic diseases have had a negative effect on the aquaculture industry of R. philippinarum. In the present study, two transcriptome libraries, from untreated (termed H) and Vibrio anguillarum-challenged (termed HV) hepatopancreas of Manila clam, were constructed and sequenced using an Illumina-based paired-end sequencing platform. In total, 75,302,886 and 66,578,976 high-quality clean reads were obtained from 101,080,746 and 99,673,538 raw reads from the two transcriptome libraries described above, respectively. Furthermore, 156,116 unigenes were generated from 210,685 transcripts, with an N50 length of 1125 bp, and were annotated against the SwissProt, NR, NT, KO, GO, KOG and KEGG databases. Moreover, a total of 4071 differentially expressed unigenes (HV vs H) were detected, including 903 up-regulated and 3168 down-regulated genes. Among these differentially expressed unigenes, 226 unigenes were annotated using KEGG annotation in 16 immune-related signaling pathways, including Toll-like receptor, NF-kappa B, MAPK, NOD-like receptor, RIG-I-like receptor, and the TNF and chemokine signaling pathways. Finally, 20,341 simple sequence repeats (SSRs) and 214,430 potential single nucleotide polymorphisms (SNPs) were detected from the H and HV transcriptome libraries. In conclusion, this study identified many candidate immune-related genes and signaling pathways and provided a comparative analysis of the differentially expressed unigenes from Manila clam hepatopancreas in response to V. anguillarum stimulation. These data lay the foundation for studying the innate immune systems and defense mechanisms in R. philippinarum. Copyright © 2017 Elsevier Ltd. All rights reserved.
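The N50 length quoted above is a standard assembly statistic: the sequence length at which sequences of that length or longer account for at least half of all assembled bases. A minimal sketch of the calculation, with a hypothetical function name and toy lengths rather than the study's data:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")

# Toy example: total is 54 bases; 10 + 9 + 8 = 27 reaches half, so N50 is 8.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Unlike the mean length, N50 is weighted by bases, so many very short sequences lower it less than they lower the average.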
Transcriptome analysis reveals the time of the fourth round of genome duplication in common carp (Cyprinus carpio)

PubMed Central

2012-01-01

Background: Common carp (Cyprinus carpio) is thought to have undergone one extra round of genome duplication compared to zebrafish. Transcriptome analysis has been used to study the existence and timing of genome duplication in species for which genome sequences are incomplete. Large-scale transcriptome data for the common carp genome should help reveal the timing of the additional duplication event. Results: We have sequenced the transcriptome of common carp using 454 pyrosequencing. After assembling the 454 contigs and the published common carp sequences together, we obtained 49,669 contigs and identified genes using homology searches and an ab initio method. We identified 4,651 orthologous pairs between common carp and zebrafish and found 129,984 paralogous pairs within the common carp. An estimation of the synonymous substitution rate in the orthologous pairs indicated that common carp and zebrafish diverged 120 million years ago (MYA). We identified one round of genome duplication in common carp and estimated that it had occurred 5.6 to 11.3 MYA. In zebrafish, no genome duplication event after speciation was observed, suggesting that, compared to zebrafish, common carp had undergone an additional genome duplication event. We annotated the common carp contigs with Gene Ontology terms and KEGG pathways. Compared with zebrafish gene annotations, we found that a set of biological processes and pathways were enriched in common carp. Conclusions: The assembled contigs helped us to estimate the time of the fourth round of genome duplication in common carp. The resource that we have built as part of this study will help advance functional genomics and genome annotation studies in the future. PMID:22424280
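Dating a divergence or duplication event from synonymous substitutions usually relies on a molecular-clock relation, T = Ks / (2r), where Ks is the number of synonymous substitutions per synonymous site between two sequences and r is the substitution rate per site per year. The abstract does not give the rate or the Ks values used, so the sketch below uses hypothetical numbers purely to show the arithmetic.

```python
def divergence_time_mya(ks, rate_per_site_per_year):
    """Estimate divergence time in million years from Ks under a strict clock.

    The factor of 2 reflects that substitutions accumulate independently
    along both lineages since they split.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6

# Hypothetical inputs (not from the study): Ks = 0.7, r = 3.5e-9 per site per year.
example_mya = divergence_time_mya(0.7, 3.5e-9)
```

The same relation applies to paralogous pairs within one genome, which is how a peak in the Ks distribution of paralogs can be converted into an estimated age for a whole-genome duplication.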
Gene expression profiles of auxin metabolism in maturing apple fruit

USDA-ARS's Scientific Manuscript database

Variation exists among apple genotypes in fruit maturation and ripening patterns that influences at-harvest fruit firmness and postharvest storability. Based on the results from our previous large-scale transcriptome profiling on apple fruit maturation and well-documented auxin-ethylene crosstalk, t...
Integrative approaches for large-scale transcriptome-wide association studies

PubMed Central

Gusev, Alexander; Ko, Arthur; Shi, Huwenbo; Bhatia, Gaurav; Chung, Wonil; Penninx, Brenda W J H; Jansen, Rick; de Geus, Eco JC; Boomsma, Dorret I; Wright, Fred A; Sullivan, Patrick F; Nikkola, Elina; Alvarez, Marcus; Civelek, Mete; Lusis, Aldons J.; Lehtimäki, Terho; Raitoharju, Emma; Kähönen, Mika; Seppälä, Ilkka; Raitakari, Olli T.; Kuusisto, Johanna; Laakso, Markku; Price, Alkes L.; Pajukanta, Päivi; Pasaniuc, Bogdan

2016-01-01

Many genetic variants influence complex traits by modulating gene expression, thus altering the abundance levels of one or multiple proteins. Here, we introduce a powerful strategy that integrates gene expression measurements with summary association statistics from large-scale genome-wide association studies (GWAS) to identify genes whose cis-regulated expression is associated with complex traits. We leverage expression imputation to perform a transcriptome-wide association scan (TWAS) to identify significant expression-trait associations. We applied our approaches to expression data from blood and adipose tissue measured in ~3,000 individuals overall. We imputed gene expression into GWAS data from over 900,000 phenotype measurements to identify 69 novel genes significantly associated with obesity-related traits (BMI, lipids, and height). Many of the novel genes are associated with relevant phenotypes in the Hybrid Mouse Diversity Panel. Our results showcase the power of integrating genotype, gene expression and phenotype to gain insights into the genetic basis of complex traits. PMID:26854917
A resource of large-scale molecular markers for monitoring Agropyron cristatum chromatin introgression in wheat background based on transcriptome sequences.

PubMed

Zhang, Jinpeng; Liu, Weihua; Lu, Yuqing; Liu, Qunxing; Yang, Xinming; Li, Xiuquan; Li, Lihui

2017-09-20

Agropyron cristatum is a wild grass of the tribe Triticeae and serves as a gene donor for wheat improvement. However, very few markers can be used to monitor A. cristatum chromatin introgressions in wheat. Here, we report a resource of large-scale molecular markers for tracking alien introgressions in wheat based on transcriptome sequences. By aligning A. cristatum unigenes with the Chinese Spring reference genome sequences, we designed 9602 A. cristatum expressed sequence tag-sequence-tagged site (EST-STS) markers for PCR amplification and experimental screening. As a result, 6063 polymorphic EST-STS markers were specific for the A. cristatum P genome in the single-receipt wheat background. A total of 4956 randomly selected polymorphic EST-STS markers were further tested in eight wheat variety backgrounds, and 3070 markers displaying stable and polymorphic amplification were validated. These markers covered more than 98% of the A. cristatum genome, and the marker distribution density was approximately 1.28 cM. An application case of all EST-STS markers was validated on the A. cristatum 6P chromosome. These markers were successfully applied in the tracking of alien A. cristatum chromatin. Altogether, this study provides a universal method of large-scale molecular marker development to monitor wild relative chromatin in wheat.
Transcriptome profiling and physiological studies reveal a major role for aromatic amino acids in mercury stress tolerance in rice seedlings.

PubMed

Chen, Yun-An; Chi, Wen-Chang; Trinh, Ngoc Nam; Huang, Li-Yao; Chen, Ying-Chih; Cheng, Kai-Teng; Huang, Tsai-Lien; Lin, Chung-Yi; Huang, Hao-Jen

2014-01-01

Mercury (Hg) is a serious environmental pollution threat to the planet. The accumulation of Hg in plants disrupts many cellular-level functions and inhibits growth and development, but the mechanism is not fully understood. To gain more insight into the cellular response to Hg, we performed a large-scale analysis of the rice transcriptome during Hg stress. Genes induced with short-term exposure represented functional categories of cell-wall formation, chemical detoxification, secondary metabolism, signal transduction and abiotic stress response. Moreover, Hg stress upregulated several genes involved in aromatic amino acids (Phe and Trp) and increased the level of free Phe and Trp content. Exogenous application of Phe and Trp to rice roots enhanced tolerance to Hg and effectively reduced Hg-induced production of reactive oxygen species. Hg induced calcium accumulation and activated mitogen-activated protein kinase. Further characterization of the Hg-responsive genes we identified may be helpful for better understanding the mechanisms of Hg in plants.
Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis

DOE Office of Scientific and Technical Information (OSTI.GOV)

He, Fei; Maslov, Sergei; Yoo, Shinjae

Here, transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metadata or differences in annotation styles by different labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited to NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied the global expression landscape of the integrated dataset and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues as the former samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle, we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified the accuracy of our predictions with the samples’ metadata provided by authors.
Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis

DOE PAGES

He, Fei; Maslov, Sergei; Yoo, Shinjae; ...

2016-05-25

Here, transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metadata or differences in annotation styles by different labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited to NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied the global expression landscape of the integrated dataset and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues as the former samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle, we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified the accuracy of our predictions with the samples’ metadata provided by authors.
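The abstract does not name the "simple computational classification method" used to predict tissue type. As one plausible illustration, and explicitly an assumption rather than the authors' method, a nearest-centroid classifier assigns a sample to the tissue whose average expression profile correlates best with it. All names and the toy data below are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def tissue_centroids(samples):
    """Average the expression vectors of each tissue label.

    samples: iterable of (tissue_label, expression_vector) pairs.
    """
    grouped = {}
    for label, vector in samples:
        grouped.setdefault(label, []).append(vector)
    return {label: [mean(column) for column in zip(*vectors)]
            for label, vectors in grouped.items()}

def predict_tissue(centroids, profile):
    """Return the label whose centroid correlates best with the profile."""
    return max(centroids, key=lambda label: pearson(centroids[label], profile))

# Toy training set: four genes measured in two tissues (made-up values).
training = [
    ("root", [9, 1, 2, 8]), ("root", [8, 2, 1, 9]),
    ("leaf", [1, 9, 8, 2]), ("leaf", [2, 8, 9, 1]),
]
centroids = tissue_centroids(training)
predicted = predict_tissue(centroids, [7, 2, 3, 9])
```

A correlation-based rule like this depends on the shape of the expression profile rather than its absolute scale, which is one reason such methods can transfer between platforms, although a real pipeline would need cross-platform normalization first.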

Comparative transcriptome analysis of duckweed (Landoltia punctata) in response to cadmium provides insights into molecular mechanisms underlying hyperaccumulation.

PubMed

Xu, Hua; Yu, Changjiang; Xia, Xinli; Li, Mingliang; Li, Huiguang; Wang, Yu; Wang, Shumin; Wang, Congpeng; Ma, Yubin; Zhou, Gongke

2018-01-01

Cadmium (Cd) is a detrimental environmental pollutant. Duckweeds have been considered promising candidates for Cd phytoremediation. Although many physiological studies have been conducted, the molecular mechanisms underlying Cd hyperaccumulation in duckweeds are largely unknown. In this study, clone 6001 of Landoltia punctata, which showed high Cd tolerance, was obtained by large-scale screening of over 200 duckweed clones. Subsequently, its growth, Cd flux, Cd accumulation, and Cd distribution characteristics were investigated. To further explore the global molecular mechanism, a comprehensive transcriptome analysis was performed. For RNA-Seq, samples were treated with 20 μM CdCl2 for 0, 1, 3, and 6 days. In total, 9,461, 9,847, and 9,615 differentially expressed unigenes (DEGs) were discovered between Cd-treated and control (0 day) samples. DEG clustering and enrichment analysis identified several biological processes for coping with Cd stress. Genes involved in DNA repair acted as an early response to Cd, while RNA and protein metabolism would be likely to respond as well. Furthermore, the carbohydrate metabolic flux tended to be modulated in response to Cd stress, and upregulated genes involved in sulfur and ROS metabolism might cause high Cd tolerance. Vacuolar sequestration most likely played an important role in Cd detoxification in L. punctata 6001. These novel findings provided important clues for molecular assisted screening and breeding of Cd hyperaccumulating cultivars for phytoremediation. Copyright © 2017 Elsevier Ltd. All rights reserved.
Genome-wide inference of regulatory networks in Streptomyces coelicolor.

PubMed

Castro-Melchor, Marlene; Charaniya, Salim; Karypis, George; Takano, Eriko; Hu, Wei-Shou

2010-10-18

The onset of antibiotics production in Streptomyces species is co-ordinated with differentiation events. An understanding of the genetic circuits that regulate these coupled biological phenomena is essential to discover and engineer the pharmacologically important natural products made by these species. The availability of genomic tools and access to a large warehouse of transcriptome data for the model organism, Streptomyces coelicolor, provides incentive to decipher the intricacies of the regulatory cascades and develop biologically meaningful hypotheses. In this study, more than 500 samples of genome-wide temporal transcriptome data, comprising wild-type and more than 25 regulatory gene mutants of Streptomyces coelicolor probed across multiple stress and medium conditions, were investigated. Information based on transcript and functional similarity was used to update a previously-predicted whole-genome operon map and further applied to predict transcriptional networks constituting modules enriched in diverse functions such as secondary metabolism, and sigma factor. The predicted network displays a scale-free architecture with a small-world property observed in many biological networks. The networks were further investigated to identify functionally-relevant modules that exhibit functional coherence and a consensus motif in the promoter elements indicative of DNA-binding elements. Despite the enormous experimental as well as computational challenges, a systems approach for integrating diverse genome-scale datasets to elucidate complex regulatory networks is beginning to emerge. We present an integrated analysis of transcriptome data and genomic features to refine a whole-genome operon map and to construct regulatory networks at the cistron level in Streptomyces coelicolor. The functionally-relevant modules identified in this study pose as potential targets for further studies and verification.
Transcriptome profiling of petal abscission zone and functional analysis of AUX/IAA family genes reveal that RhIAA16 is involved in petal shedding in rose

USDA-ARS's Scientific Manuscript database

Rose is one of the most important cut flowers among ornamental plants. Rose flower longevity is largely dependent on the timing of petal shedding occurrence. To understand the molecular mechanism underlying petal abscission in rose, we performed transcriptome profiling of the petal abscission zone d...
Next-Generation Sequencing: The Translational Medicine Approach from “Bench to Bedside to Population”

PubMed Central

Beigh, Mohammad Muzafar

2016-01-01

Humans have long suspected a relationship between heredity and disease. Only at the beginning of the last century did scientists begin to discover the connections between different genes and disease phenotypes. Recent trends in next-generation sequencing (NGS) technologies have brought great momentum to biomedical research, which in turn has remarkably augmented our basic understanding of human biology and its associated diseases. State-of-the-art next-generation biotechnologies have started making huge strides in our current understanding of the mechanisms of various chronic illnesses such as cancers, metabolic disorders, and neurodegenerative anomalies. We are experiencing a renaissance in biomedical research primarily driven by next-generation biotechnologies such as genomics, transcriptomics, proteomics, metabolomics, and lipidomics. Although genomic discoveries are at the forefront of next-generation omics technologies, their implementation in the clinical arena has been painstakingly slow, mainly because of high reaction costs and the unavailability of requisite computational tools for large-scale data analysis. However, rapid innovations and the steadily falling cost of sequence-based chemistries, along with the development of advanced bioinformatics tools, have lately prompted the launch and implementation of large-scale massively parallel genome sequencing programs in fields ranging from medical genetics and infectious biology to agricultural sciences. Recent advances in large-scale omics technologies are bringing healthcare research beyond the traditional “bench to bedside” approach to more of a continuum that will include improvements in public healthcare and will be primarily based on a predictive, preventive, personalized, and participatory (P4) medicine approach.
Recent large-scale research projects in genetic and infectious disease biology have indicated that massively parallel whole-genome/whole-exome sequencing, transcriptome analysis, and other functional genomic tools can reveal a large number of unique functional elements and/or markers that would otherwise go undetected by traditional sequencing methodologies. Therefore, the latest trends in biomedical research are giving birth to a new branch of medicine commonly referred to as personalized and/or precision medicine. Developments in the post-genomic era are believed to completely restructure the present clinical pattern of disease prevention and treatment as well as methods of diagnosis and prognosis. The next important step in the direction of the precision/personalized medicine approach should be its early adoption in clinics for future medical interventions. Consequently, in the coming years, next-generation biotechnologies will reorient medical practice more towards disease prediction and prevention rather than curing diseases at later stages of their development and progression, even at the wider population level for the general public healthcare system. PMID:28930123
CAS-viewer: web-based tool for splicing-guided integrative analysis of multi-omics cancer data.

PubMed

Han, Seonggyun; Kim, Dongwook; Kim, Youngjun; Choi, Kanghoon; Miller, Jason E; Kim, Dokyoon; Lee, Younghee

2018-04-20

The Cancer Genome Atlas (TCGA) project is a public resource that provides transcriptomic, DNA sequence, methylation, and clinical data for 33 cancer types. Transforming the large size and high complexity of TCGA cancer genome data into integrated knowledge can be useful to promote cancer research. Alternative splicing (AS) is a key regulatory mechanism of genes in human cancer development and in the interaction with epigenetic factors. Therefore, AS-guided integration of existing TCGA data sets will make it easier to gain insight into the genetic architecture of cancer risk and related outcomes. There are already existing tools analyzing and visualizing alternative mRNA splicing patterns for large-scale RNA-seq experiments. However, these existing web-based tools are limited to the analysis of individual TCGA data sets at a time, such as only transcriptomic information. We implemented CAS-viewer (integrative analysis of Cancer genome data based on Alternative Splicing), a web-based tool leveraging multi-cancer omics data from TCGA. It illustrates alternative mRNA splicing patterns along with methylation, miRNAs, and SNPs, and then provides an analysis tool to link differential transcript expression ratio to methylation, miRNA, and splicing regulatory elements for 33 cancer types. Moreover, one can analyze AS patterns with clinical data to identify potential transcripts associated with different survival outcome for each cancer. CAS-viewer is a web-based application for transcript isoform-driven integration of multi-omics data in multiple cancer types and will aid in the visualization and possible discovery of biomarkers for cancer by integrating multi-omics data from TCGA.
Cell-type- and tissue-specific transcriptomes of the white spruce (Picea glauca) bark unmask fine-scale spatial patterns of constitutive and induced conifer defense.

PubMed

Celedon, Jose M; Yuen, Macaire M S; Chiang, Angela; Henderson, Hannah; Reid, Karen E; Bohlmann, Jörg

2017-11-01

Plant defenses often involve specialized cells and tissues. In conifers, specialized cells of the bark are important for defense against insects and pathogens. Using laser microdissection, we characterized the transcriptomes of cortical resin duct cells, phenolic cells and phloem of white spruce (Picea glauca) bark under constitutive and methyl jasmonate (MeJa)-induced conditions, and we compared these transcriptomes with the transcriptome of the bark tissue complex. Overall, ~3700 bark transcripts were differentially expressed in response to MeJa. Approximately 25% of transcripts were expressed in only one cell type, revealing cell specialization at the transcriptome level. MeJa caused cell-type-specific transcriptome responses and changed the overall patterns of cell-type-specific transcript accumulation. Comparison of transcriptomes of the conifer bark tissue complex and specialized cells resolved a masking effect inherent to transcriptome analysis of complex tissues, and showed the actual cell-type-specific transcriptome signatures. Characterization of cell-type-specific transcriptomes is critical to reveal the dynamic patterns of spatial and temporal display of constitutive and induced defense systems in a complex plant tissue or organ. This was demonstrated with the improved resolution of spatially restricted expression of sets of genes of secondary metabolism in the specialized cell types. © 2017 The Authors The Plant Journal published by John Wiley & Sons Ltd and Society for Experimental Biology.
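The idea of a transcript being "expressed in only one cell type" can be illustrated with a small sketch. Everything below is hypothetical: the expression values, the transcript names, and the detection threshold are made up for illustration and are not taken from the study, whose actual criteria are not stated in the abstract.

```python
from typing import Dict


def cell_type_specific(
    expr: Dict[str, Dict[str, float]], threshold: float = 1.0
) -> Dict[str, str]:
    """Map each transcript detected in exactly one cell type to that cell type.

    `threshold` is an assumed detection cutoff, not a value from the study.
    """
    specific = {}
    for transcript, by_cell in expr.items():
        detected = [cell for cell, value in by_cell.items() if value >= threshold]
        if len(detected) == 1:
            specific[transcript] = detected[0]
    return specific


# Toy expression table (hypothetical values) for the three cell types named above.
toy_expr = {
    "t1": {"resin_duct": 12.0, "phenolic": 0.0, "phloem": 0.2},
    "t2": {"resin_duct": 5.0, "phenolic": 7.5, "phloem": 3.1},
    "t3": {"resin_duct": 0.0, "phenolic": 0.4, "phloem": 9.0},
    "t4": {"resin_duct": 2.2, "phenolic": 1.8, "phloem": 0.0},
}

specific = cell_type_specific(toy_expr)
fraction_specific = len(specific) / len(toy_expr)
print(specific, fraction_specific)
```

The same table, averaged across cell types, would hide t1 and t3 behind the broadly expressed t2, which is one way to picture the masking effect described for the bark tissue complex.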
The grapevine expression atlas reveals a deep transcriptome shift driving the entire plant into a maturation program.

PubMed

Fasoli, Marianna; Dal Santo, Silvia; Zenoni, Sara; Tornielli, Giovanni Battista; Farina, Lorenzo; Zamboni, Anita; Porceddu, Andrea; Venturini, Luca; Bicego, Manuele; Murino, Vittorio; Ferrarini, Alberto; Delledonne, Massimo; Pezzotti, Mario

2012-09-01

We developed a genome-wide transcriptomic atlas of grapevine (Vitis vinifera) based on 54 samples representing green and woody tissues and organs at different developmental stages as well as specialized tissues such as pollen and senescent leaves. Together, these samples expressed ∼91% of the predicted grapevine genes. Pollen and senescent leaves had unique transcriptomes reflecting their specialized functions and physiological status. However, microarray and RNA-seq analysis grouped all the other samples into two major classes based on maturity rather than organ identity, namely, the vegetative/green and mature/woody categories. This division represents a fundamental transcriptomic reprogramming during the maturation process and was highlighted by three statistical approaches identifying the transcriptional relationships among samples (correlation analysis), putative biomarkers (O2PLS-DA approach), and sets of strongly and consistently expressed genes that define groups (topics) of similar samples (biclustering analysis). Gene coexpression analysis indicated that the mature/woody developmental program results from the reiterative coactivation of pathways that are largely inactive in vegetative/green tissues, often involving the coregulation of clusters of neighboring genes and global regulation based on codon preference. This global transcriptomic reprogramming during maturation has not been observed in herbaceous annual species and may be a defining characteristic of perennial woody plants.
The genome- and transcriptome-wide analysis of innate immunity in the brown planthopper, Nilaparvata lugens

PubMed Central

2013-01-01

Background The brown planthopper (Nilaparvata lugens) is one of the most serious rice plant pests in Asia. N. lugens causes extensive rice damage by sucking rice phloem sap, which results in stunted plant growth and the transmission of plant viruses. Despite the importance of this insect pest, little is known about the immunological mechanisms occurring in this hemimetabolous insect species. Results In this study, we performed a genome- and transcriptome-wide analysis aimed at the immune-related genes. The transcriptome datasets include the N. lugens intestine, the developmental stage, wing formation, and sex-specific expression information that provided useful gene expression sequence data for the genome-wide analysis. As a result, we identified a large number of genes encoding N. lugens pattern recognition proteins, modulation proteins in the prophenoloxidase (proPO) activating cascade, immune effectors, and the signal transduction molecules involved in the immune pathways, including the Toll, Immune deficiency (Imd) and Janus kinase signal transducers and activators of transcription (JAK-STAT) pathways. The genome-scale analysis revealed detailed information on the gene structure, distribution and transcription orientations in scaffolds. A comparison of the genome-available hemimetabolous and metabolous insect species indicates differences in the immune-related gene constitution. We investigated the gene expression profiles with regard to how they responded to bacterial infections, as well as their tissue, developmental and sex specificity. Conclusions The genome- and transcriptome-wide analysis of immune-related genes including pattern recognition and modulation molecules, immune effectors, and the signal transduction molecules involved in the immune pathways is an important step in determining the overall architecture and functional network of the immune components in N. lugens. 
Our findings provide the comprehensive gene sequence resource and expression profiles of the immune-related genes of N. lugens, which could facilitate the understanding of the innate immune mechanisms in the hemimetabolous insect species. These data give insight into clarifying the potential functional roles of the immune-related genes involved in the biological processes of development, reproduction, and virus transmission in N. lugens. PMID:23497397
RNA-Seq analysis of yak ovary: improving yak gene structure information and mining reproduction-related genes.

PubMed

Lan, DaoLiang; Xiong, XianRong; Wei, YanLi; Xu, Tong; Zhong, JinCheng; Zhi, XiangDong; Wang, Yong; Li, Jian

2014-09-01

RNA-Seq, a high-throughput (HT) sequencing technique, has been used effectively in large-scale transcriptomic studies, and is particularly useful for improving gene structure information and mining of new genes. In this study, RNA-Seq HT technology was employed to analyze the transcriptome of yak ovary. After Illumina-Solexa deep sequencing, 26826516 clean reads with a total of 4828772880 bp were obtained from the ovary library. Alignment analysis showed that 16992 yak genes mapped to the yak genome and 3734 of these genes were involved in alternative splicing. Gene structure refinement analysis showed that 7340 genes that were annotated in the yak genome could be extended at the 5' or 3' ends based on the alignments between the transcripts and the genome sequence. Novel transcript prediction analysis identified 6321 new transcripts with lengths ranging from 180 to 14884 bp, and 2267 of them were predicted to code proteins. BLAST analysis of the new transcripts showed that 1200–4933 mapped to the non-redundant (nr), nucleotide (nt) and/or SwissProt sequence databases. Comparative statistical analysis of the new mapped transcripts showed that the majority of them were similar to genes in Bos taurus (41.4%), Bos grunniens mutus (33.0%), Ovis aries (6.3%), Homo sapiens (2.8%), Mus musculus (1.6%) and other species. Functional analysis showed that these expressed genes were involved in various Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes pathways. GO analysis of the new transcripts found that the largest proportion of them was associated with reproduction. The results of this study will provide a basis for describing the normal transcriptome map of yak ovary and for future studies on yak breeding performance. Moreover, the results confirmed that RNA-Seq HT technology is highly advantageous in improving gene structure information and mining of new genes, as well as in providing valuable data to expand the yak genome information.
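The read statistics reported in this abstract can be sanity-checked with a little arithmetic. The sketch below uses only the counts quoted above; the variable names are illustrative.

```python
# Counts as reported in the abstract above.
clean_reads = 26_826_516
total_bases = 4_828_772_880
mapped_genes = 16_992
alt_spliced_genes = 3_734
novel_transcripts = 6_321
novel_coding = 2_267

# Mean number of bases per clean read.
mean_read_length = total_bases / clean_reads

# Share of mapped genes reported as involved in alternative splicing.
alt_spliced_fraction = alt_spliced_genes / mapped_genes

# Share of novel transcripts predicted to code for proteins.
novel_coding_fraction = novel_coding / novel_transcripts

print(f"mean bases per clean read: {mean_read_length:.0f}")
print(f"alternatively spliced genes: {alt_spliced_fraction:.1%}")
print(f"novel transcripts predicted coding: {novel_coding_fraction:.1%}")
```

The first ratio comes out at exactly 180 bases per clean read, roughly one in five mapped genes shows alternative splicing, and a little over a third of the novel transcripts are predicted to be protein-coding.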
First Transcriptome and Digital Gene Expression Analysis in Neuroptera with an Emphasis on Chemoreception Genes in Chrysopa pallens (Rambur)

PubMed Central

Li, Zhao-Qun; Zhang, Shuai; Ma, Yan; Luo, Jun-Yu; Wang, Chun-Yi; Lv, Li-Min; Dong, Shuang-Lin; Cui, Jin-Jie

2013-01-01

Background Chrysopa pallens (Rambur) are the most important natural enemies and predators of various agricultural pests. Understanding the sophisticated olfactory system in insect antennae is crucial for studying the physiological bases of olfaction and could also lead to effective applications of C. pallens in integrated pest management. However, no transcriptome information is available for Neuroptera, and sequence data for C. pallens are scarce, so obtaining more sequence data is a priority for researchers on this species. Results To facilitate identifying sets of genes involved in olfaction, a normalized transcriptome of C. pallens was sequenced. A total of 104,603 contigs were obtained and assembled into 10,662 clusters and 39,734 singletons; 20,524 were annotated based on BLASTX analyses. A large number of candidate chemosensory genes were identified, including 14 odorant-binding proteins (OBPs), 22 chemosensory proteins (CSPs), 16 ionotropic receptors, 14 odorant receptors, and genes potentially involved in olfactory modulation. To better understand the OBPs, CSPs and cytochrome P450s, phylogenetic trees were constructed. In addition, 10 digital gene expression libraries of different tissues were constructed and gene expression profiles were compared among different tissues in males and females. Conclusions Our results provide a basis for exploring the mechanisms of chemoreception in C. pallens, as well as other insects. The evolutionary analyses in our study provide new insights into the differentiation and evolution of insect OBPs and CSPs. Our study provided large-scale sequence information for further studies in C. pallens. PMID:23826220
Liver Transcriptome Analysis of the Large Yellow Croaker (Larimichthys crocea) during Fasting by Using RNA-Seq

PubMed Central

Qian, Baoying; Xue, Liangyi; Huang, Hongli

2016-01-01

The large yellow croaker (Larimichthys crocea) is an economically important fish species in the Chinese mariculture industry. To understand the molecular basis underlying the response to fasting, Illumina HiSeq™ 2000 was used to analyze the liver transcriptome of fasting large yellow croakers. A total of 54,933,550 clean reads were obtained and assembled into 110,364 contigs. Annotation to the NCBI database identified a total of 38,728 unigenes, of which 19,654 were classified into Gene Ontology and 22,683 were found in Kyoto Encyclopedia of Genes and Genomes (KEGG). Comparative analysis of the expression profiles between fasting fish and normal-feeding fish identified a total of 7,623 differentially expressed genes (P < 0.05), including 2,500 upregulated genes and 5,123 downregulated genes. Dramatic differences were observed in the genes involved in metabolic pathways such as fat digestion and absorption, citrate cycle, and glycolysis/gluconeogenesis, and similar results were also found in the transcriptome of skeletal muscle. Further qPCR analysis confirmed that the genes encoding the factors involved in those pathways changed significantly in expression level. The results of the present study provide insights into the molecular mechanisms underlying the metabolic response of the large yellow croaker to fasting and identify areas that require further investigation. PMID:26967898
Rare Cell Detection by Single-Cell RNA Sequencing as Guided by Single-Molecule RNA FISH.

PubMed

Torre, Eduardo; Dueck, Hannah; Shaffer, Sydney; Gospocic, Janko; Gupte, Rohit; Bonasio, Roberto; Kim, Junhyong; Murray, John; Raj, Arjun

2018-02-28

Although single-cell RNA sequencing can reliably detect large-scale transcriptional programs, it is unclear whether it accurately captures the behavior of individual genes, especially those that express only in rare cells. Here, we use single-molecule RNA fluorescence in situ hybridization as a gold standard to assess trade-offs in single-cell RNA-sequencing data for detecting rare cell expression variability. We quantified the gene expression distribution for 26 genes that range from ubiquitous to rarely expressed and found that the correspondence between estimates across platforms improved with both transcriptome coverage and increased number of cells analyzed. Further, by characterizing the trade-off between transcriptome coverage and number of cells analyzed, we show that when the number of genes required to answer a given biological question is small, then greater transcriptome coverage is more important than analyzing large numbers of cells. More generally, our report provides guidelines for selecting quality thresholds for single-cell RNA-sequencing experiments aimed at rare cell analyses. Copyright © 2018 Elsevier Inc. All rights reserved.
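The trade-off between transcriptome coverage and cell number can be pictured with a deliberately simple probability model. This is a toy sketch under stated assumptions, not the analysis used in the paper: each transcript molecule is assumed to be captured independently with probability `capture_prob` (a stand-in for coverage), and a fixed fraction of cells is assumed to express the gene. All function and parameter names are hypothetical, and the model addresses only detection, not the accuracy of expression estimates.

```python
def p_detect_in_cell(molecules: int, capture_prob: float) -> float:
    """Chance that at least one of `molecules` transcripts is captured in one cell."""
    return 1.0 - (1.0 - capture_prob) ** molecules


def p_find_expressing_cell(
    n_cells: int, rare_fraction: float, molecules: int, capture_prob: float
) -> float:
    """Chance that at least one sequenced cell both expresses the gene and shows it."""
    per_cell = rare_fraction * p_detect_in_cell(molecules, capture_prob)
    return 1.0 - (1.0 - per_cell) ** n_cells


# Two hypothetical designs: fewer cells at high capture vs. many cells at low capture.
deep_few = p_find_expressing_cell(
    n_cells=200, rare_fraction=0.01, molecules=5, capture_prob=0.5
)
shallow_many = p_find_expressing_cell(
    n_cells=2000, rare_fraction=0.01, molecules=5, capture_prob=0.05
)
print(f"deep, few cells:     {deep_few:.3f}")
print(f"shallow, many cells: {shallow_many:.3f}")
```

Varying `molecules`, `capture_prob`, and `n_cells` shows how the balance shifts with expression level, which is the kind of question the quality-threshold guidelines in the report are meant to inform.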
Transcriptome analysis of nitric oxide-responsive genes in upland cotton (Gossypium hirsutum).

PubMed

Huang, Juan; Wei, Hengling; Li, Libei; Yu, Shuxun

2018-01-01

Nitric oxide (NO) is an important signaling molecule with diverse physiological functions in plants. It is therefore important to characterize the downstream genes and signal transduction networks modulated by NO. Here, we identified 1,932 differentially expressed genes (DEGs) responding to NO in upland cotton using high-throughput tag sequencing. The results of quantitative real-time polymerase chain reaction (qRT-PCR) analysis of 25 DEGs showed good consistency. Gene Ontology (GO) and KEGG pathway analyses were performed to gain a better understanding of these DEGs. We identified 157 DEGs belonging to 36 transcription factor (TF) families and 72 DEGs related to eight plant hormones, among which several TF families and hormones were involved in stress responses. Hydrogen peroxide and malondialdehyde (MDA) contents were increased, as were related genes, after treatment with sodium nitroprusside (SNP) (an NO donor), suggesting a role for NO in the plant stress response. Finally, we compared the current and previous data, which indicated a massive number of NO-responsive genes at the large-scale transcriptome level. This study evaluated the landscape of NO-responsive genes in cotton and identified the involvement of NO in the stress response. Some of the identified DEGs represent good candidates for further functional analysis in cotton.
Transcriptional Regulation of Fruit Ripening by Tomato FRUITFULL Homologs and Associated MADS Box Proteins

PubMed Central

Fujisawa, Masaki; Shima, Yoko; Nakagawa, Hiroyuki; Kitagawa, Mamiko; Kimbara, Junji; Nakano, Toshitsugu; Kasumi, Takafumi; Ito, Yasuhiro

2014-01-01

The tomato (Solanum lycopersicum) MADS box FRUITFULL homologs FUL1 and FUL2 act as key ripening regulators and interact with the master regulator MADS box protein RIPENING INHIBITOR (RIN). Here, we report the large-scale identification of direct targets of FUL1 and FUL2 by transcriptome analysis of FUL1/FUL2 suppressed fruits and chromatin immunoprecipitation coupled with microarray analysis (ChIP-chip) targeting tomato gene promoters. The ChIP-chip and transcriptome analysis identified FUL1/FUL2 target genes that contain at least one genomic region bound by FUL1 or FUL2 (regions that occur mainly in their promoters) and exhibit FUL1/FUL2-dependent expression during ripening. These analyses identified 860 direct FUL1 targets and 878 direct FUL2 targets; this set of genes includes both direct targets of RIN and nontargets of RIN. Functional classification of the FUL1/FUL2 targets revealed that these FUL homologs function in many biological processes via the regulation of ripening-related gene expression, both in cooperation with and independent of RIN. Our in vitro assay showed that the FUL homologs, RIN, and tomato AGAMOUS-LIKE1 form DNA binding complexes, suggesting that tetramer complexes of these MADS box proteins are mainly responsible for the regulation of ripening. PMID:24415769
Transcriptome of interstitial cells of Cajal reveals unique and selective gene signatures

PubMed Central

Park, Paul J.; Fuchs, Robert; Wei, Lai; Jorgensen, Brian G.; Redelman, Doug; Ward, Sean M.; Sanders, Kenton M.

2017-01-01

Transcriptome-scale data can reveal essential clues into understanding the underlying molecular mechanisms behind specific cellular functions and biological processes. Transcriptomics is a continually growing field of research utilized in biomarker discovery. The transcriptomic profile of interstitial cells of Cajal (ICC), which serve as slow-wave electrical pacemakers for gastrointestinal (GI) smooth muscle, has yet to be uncovered. Using copGFP-labeled ICC mice and flow cytometry, we isolated ICC populations from the murine small intestine and colon and obtained their transcriptomes. In analyzing the transcriptome, we identified a unique set of ICC-restricted markers including transcription factors, epigenetic enzymes/regulators, growth factors, receptors, protein kinases/phosphatases, and ion channels/transporters. This analysis provides new and unique insights into the cellular and biological functions of ICC in GI physiology. Additionally, we constructed an interactive ICC genome browser (http://med.unr.edu/physio/transcriptome) based on the UCSC genome database. To our knowledge, this is the first online resource that provides a comprehensive library of all known genetic transcripts expressed in primary ICC. Our genome browser offers a new perspective into the alternative expression of genes in ICC and provides a valuable reference for future functional studies. PMID:28426719
Identification and characterization of large DNA deletions affecting oil quality traits in soybean seeds through transcriptome sequencing analysis.

PubMed

Goettel, Wolfgang; Ramirez, Martha; Upchurch, Robert G; An, Yong-Qiang Charles

2016-08-01

Identification and characterization of a 254-kb genomic deletion on a duplicated chromosome segment that resulted in a low level of palmitic acid in soybean seeds using transcriptome sequencing. A large number of soybean genotypes varying in seed oil composition and content have been identified. Understanding the molecular mechanisms underlying these variations is important for breeders to effectively utilize them as a genetic resource. Through design and application of a bioinformatics approach, we identified nine co-regulated gene clusters by comparing seed transcriptomes of nine soybean genotypes varying in oil composition and content. We demonstrated that four gene clusters in the genotypes M23, Jack and N0304-303-3 coincided with large-scale genome rearrangements. The co-regulated gene clusters in M23 and Jack mapped to a previously described 164-kb deletion and a copy number amplification of the Rhg1 locus, respectively. The coordinately down-regulated gene clusters in N0304-303-3 were caused by a 254-kb deletion containing 19 genes including a fatty acyl-ACP thioesterase B gene (FATB1a). This deletion was associated with reduced palmitic acid content in seeds and was the molecular cause of a previously reported nonfunctional FATB1a allele, fapnc. The M23 and N0304-303-3 deletions were located in duplicated genome segments retained from the Glycine-specific whole genome duplication that occurred 13 million years ago. The homoeologous genes in these duplicated regions shared a strong similarity in both their encoded protein sequences and transcript accumulation levels, suggesting that they may have conserved and important functions in seeds. The functional conservation of homoeologous genes may result in genetic redundancy and gene dosage effects for their associated seed traits, explaining why the large deletion did not cause lethal effects or completely eliminate palmitic acid in N0304-303-3.
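One way to picture how a run of coordinately down-regulated neighboring genes can point to a deletion is a simple scan along the chromosome. The sketch below is an illustrative assumption, not the bioinformatics approach designed in the study: the function name, the log2 fold-change cutoff, the minimum run length, and the toy values are all hypothetical.

```python
from typing import List, Tuple


def downregulated_runs(
    log2_fc: List[float], cutoff: float = -1.0, min_genes: int = 3
) -> List[Tuple[int, int]]:
    """Find runs of consecutive genes whose log2 fold change is <= cutoff.

    Genes are assumed to be listed in chromosomal order. Each run is returned
    as (start, end) indices with `end` exclusive; runs shorter than
    `min_genes` are ignored.
    """
    runs = []
    start = None
    for i, value in enumerate(log2_fc):
        if value <= cutoff:
            if start is None:
                start = i  # a new candidate run begins here
        else:
            if start is not None and i - start >= min_genes:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last gene.
    if start is not None and len(log2_fc) - start >= min_genes:
        runs.append((start, len(log2_fc)))
    return runs


# Hypothetical log2 fold changes for 12 neighboring genes in one genotype.
toy = [0.1, -0.2, -2.5, -3.0, -2.8, -4.1, 0.3, -1.5, 0.0, -2.0, -2.2, -1.9]
runs = downregulated_runs(toy)
print(runs)
```

Candidate runs found this way would still need confirmation at the genome level, as was done for the deletions described above.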
SONAR Discovers RNA-Binding Proteins from Analysis of Large-Scale Protein-Protein Interactomes.

PubMed

Brannan, Kristopher W; Jin, Wenhao; Huelga, Stephanie C; Banks, Charles A S; Gilmore, Joshua M; Florens, Laurence; Washburn, Michael P; Van Nostrand, Eric L; Pratt, Gabriel A; Schwinn, Marie K; Daniels, Danette L; Yeo, Gene W

2016-10-20

RNA metabolism is controlled by an expanding, yet incomplete, catalog of RNA-binding proteins (RBPs), many of which lack characterized RNA binding domains. Approaches to expand the RBP repertoire to discover non-canonical RBPs are currently needed. Here, HaloTag fusion pull down of 12 nuclear and cytoplasmic RBPs followed by quantitative mass spectrometry (MS) demonstrates that proteins interacting with multiple RBPs in an RNA-dependent manner are enriched for RBPs. This motivated SONAR, a computational approach that predicts RNA binding activity by analyzing large-scale affinity precipitation-MS protein-protein interactomes. Without relying on sequence or structure information, SONAR identifies 1,923 human, 489 fly, and 745 yeast RBPs, including over 100 human candidate RBPs that contain zinc finger domains. Enhanced CLIP confirms RNA binding activity and identifies transcriptome-wide RNA binding sites for SONAR-predicted RBPs, revealing unexpected RNA binding activity for disease-relevant proteins and DNA binding proteins. Copyright © 2016 Elsevier Inc. All rights reserved.
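The observation that proteins interacting with several RBPs tend to be RBPs themselves suggests a neighborhood-based score. The sketch below is only a toy illustration of that intuition and should not be read as the SONAR algorithm, whose details are not given in the abstract; the function name, the scoring rule, and the protein identifiers are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def rbp_neighbor_fraction(
    interactions: Iterable[Tuple[str, str]], known_rbps: Set[str]
) -> Dict[str, float]:
    """Score each protein by the fraction of its interaction partners that are known RBPs."""
    neighbors: Dict[str, Set[str]] = {}
    for a, b in interactions:
        # Treat interactions as undirected.
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {
        protein: len(partners & known_rbps) / len(partners)
        for protein, partners in neighbors.items()
    }


# Hypothetical interactome: R1 and R2 are annotated RBPs, the rest are unannotated.
toy_interactions = [
    ("P1", "R1"), ("P1", "R2"), ("P1", "X1"),
    ("P2", "X1"), ("P2", "X2"),
    ("R1", "R2"),
]
scores = rbp_neighbor_fraction(toy_interactions, known_rbps={"R1", "R2"})
print(sorted(scores.items(), key=lambda item: -item[1]))
```

In this toy network P1 scores higher than P2 because two of its three partners are annotated RBPs, so it would be the stronger candidate for experimental follow-up.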
Dual RNA-seq reveals no plastic transcriptional response of the coccidian parasite Eimeria falciformis to host immune defenses.

PubMed

Ehret, Totta; Spork, Simone; Dieterich, Christoph; Lucius, Richard; Heitlinger, Emanuel

2017-09-05

Parasites can either respond to differences in immune defenses that exist between individual hosts plastically or, alternatively, follow a genetically canalized ("hard wired") program of infection. Assuming that large-scale functional plasticity would be discernible in the parasite transcriptome we have performed a dual RNA-seq study of the lifecycle of Eimeria falciformis using infected mice with different immune status as models for coccidian infections. We compared parasite and host transcriptomes (dual transcriptome) between naïve and challenge infected mice, as well as between immune competent and immune deficient ones. Mice with different immune competence show transcriptional differences as well as differences in parasite reproduction (oocyst shedding). Broad gene categories represented by differently abundant host genes indicate enrichments for immune reaction and tissue repair functions. More specifically, TGF-beta, EGF, TNF and IL-1 and IL-6 are examples of functional annotations represented differently depending on host immune status. Much in contrast, parasite transcriptomes were neither different between Coccidia isolated from immune competent and immune deficient mice, nor between those harvested from naïve and challenge infected mice. Instead, parasite transcriptomes have distinct profiles early and late in infection, characterized largely by biosynthesis or motility associated functional gene groups, respectively. Extracellular sporozoite and oocyst stages showed distinct transcriptional profiles and sporozoite transcriptomes were found enriched for species specific genes and likely pathogenicity factors. We propose that the niche and host-specific parasite E. falciformis uses a genetically canalized program of infection. This program is likely fixed in an evolutionary process rather than employing phenotypic plasticity to interact with its host. 
This in turn might limit the potential of the parasite to adapt to new host species or niches, forcing it to coevolve with its host.
ATGC transcriptomics: a web-based application to integrate, explore and analyze de novo transcriptomic data.

PubMed

Gonzalez, Sergio; Clavijo, Bernardo; Rivarola, Máximo; Moreno, Patricio; Fernandez, Paula; Dopazo, Joaquín; Paniego, Norma

2017-02-22

In recent years, applications based on massively parallelized RNA sequencing (RNA-seq) have become valuable approaches for studying non-model species, e.g., those without a fully sequenced genome. RNA-seq is a useful tool for detecting novel transcripts and genetic variations and for evaluating differential gene expression by digital measurements. The large and complex datasets resulting from functional genomic experiments represent a challenge in data processing, management, and analysis. This problem is especially significant for small research groups working with non-model species. We developed a web-based application, called ATGC transcriptomics, with a flexible and adaptable interface that allows users to work with next-generation sequencing (NGS) transcriptomic analysis results using an ontology-driven database. This new application simplifies data exploration, visualization, and integration for a better comprehension of the results. ATGC transcriptomics gives non-expert computer users and small research groups access to a scalable storage option and simple data integration, including database administration and management. The software is freely available under the terms of the GNU public license at http://atgcinta.sourceforge.net.
Comparison of the Transcriptomes of Ginger (Zingiber officinale Rosc.) and Mango Ginger (Curcuma amada Roxb.) in Response to the Bacterial Wilt Infection

PubMed Central

Prasath, Duraisamy; Karthika, Raveendran; Habeeba, Naduva Thadath; Suraby, Erinjery Jose; Rosana, Ottakandathil Babu; Shaji, Avaroth; Eapen, Santhosh Joseph; Deshpande, Uday; Anandaraj, Muthuswamy

2014-01-01

Bacterial wilt in ginger (Zingiber officinale Rosc.) caused by Ralstonia solanacearum is one of the most important production constraints in tropical, sub-tropical and warm temperate regions of the world. The lack of resistant genotypes adds constraints to crop management. However, mango ginger (Curcuma amada Roxb.), which is resistant to R. solanacearum, is a potential donor, if the exact mechanism of resistance is understood. To identify genes involved in resistance to R. solanacearum, we have sequenced the transcriptome from wilt-sensitive ginger and wilt-resistant mango ginger using Illumina sequencing technology. A total of 26387032 and 22268804 paired-end reads were obtained after quality filtering for C. amada and Z. officinale, respectively. A total of 36359 and 32312 assembled transcript sequences were obtained from both the species. The functions of the unigenes cover a diverse set of molecular functions and biological processes, among which we identified a large number of genes associated with resistance to stresses and response to biotic stimuli. Large-scale expression profiling showed that many of the disease resistance-related genes were expressed more in C. amada. Comparative analysis also identified genes belonging to different pathways of plant defense against biotic stresses that are differentially expressed in either ginger or mango ginger. The identification of many differentially expressed defense-related genes provides insights into the mechanism of resistance to R. solanacearum and into potential pathways involved in responses to the pathogen. Also, several candidate genes that may underlie the difference in resistance to R. solanacearum between ginger and mango ginger were identified. Finally, we have developed a web resource, ginger transcriptome database, which provides public access to the data. 
Our study is among the first to demonstrate the use of Illumina short read sequencing for de novo transcriptome assembly and comparison in non-model species of Zingiberaceae. PMID:24940878
Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution

PubMed Central

Clarke, Thomas H.; Garb, Jessica E.; Hayashi, Cheryl Y.; Arensburger, Peter; Ayoub, Nadia A.

2015-01-01

The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). PMID:26058392
Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants.

PubMed

Li, Xinguo; Wu, Harry X; Southerton, Simon G

2010-06-21

Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. The xylem transcriptome is highly conserved in conifers, but considerably divergent in angiosperms. The functional domains of genes in the xylem transcriptome are moderately to highly conserved in vascular plants, suggesting the existence of a common ancestral xylem transcriptome. Compared to the total transcriptome derived from a range of tissues, the xylem transcriptome is relatively conserved in vascular plants. Of the xylem transcriptome, cell wall genes, ancestral xylem genes, known proteins and transcription factors are relatively more conserved in vascular plants. A total of 527 putative xylem orthologs were identified, which are unevenly distributed across the Arabidopsis chromosomes with eight hot spots observed. Phylogenetic analysis revealed that evolution of the xylem transcriptome has paralleled plant evolution. We also identified 274 conifer-specific xylem unigenes, all of which are of unknown function. These xylem orthologs and conifer-specific unigenes are likely to have played a crucial role in xylem evolution. Conifers have highly conserved xylem transcriptomes, while angiosperm xylem transcriptomes are relatively diversified. Vascular plants share a common ancestral xylem transcriptome. The xylem transcriptomes of vascular plants are more conserved than the total transcriptomes. Evolution of the xylem transcriptome has largely followed the trend of plant evolution.
Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants

PubMed Central

2010-01-01

Background Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. Results The xylem transcriptome is highly conserved in conifers, but considerably divergent in angiosperms. The functional domains of genes in the xylem transcriptome are moderately to highly conserved in vascular plants, suggesting the existence of a common ancestral xylem transcriptome. Compared to the total transcriptome derived from a range of tissues, the xylem transcriptome is relatively conserved in vascular plants. Of the xylem transcriptome, cell wall genes, ancestral xylem genes, known proteins and transcription factors are relatively more conserved in vascular plants. A total of 527 putative xylem orthologs were identified, which are unevenly distributed across the Arabidopsis chromosomes with eight hot spots observed. Phylogenetic analysis revealed that evolution of the xylem transcriptome has paralleled plant evolution. We also identified 274 conifer-specific xylem unigenes, all of which are of unknown function. These xylem orthologs and conifer-specific unigenes are likely to have played a crucial role in xylem evolution. Conclusions Conifers have highly conserved xylem transcriptomes, while angiosperm xylem transcriptomes are relatively diversified. Vascular plants share a common ancestral xylem transcriptome. The xylem transcriptomes of vascular plants are more conserved than the total transcriptomes. Evolution of the xylem transcriptome has largely followed the trend of plant evolution. PMID:20565927
Elucidating and mining the Tulipa and Lilium transcriptomes.

PubMed

Moreno-Pachon, Natalia M; Leeggangers, Hendrika A C F; Nijveen, Harm; Severing, Edouard; Hilhorst, Henk; Immink, Richard G H

2016-10-01

Genome sequencing remains a challenge for species with large and complex genomes containing extensive repetitive sequences, of which the bulbous and monocotyledonous plants tulip and lily are examples. In such a case, sequencing of only the active part of the genome, represented by the transcriptome, is a good alternative to obtain information about gene content. In this study we aimed to generate a high quality transcriptome of tulip and lily and to make this data available as an open-access resource via a user-friendly web-based interface. The Illumina HiSeq 2000 platform was applied and the transcribed RNA was sequenced from a collection of different lily and tulip tissues, respectively. In order to obtain good transcriptome coverage and to facilitate effective data mining, assembly was done using different filtering parameters for clearing out contamination and noise of the RNAseq datasets. This analysis revealed limitations of commonly applied methods and parameter settings used in de novo transcriptome assembly. The final created transcriptomes are publicly available via a user friendly Transcriptome browser ( http://www.bioinformatics.nl/bulbs/db/species/index ). The usefulness of this resource has been exemplified by a search for all potential transcription factors in lily and tulip, with special focus on the TCP transcription factor family. This analysis and other quality parameters point out the quality of the transcriptomes, which can serve as a basis for further genomics studies in lily, tulip, and bulbous plants in general.
New Markers for Predicting Fertility of the Male Gametes in the Post Genomic Age.

PubMed

Dipresa, Savina; De Toni, Luca; Foresta, Carlo; Garolla, Andrea

2018-04-18

A number of tests have been proposed to assess male fertility potential, ranging from routine light-microscopic evaluation of semen samples to screening tests for DNA integrity that look for sperm chromatin abnormalities. Spermatozoa are extremely differentiated cells with critical functions in embryo development and heredity, in addition to delivering a haploid paternal genome to the oocyte. Towards this goal, certain requirements must always be met. The ability of spermatozoa to perform their reproductive function is acquired during spermatogenesis, a highly specialized process that depends on multiple factors affecting male fertility. In the past 30 years, large-scale analyses of transcriptome and genome expression in mammals have generated a large amount of information on the numerous biomolecules involved in spermatogenesis and male germ cell reproductive function. The sperm proteome represents the protein content that spermatozoa need to survive and function correctly, and modifications of the sperm proteome play a role in determining functional changes that lead to decreased reproductive competence in affected spermatozoa. The post-genomic approach combines different methodologies, concurrently applying testicular transcriptome studies, protein compositional analysis and metabolomics of human spermatozoa. Copyright © Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Defining the Human Macula Transcriptome and Candidate Retinal Disease Genes Using EyeSAGE

PubMed Central

Rickman, Catherine Bowes; Ebright, Jessica N.; Zavodni, Zachary J.; Yu, Ling; Wang, Tianyuan; Daiger, Stephen P.; Wistow, Graeme; Boon, Kathy; Hauser, Michael A.

2009-01-01

Purpose To develop large-scale, high-throughput annotation of the human macula transcriptome and to identify and prioritize candidate genes for inherited retinal dystrophies, based on ocular-expression profiles using serial analysis of gene expression (SAGE). Methods Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR. Results Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified. Conclusions The EyeSAGE database, combining three different gene-profiling platforms including the authors’ multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts and to prioritize candidate genes within mapped retinal disease regions. PMID:16723438
Defining the human macula transcriptome and candidate retinal disease genes using EyeSAGE.

PubMed

Bowes Rickman, Catherine; Ebright, Jessica N; Zavodni, Zachary J; Yu, Ling; Wang, Tianyuan; Daiger, Stephen P; Wistow, Graeme; Boon, Kathy; Hauser, Michael A

2006-06-01

To develop large-scale, high-throughput annotation of the human macula transcriptome and to identify and prioritize candidate genes for inherited retinal dystrophies, based on ocular-expression profiles using serial analysis of gene expression (SAGE). Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR. Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified. The EyeSAGE database, combining three different gene-profiling platforms including the authors' multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts and to prioritize candidate genes within mapped retinal disease regions.
Comparative transcriptome analysis of the Asteraceae halophyte Karelinia caspica under salt stress.

PubMed

Zhang, Xia; Liao, Maoseng; Chang, Dan; Zhang, Fuchun

2014-12-17

Much attention has been given to the potential of halophytes as sources of tolerance traits for introduction into cereals. However, a great deal remains unknown about the diverse mechanisms employed by halophytes to cope with salinity. To characterize salt tolerance mechanisms underlying Karelinia caspica, an Asteraceae halophyte, we performed large-scale transcriptomic analysis using a high-throughput Illumina sequencing platform. Comparative gene expression analysis was performed to correlate the effects of salt stress and ABA regulation at the molecular level. Total sequence reads generated by pyrosequencing were assembled into 287,185 non-redundant transcripts with an average length of 652 bp. Using the BLAST function in the Swiss-Prot, NCBI nr, GO, KEGG, and KOG databases, a total of 216,416 coding sequences associated with known proteins were annotated. Among these, 35,533 unigenes were classified into 69 gene ontology categories, and 18,378 unigenes were classified into 202 known pathways. Based on the fold changes observed when comparing the salt stress and control samples, 60,127 unigenes were differentially expressed, with 38,122 and 22,005 up- and down-regulated, respectively. Several of the differentially expressed genes are known to be involved in the signaling pathway of the plant hormone ABA, including ABA metabolism, transport, and sensing as well as the ABA signaling cascade. Transcriptome profiling of K. caspica contributes to a comprehensive understanding of this species at the molecular level. Moreover, the global survey of differentially expressed genes in this species under salt stress and analyses of the effects of salt stress and ABA regulation will contribute to the identification and characterization of genes and molecular mechanisms underlying salt stress responses in Asteraceae plants.
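The fold-change comparison described in this abstract can be illustrated with a small sketch. The reported totals are internally consistent (38,122 up-regulated + 22,005 down-regulated = 60,127 differentially expressed unigenes). The abstract does not state the thresholds or normalization used, so the function name, the two-fold cutoff, the pseudocount, and the expression values below are all assumptions chosen only for illustration.

```python
import math

def classify_by_fold_change(control, treated, threshold=1.0, pseudocount=1.0):
    """Label each gene 'up', 'down', or 'unchanged' from its log2 fold change.

    threshold=1.0 corresponds to a two-fold change; both it and the
    pseudocount are illustrative assumptions, not values from the study.
    """
    labels = {}
    for gene in control:
        # The pseudocount avoids division by zero for unexpressed genes.
        ratio = (treated[gene] + pseudocount) / (control[gene] + pseudocount)
        log2_fc = math.log2(ratio)
        if log2_fc >= threshold:
            labels[gene] = "up"
        elif log2_fc <= -threshold:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels

# Hypothetical normalized expression values (not data from the study).
control = {"geneA": 10.0, "geneB": 50.0, "geneC": 20.0}
treated = {"geneA": 45.0, "geneB": 10.0, "geneC": 22.0}
labels = classify_by_fold_change(control, treated)
print(labels)
```

A real analysis would normally pair the fold change with a statistical test and multiple-testing correction; the sketch covers only the up/down labelling step.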
Machine Learning–Based Differential Network Analysis: A Study of Stress-Responsive Transcriptomes in Arabidopsis

PubMed Central

Ma, Chuang; Xin, Mingming; Feldmann, Kenneth A.; Wang, Xiangfeng

2014-01-01

Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA), and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. The mlDNA first used an ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive “noninformative” genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained “informative” genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes. PMID:24520154
Impact of Transcriptomics on Our Understanding of Pulmonary Fibrosis

PubMed Central

Vukmirovic, Milica; Kaminski, Naftali

2018-01-01

Idiopathic pulmonary fibrosis (IPF) is a lethal fibrotic lung disease characterized by aberrant remodeling of the lung parenchyma with extensive changes to the phenotypes of all lung resident cells. The introduction of transcriptomics, genome scale profiling of thousands of RNA transcripts, caused a significant inversion in IPF research. Instead of generating hypotheses based on animal models of disease, or biological plausibility, with limited validation in humans, investigators were able to generate hypotheses based on unbiased molecular analysis of human samples and then use animal models of disease to test their hypotheses. In this review, we describe the insights made from transcriptomic analysis of human IPF samples. We describe how transcriptomic studies led to identification of novel genes and pathways involved in the human IPF lung such as: matrix metalloproteinases, WNT pathway, epithelial genes, role of microRNAs among others, as well as conceptual insights such as the involvement of developmental pathways and deep shifts in epithelial and fibroblast phenotypes. The impact of lung and transcriptomic studies on disease classification, endotype discovery, and reproducible biomarkers is also described in detail. Despite these impressive achievements, the impact of transcriptomic studies has been limited because they analyzed bulk tissue and did not address the cellular and spatial heterogeneity of the IPF lung. We discuss new emerging technologies and applications, such as single-cell RNAseq and microenvironment analysis that may address cellular and spatial heterogeneity. We end by making the point that most current tissue collections and resources are not amenable to analysis using the novel technologies. To take advantage of the new opportunities, we need new efforts of sample collections, this time focused on access to all the microenvironments and cells in the IPF lung. PMID:29670881
Transcriptomic analysis identifies genes and pathways related to myrmecophagy in the Malayan pangolin (Manis javanica)

PubMed Central

Ma, Jing-E; Li, Lin-Miao; Jiang, Hai-Ying; Zhang, Xiu-Juan; Li, Juan; Li, Guan-Yu; Yuan, Li-Hong; Wu, Jun

2017-01-01

The Malayan pangolin (Manis javanica) is an unusual, scale-covered, toothless mammal that specializes in myrmecophagy. Due to their threatened status and continuing decline in the wild, concerted efforts have been made to conserve and rescue this species in captivity in China. Maintaining this species in captivity is a significant challenge, partly because little is known of the molecular mechanisms of its digestive system. Here, the first large-scale sequencing analyses of the salivary gland, liver and small intestine transcriptomes of an adult M. javanica genome were performed, and the results were compared with published liver transcriptome profiles for a pregnant M. javanica female. A total of 24,452 transcripts were obtained, among which 22,538 were annotated on the basis of seven databases. In addition, 3,373 new genes were predicted, of which 1,459 were annotated. Several pathways were found to be involved in myrmecophagy, including olfactory transduction, amino sugar and nucleotide sugar metabolism, lipid metabolism, and terpenoid and polyketide metabolism pathways. Many of the annotated transcripts were involved in digestive functions: 997 transcripts were related to sensory perception, 129 were related to digestive enzyme gene families, and 199 were related to molecular transporters. One transcript for an acidic mammalian chitinase was found in the annotated data, and this might be closely related to the unique digestive function of pangolins. These pathways and transcripts are involved in specialization processes related to myrmecophagy (a form of insectivory) and carbohydrate, protein and lipid digestive pathways, probably reflecting adaptations to myrmecophagy. Our study is the first to investigate the molecular mechanisms underlying myrmecophagy in M. javanica, and we hope that our results may play a role in the conservation of this species. PMID:29302388
Epigenetic transgenerational inheritance of somatic transcriptomes and epigenetic control regions

PubMed Central

2012-01-01

Background Environmentally induced epigenetic transgenerational inheritance of adult onset disease involves a variety of phenotypic changes, suggesting a general alteration in genome activity. Results Investigation of different tissue transcriptomes in male and female F3 generation vinclozolin versus control lineage rats demonstrated all tissues examined had transgenerational transcriptomes. The microarrays from 11 different tissues were compared with a gene bionetwork analysis. Although each tissue transgenerational transcriptome was unique, common cellular pathways and processes were identified between the tissues. A cluster analysis identified gene modules with coordinated gene expression and each had unique gene networks regulating tissue-specific gene expression and function. A large number of statistically significant over-represented clusters of genes were identified in the genome for both males and females. These gene clusters ranged from 2-5 megabases in size, and a number of them corresponded to the epimutations previously identified in sperm that transmit the epigenetic transgenerational inheritance of disease phenotypes. Conclusions Combined observations demonstrate that all tissues derived from the epigenetically altered germ line develop transgenerational transcriptomes unique to the tissue, but common epigenetic control regions in the genome may coordinately regulate these tissue-specific transcriptomes. This systems biology approach provides insight into the molecular mechanisms involved in the epigenetic transgenerational inheritance of a variety of adult onset disease phenotypes. PMID:23034163
5'-Serial Analysis of Gene Expression studies reveal a transcriptomic switch during fruiting body development in Coprinopsis cinerea

PubMed Central

2013-01-01

Background The transition from the vegetative mycelium to the primordium during fruiting body development is the most complex and critical developmental event in the life cycle of many basidiomycete fungi. Understanding the molecular mechanisms underlying this process has long been a goal of research on basidiomycetes. Large scale assessment of the expressed transcriptomes of these developmental stages will facilitate the generation of a more comprehensive picture of the mushroom fruiting process. In this study, we coupled 5'-Serial Analysis of Gene Expression (5'-SAGE) to high-throughput pyrosequencing from 454 Life Sciences to analyze the transcriptomes and identify up-regulated genes among vegetative mycelium (Myc) and stage 1 primordium (S1-Pri) of Coprinopsis cinerea during fruiting body development. Results We evaluated the expression of >3,000 genes in the two respective growth stages and discovered that almost one-third of these genes were preferentially expressed in either stage. This identified a significant turnover of the transcriptome during the course of fruiting body development. Additionally, we annotated more than 79,000 transcription start sites (TSSs) based on the transcriptomes of the mycelium and stage 1 primordium stages. Patterns of enrichment based on gene annotations from the GO and KEGG databases indicated that various structural and functional protein families were uniquely employed in either stage and that during primordial growth, cellular metabolism is highly up-regulated. Various signaling pathways such as the cAMP-PKA, MAPK and TOR pathways were also identified as up-regulated, consistent with the model that sensing of nutrient levels and the environment are important in this developmental transition. More than 100 up-regulated genes were also found to be unique to mushroom forming basidiomycetes, highlighting the novelty of fruiting body development in the fungal kingdom.
Conclusions We implicated a wealth of new candidate genes important to early stages of mushroom fruiting development, though their precise molecular functions and biological roles are not yet fully known. This study serves to advance our understanding of the molecular mechanisms of fruiting body development in the model mushroom C. cinerea. PMID:23514374
Genome scale transcriptomics of baculovirus-insect interactions.

PubMed

Nguyen, Quan; Nielsen, Lars K; Reid, Steven

2013-11-12

Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.
Low Temperature and Short-Term High-CO2 Treatment in Postharvest Storage of Table Grapes at Two Maturity Stages: Effects on Transcriptome Profiling

PubMed Central

Rosales, Raquel; Romero, Irene; Fernandez-Caballero, Carlos; Escribano, M. Isabel; Merodio, Carmen; Sanchez-Ballesta, M. Teresa

2016-01-01

Table grapes (Vitis vinifera cv. Cardinal) are highly perishable and their quality deteriorates during postharvest storage at low temperature, mainly because of sensitivity to fungal decay and senescence of the rachis. The application of a 3-day CO2 treatment (20 kPa CO2 + 20 kPa O2 + 60 kPa N2) at 0°C reduced total decay and retained fruit quality in early and late-harvested table grapes during postharvest storage. In order to study the transcriptional responsiveness of table grapes to low temperature and high CO2 levels in the first stage of storage and how the maturity stage affects these changes, we have performed a comparative large-scale transcriptional analysis using the custom-made GrapeGen GeneChip®. In the first stage of storage, low temperature led to a significantly intense change in the grape skin transcriptome irrespective of fruit maturity, although there were different changes within each stage. In the case of CO2 treated samples, in comparison to fruit at time zero, only slight differences were observed. Functional enrichment analysis revealed that major modifications in the transcriptome profile of early- and late-harvested grapes stored at 0°C are linked to biotic and abiotic stress-responsive terms. However, in both cases there is a specific reprogramming of the transcriptome during the first stage of storage at 0°C in order to withstand the cold stress. Thus, genes involved in gluconeogenesis, photosynthesis, mRNA translation and lipid transport were up-regulated in the case of early-harvested grapes, and genes related to protein folding stability and intracellular membrane trafficking in late-harvested grapes. The beneficial effect of high CO2 treatment in maintaining table grape quality seems to be an active process requiring the induction of several transcription factors and kinases in early-harvested grapes, and the activation of processes associated with the maintenance of energy in late-harvested grapes. PMID:27468290
Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.

PubMed

Gerstein, Mark B; Lu, Zhi John; Van Nostrand, Eric L; Cheng, Chao; Arshinoff, Bradley I; Liu, Tao; Yip, Kevin Y; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P; Barber, Galt; Brdlik, Cathleen M; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O; Dernburg, Abby F; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A; Gassmann, Reto; Good, Peter J; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S; Habegger, Lukas; Han, Ting; Henikoff, Jorja G; Henz, Stefan R; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K; Kolasinska-Zwierz, Paulina; Lai, Eric C; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M; Muroyama, Andrew; Murray, John I; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J; Slightam, Cindie; Smith, Richard; Spencer, William C; Stinson, E O; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L; Whittle, Christina M; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C; Micklem, Gos; Liu, X Shirley; Reinke, Valerie; Kim, Stuart K; Hillier, LaDeana W; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D; Waterston, Robert H

2010-12-24

We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution.

PubMed

van Iterson, Maarten; van Zwet, Erik W; Heijmans, Bastiaan T

2017-01-27

We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking.
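The inflation and bias described in this abstract can be illustrated with two simple summary statistics. The sketch below is not the authors' Bayesian method; it is a minimal illustration that assumes the test statistics are available as z-scores. It computes the genomic inflation factor familiar from GWAS (median of the squared z-scores divided by the median of a 1-df chi-square distribution) and a crude robust estimate of the null center and scale. The function names are hypothetical.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(z_scores):
    """Genomic inflation factor: median(z^2) / median of chi-square(1).

    Values well above 1 suggest inflated test statistics.
    """
    return statistics.median(z * z for z in z_scores) / CHI2_1DF_MEDIAN


def robust_null(z_scores):
    """Crude empirical-null estimate from the bulk of the z-scores.

    Returns (center, scale), where center is the median (bias) and scale is
    1.4826 * MAD, which approximates the standard deviation under normality.
    This is an illustrative shortcut, not the method proposed in the paper.
    """
    z_scores = list(z_scores)
    center = statistics.median(z_scores)
    mad = statistics.median(abs(z - center) for z in z_scores)
    return center, 1.4826 * mad
```

Under a well-calibrated null, the inflation factor is close to 1, the center close to 0, and the scale close to 1; departures from these values indicate inflation or bias.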
Transcriptomics of cortical gray matter thickness decline during normal aging

PubMed Central

Kochunov, P; Charlesworth, J; Winkler, A; Hong, LE; Nichols, T; Curran, JE; Sprooten, E; Jahanshad, N; Thompson, PM; Johnson, MP; Kent, JW; Landman, BA; Mitchell, B; Cole, SA; Dyer, TD; Moses, EK; Goring, HHH; Almasy, L; Duggirala, R; Olvera, RL; Glahn, DC; Blangero, J

2013-01-01

Introduction: We performed a whole-transcriptome correlation analysis, followed by pathway enrichment analysis and testing of innate immune response pathways, to evaluate the hypothesis that transcriptional activity can predict cortical gray matter thickness (GMT) variability during normal cerebral aging. Methods: Transcriptome and GMT data were available for 379 individuals (age range=28–85), community-dwelling members of large extended Mexican-American families. Collection of transcriptome data preceded that of neuroimaging data by 17 years. Genome-wide gene transcriptome data consisted of 20,413 heritable lymphocyte-based transcripts. GMT measurements were performed from high-resolution (isotropic 800 µm) T1-weighted MRI. Transcriptome-wide and pathway enrichment analysis was used to classify genes correlated with GMT. Transcripts for sixty genes from seven innate immune pathways were tested as specific predictors of GMT variability. Results: Transcripts for eight genes (IGFBP3, LRRN3, CRIP2, SCD, IDS, TCF4, GATA3, HN1) passed the transcriptome-wide significance threshold. Four orthogonal factors extracted from this set predicted 31.9% of the variability in the whole-brain and between 23.4 and 35% of regional GMT measurements. Pathway enrichment analysis identified six functional categories including cellular proliferation, aggregation, differentiation, viral infection, and metabolism. The integrin signaling pathway was significantly (p<10^-6) enriched with GMT. Finally, three innate immune pathways (complement signaling, toll-receptors and scavenger and immunoglobulins) were significantly associated with GMT. Conclusion: Expression activity for the genes that regulate cellular proliferation, adhesion, differentiation and inflammation can explain a significant proportion of individual variability in cortical GMT. Our findings suggest that normal cerebral aging is the product of a progressive decline in regenerative capacity and increased neuroinflammation.
PMID:23707588
Transcriptomics of cortical gray matter thickness decline during normal aging.

PubMed

Kochunov, P; Charlesworth, J; Winkler, A; Hong, L E; Nichols, T E; Curran, J E; Sprooten, E; Jahanshad, N; Thompson, P M; Johnson, M P; Kent, J W; Landman, B A; Mitchell, B; Cole, S A; Dyer, T D; Moses, E K; Goring, H H H; Almasy, L; Duggirala, R; Olvera, R L; Glahn, D C; Blangero, J

2013-11-15

We performed a whole-transcriptome correlation analysis, followed by pathway enrichment analysis and testing of innate immune response pathways, to evaluate the hypothesis that transcriptional activity can predict cortical gray matter thickness (GMT) variability during normal cerebral aging. Transcriptome and GMT data were available for 379 individuals (age range=28-85), community-dwelling members of large extended Mexican American families. Collection of transcriptome data preceded that of neuroimaging data by 17 years. Genome-wide gene transcriptome data consisted of 20,413 heritable lymphocyte-based transcripts. GMT measurements were performed from high-resolution (isotropic 800 μm) T1-weighted MRI. Transcriptome-wide and pathway enrichment analysis was used to classify genes correlated with GMT. Transcripts for sixty genes from seven innate immune pathways were tested as specific predictors of GMT variability. Transcripts for eight genes (IGFBP3, LRRN3, CRIP2, SCD, IDS, TCF4, GATA3, and HN1) passed the transcriptome-wide significance threshold. Four orthogonal factors extracted from this set predicted 31.9% of the variability in the whole-brain and between 23.4 and 35% of regional GMT measurements. Pathway enrichment analysis identified six functional categories including cellular proliferation, aggregation, differentiation, viral infection, and metabolism. The integrin signaling pathway was significantly (p<10^-6) enriched with GMT. Finally, three innate immune pathways (complement signaling, toll-receptors and scavenger and immunoglobulins) were significantly associated with GMT. Expression activity for the genes that regulate cellular proliferation, adhesion, differentiation and inflammation can explain a significant proportion of individual variability in cortical GMT. Our findings suggest that normal cerebral aging is the product of a progressive decline in regenerative capacity and increased neuroinflammation. Copyright © 2013 Elsevier Inc. All rights reserved.

Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

PubMed

Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

2015-06-08

The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Transcriptomic configuration of mouse brain induced by adolescent exposure to 3,4-methylenedioxymethamphetamine

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eun, Jung Woo; Kwack, Seung Jun; Noh, Ji Heon

The amphetamine derivative (±)-3,4-methylenedioxymethamphetamine (MDMA or ecstasy) is a synthetic amphetamine analogue used recreationally to obtain an enhanced affiliative emotional response. MDMA is a potent monoaminergic neurotoxin with the potential to damage brain serotonin and/or dopamine neurons. As the majority of MDMA users are young adults, the risk that users may expose the fetus to MDMA is a concern. However, the majority of studies on MDMA have investigated the effects on adult animals. Here, we investigated whether long-term exposure to MDMA, especially in adolescence, could induce comprehensive transcriptional changes in mouse brain. Transcriptomic analysis of mouse brain regions demonstrated significant gene expression changes in the cerebral cortex. Supervised analysis identified 1028 genes that were chronically dysregulated by long-term exposure to MDMA in adolescent mice. The functional categories most represented in this characteristic MDMA signature are intracellular molecular signaling pathways of neurotoxicity, such as the MAPK signaling pathway, the Wnt signaling pathway, neuroactive ligand-receptor interaction, long-term potentiation, and the long-term depression signaling pathway. Although the association between these large-scale molecular changes and the functional brain damage caused by MDMA remains to be studied, our observations delineate the possible neurotoxic effects of MDMA on brain function, and have therapeutic implications concerning neuro-pathological conditions associated with MDMA abuse.
De Novo Transcriptome Sequencing Reveals Important Molecular Networks and Metabolic Pathways of the Plant, Chlorophytum borivilianum

PubMed Central

Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

2013-01-01

Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high-throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single-end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with a coverage size of 22.42 Mb and an average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large-scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689
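The RPKM measure mentioned in this abstract normalizes a transcript's read count by both transcript length (in kilobases) and sequencing depth (in millions of mapped reads). The following is a minimal sketch of the standard formula; the function and parameter names are hypothetical and do not come from the study's pipeline.

```python
def rpkm(transcript_reads, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = reads / (length_bp / 1e3) / (total_mapped_reads / 1e6)
         = reads * 1e9 / (length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return transcript_reads * 1e9 / (transcript_length_bp * total_mapped_reads)


# Illustrative numbers only: 500 reads on a 2,000 bp transcript
# in a library of 10 million mapped reads.
example = rpkm(500, 2_000, 10_000_000)
```

Because the length term appears in the denominator, a long transcript and a short transcript with the same read count receive different RPKM values, which is what makes expression levels comparable across transcripts within one sample.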
Prediction of the neuropeptidomes of members of the Astacidea (Crustacea, Decapoda) using publicly accessible transcriptome shotgun assembly (TSA) sequence data.

PubMed

Christie, Andrew E; Chi, Megan

2015-12-01

The decapod infraorder Astacidea is comprised of clawed lobsters and freshwater crayfish. Due to their economic importance and their use as models for investigating neurochemical signaling, much work has focused on elucidating their neurochemistry, particularly their peptidergic systems. Interestingly, no astacidean has been the subject of large-scale peptidomic analysis via in silico transcriptome mining, despite growing transcriptomic resources for members of this taxon. Here, the publicly accessible astacidean transcriptome shotgun assembly data were mined for putative peptide-encoding transcripts; these sequences were used to predict the structures of mature neuropeptides. One hundred seventy-six distinct peptides were predicted for Procambarus clarkii, including isoforms of adipokinetic hormone-corazonin-like peptide (ACP), allatostatin A (AST-A), allatostatin B, allatostatin C (AST-C), bursicon α, bursicon β, CCHamide, crustacean hyperglycemic hormone (CHH)/ion transport peptide (ITP), diuretic hormone 31 (DH31), eclosion hormone (EH), FMRFamide-like peptide, GSEFLamide, intocin, leucokinin, neuroparsin, neuropeptide F, pigment dispersing hormone, pyrokinin, RYamide, short neuropeptide F (sNPF), SIFamide, sulfakinin and tachykinin-related peptide (TRP). Forty-six distinct peptides, including isoforms of AST-A, AST-C, bursicon α, CCHamide, CHH/ITP, DH31, EH, intocin, myosuppressin, neuroparsin, red pigment concentrating hormone, sNPF and TRP, were predicted for Pontastacus leptodactylus, with a bursicon β and a neuroparsin predicted for Cherax quadricarinatus. The identification of ACP is the first from a decapod, while the predictions of CCHamide, EH, GSEFLamide, intocin, neuroparsin and RYamide are firsts for the Astacidea. Collectively, these data greatly expand the catalog of known astacidean neuropeptides and provide a foundation for functional studies of peptidergic signaling in members of this decapod infraorder. Copyright © 2015 Elsevier Inc. All rights reserved.
De Novo transcriptome sequencing reveals important molecular networks and metabolic pathways of the plant, Chlorophytum borivilianum.

PubMed

Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

2013-01-01

Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high-throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single-end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with a coverage size of 22.42 Mb and an average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large-scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum.
CONVERGENT TRANSCRIPTOMICS AND PROTEOMICS OF ENVIRONMENTAL ENRICHMENT AND COCAINE IDENTIFIES NOVEL THERAPEUTIC STRATEGIES FOR ADDICTION

PubMed Central

ZHANG, YAFANG; CROFTON, ELIZABETH J.; FAN, XIUZHEN; LI, DINGGE; KONG, FANPING; SINHA, MALA; LUXON, BRUCE A.; SPRATT, HEIDI M.; LICHTI, CHERYL F.; GREEN, THOMAS A.

2016-01-01

Transcriptomic and proteomic approaches have separately proven effective at identifying novel mechanisms affecting addiction-related behavior; however, it is difficult to prioritize the many promising leads from each approach. A convergent secondary analysis of proteomic and transcriptomic results can glean additional information to help prioritize promising leads. The current study is a secondary analysis of the convergence of recently published separate transcriptomic and proteomic analyses of nucleus accumbens (NAc) tissue from rats subjected to environmental enrichment vs. isolation and cocaine self-administration vs. saline. Multiple bioinformatics approaches (e.g. Gene Ontology (GO) analysis, Ingenuity Pathway Analysis (IPA), and Gene Set Enrichment Analysis (GSEA)) were used to interrogate these rich data sets. Although there was little correspondence between mRNA vs. protein at the individual target level, good correspondence was found at the level of gene/protein sets, particularly for the environmental enrichment manipulation. These data identify gene sets where there is a positive relationship between changes in mRNA and protein (e.g. glycolysis, ATP synthesis, translation elongation factor activity, etc.) and gene sets where there is an inverse relationship (e.g. ribosomes, Rho GTPase signaling, protein ubiquitination, etc.). Overall environmental enrichment produced better correspondence than cocaine self-administration. The individual targets contributing to mRNA and protein effects were largely not overlapping. As a whole, these results confirm that robust transcriptomic and proteomic data sets can provide similar results at the gene/protein set level even when there is little correspondence at the individual target level and little overlap in the targets contributing to the effects. PMID:27717806
Comparative transcriptome analysis between planarian Dugesia japonica and other platyhelminth species.

PubMed

Nishimura, Osamu; Hirao, Yukako; Tarui, Hiroshi; Agata, Kiyokazu

2012-06-29

Planarians are considered to be among the extant animals close to one of the earliest groups of organisms that acquired a central nervous system (CNS) during evolution. Planarians have a bilobed brain with nine lateral branches from which a variety of external signals are projected into different portions of the main lobes. Various interneurons process different signals to regulate behavior and learning/memory. Furthermore, planarians have robust regenerative ability and are attracting attention as a new model organism for the study of regeneration. Here we conducted large-scale EST analysis of the head region of the planarian Dugesia japonica to construct a database of the head-region transcriptome, and then performed comparative analyses among related species. A total of 54,752 high-quality EST reads were obtained from a head library of the planarian Dugesia japonica, and 13,167 unigene sequences were produced by de novo assembly. A new method devised here revealed that proteins related to metabolism and defense mechanisms have high flexibility of amino-acid substitutions within the planarian family. Eighty-two CNS-development genes were found in the planarian (cf. C. elegans 3; chicken 129). Comparative analysis revealed that 91% of the planarian CNS-development genes could be mapped onto the schistosome genome, but one-third of these shared genes were not expressed in the schistosome. We constructed a database that is a useful resource for comparative planarian transcriptome studies. Analysis comparing homologous genes between two planarian species showed that the potential of genes is important for accumulation of amino-acid substitutions. The presence of many CNS-development genes in our database supports the notion that the planarian has a fundamental brain with regard to evolution and development at not only the morphological/functional, but also the genomic, level. In addition, our results indicate that the planarian CNS-development genes already existed before the divergence of planarians and schistosomes from their common ancestor.
Transcriptome Analysis of the Differentially Expressed Genes in the Male and Female Shrub Willows (Salix suchowensis)

PubMed Central

Liu, Jingjing; Yin, Tongming; Ye, Ning; Chen, Yingnan; Yin, Tingting; Liu, Min; Hassani, Danial

2013-01-01

Background: The dioecious system is relatively rare in plants. Shrub willow is an annual flowering dioecious woody plant, and possesses many characteristics that make it a good model for tracking the missing pieces of sex determination evolution. To gain a global view of the genes differentially expressed in male and female shrub willows and to develop a database for further studies, we performed large-scale transcriptome sequencing of flower buds collected separately from the two sexes. Results: In total, 1,201,931 high-quality reads were obtained, with an average length of 389 bp and a total length of 467.96 Mb. The ESTs were assembled into 29,048 contigs and 132,709 singletons. These unigenes were further functionally annotated by comparing their sequences to different protein and functional domain databases and were assigned Gene Ontology (GO) terms. A biochemical pathway database containing 291 predicted pathways was also created based on the annotations of the unigenes. Digital expression analysis identified 806 differentially expressed genes between the male and female flower buds. Of these, 33 were located on the incipient sex chromosome of Salicaceae, among which 12 genes might, empirically, be involved in plant sex determination. These genes merit special attention in future studies. Conclusions: In this study, a large number of EST sequences were generated from the flower buds of a male and a female shrub willow. We also reported the differentially expressed genes between the flowers of the two sexes. This work provides valuable information and sequence resources for uncovering the sex-determining genes and for future functional genomics analysis of Salicaceae spp. PMID:23560075
The Secret Life of RNA: Lessons from Emerging Methodologies.

PubMed

Medioni, Caroline; Besse, Florence

2018-01-01

The past decade has witnessed a revolution in our appreciation of transcriptome complexity and regulation. This remarkable expansion in our knowledge largely originates from the advent of high-throughput methodologies, and the subsequent discovery that up to 90% of eukaryotic genomes are transcribed, thus generating an unexpectedly large range of noncoding RNAs (Hangauer et al., 15(4):112, 2014). Besides leading to the identification of new noncoding RNA species, transcriptome-wide studies have uncovered novel layers of posttranscriptional regulatory mechanisms controlling RNA processing, maturation or translation, each contributing to the precise and dynamic regulation of gene expression. Remarkably, the development of systems-level studies has been accompanied by tremendous progress in the visualization of individual RNA molecules in single cells, such that it is now possible to image RNA species with single-molecule resolution from birth to translation or decay. Monitoring quantitatively, with unprecedented spatiotemporal resolution, the fate of individual molecules has been key to understanding the molecular mechanisms underlying the different steps of RNA regulation. This has also revealed biologically relevant, intracellular and intercellular heterogeneities in RNA distribution or regulation. More recently, the convergence of imaging and high-throughput technologies has led to the emergence of spatially resolved transcriptomic techniques that provide a means to perform large-scale analyses while preserving spatial information. By generating transcriptome-wide data on single-cell RNA content, or even subcellular RNA distribution, these methodologies are opening avenues to a wide range of network-level studies at the cell and organ level, and promise to strongly improve disease diagnosis and treatment. In this introductory chapter, we highlight how recently developed technologies aiming at detecting and visualizing RNA molecules have contributed to the emergence of entirely new research fields, and to dramatic progress in our understanding of gene expression regulation.
Developmental Gene Discovery in a Hemimetabolous Insect: De Novo Assembly and Annotation of a Transcriptome for the Cricket Gryllus bimaculatus

PubMed Central

Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W.; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G.

2013-01-01

Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus. 
PMID:23671567
Functional Analysis of the Drosophila Embryonic Germ Cell Transcriptome by RNA Interference

PubMed Central

Bujna, Ágnes; Vilmos, Péter; Spirohn, Kerstin; Boutros, Michael; Erdélyi, Miklós

2014-01-01

In Drosophila melanogaster, primordial germ cells are specified at the posterior pole of the very early embryo. This process is regulated by the posterior localized germ plasm that contains a large number of RNAs of maternal origin. Transcription in the primordial germ cells is actively down-regulated until germ cell fate is established. Bulk expression of the zygotic genes commences concomitantly with the degradation of the maternal transcripts. Thus, during embryogenesis, maternally provided and zygotically transcribed mRNAs determine germ cell development collectively. In an effort to identify novel genes involved in the regulation of germ cell behavior, we carried out a large-scale RNAi screen targeting both maternal and zygotic components of the embryonic germ line transcriptome. We identified 48 genes necessary for distinct stages in germ cell development. We found pebble and fascetto to be essential for germ cell migration and germ cell division, respectively. Our data uncover a previously unanticipated role of mei-P26 in maintenance of embryonic germ cell fate. We also performed systematic co-RNAi experiments, through which we found a low rate of functional redundancy among homologous gene pairs. As our data indicate a high degree of evolutionary conservation in genetic regulation of germ cell development, they are likely to provide valuable insights into the biology of the germ line in general. PMID:24896584
Transcriptomics reveals tissue/organ-specific differences in gene expression in the starfish Patiria pectinifera.

PubMed

Kim, Chan-Hee; Go, Hye-Jin; Oh, Hye Young; Jo, Yong Hun; Elphick, Maurice R; Park, Nam Gyu

2018-02-01

Starfish (Phylum Echinodermata) are of interest from an evolutionary perspective because as deuterostomian invertebrates they occupy an "intermediate" phylogenetic position with respect to chordates (e.g. vertebrates) and protostomian invertebrates (e.g. Drosophila). Furthermore, starfish are model organisms for research on fertilization, embryonic development, innate immunity and tissue regeneration. However, large-scale molecular data for starfish tissues/organs are limited. To provide a comprehensive genetic resource for the starfish Patiria pectinifera, we report de novo transcriptome assemblies and global gene expression analysis for six P. pectinifera tissues/organs - body wall (BW), coelomic epithelium (CE), tube feet (TF), stomach (SM), pyloric caeca (PC) and gonad (GN). A total of 408 million high-quality reads obtained from six cDNA libraries were assembled de novo using Trinity, resulting in a total of 549,598 contigs with a mean length of 835 nucleotides (nt), an N50 of 1473nt, and GC ratio of 42.5%. A total of 126,136 contigs (22.9%) were obtained as predicted open reading frames (ORFs) by TransDecoder, of which 102,187 were annotated with NCBI non-redundant (NR) hits, and 51,075 and 10,963 were annotated with Gene Ontology (GO) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) using the Blast2GO program, respectively. Gene expression analysis revealed that tissues/organs are grouped into three clusters: BW/CE/TF, SM/PC, and GN, which likely reflect functional relationships. 2408, 8560, 2687, 1727, 3321, and 2667 specifically expressed genes were identified for BW, GN, PC, CE, SM and TF, respectively, using the ROKU method. This study provides a valuable transcriptome resource and novel molecular insights into the functional biology of different tissues/organs in starfish as a model organism. Copyright © 2017 Elsevier B.V. All rights reserved.
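The assembly statistics quoted in this abstract (N50 and GC ratio) follow standard definitions that can be sketched briefly. The code below is an illustration of those definitions only, not the tooling used in the study; the function names are hypothetical. N50 is taken here as the length of the contig at which the running total of lengths, summed from longest to shortest, first reaches half of the total assembly length.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


def gc_fraction(sequence):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (seq.count("G") + seq.count("C")) / unambiguous
```

An N50 well above the mean contig length, as reported here (1473 nt versus 835 nt), is the usual pattern when an assembly contains many short contigs alongside a smaller number of long ones.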
Transcriptomic Modification in the Cerebral Cortex following Noninvasive Brain Stimulation: RNA-Sequencing Approach

DTIC Science & Technology

2017-04-20

was attached to the skull in order to anchor the acrylic and maintain the integrity of the head cap. 2.3. Whole Transcriptome RNA-Sequencing...no. 12, article 550, 2014. [24] D. W. Huang, B. T. Sherman, and R. A. Lempicki, “Systematic and integrative analysis of large gene lists using DAVID...BMC Bioinformatics, vol. 9, article 559, 2008. [29] Z. Hu, E. S. Snitkin, and C. DeLisi, “VisANT: an integrative framework for networks in systems
Transcriptome Meta-Analysis of Lung Cancer Reveals Recurrent Aberrations in NRG1 and Hippo Pathway Genes

PubMed Central

Dhanasekaran, Saravana M.; Balbin, O. Alejandro; Chen, Guoan; Nadal, Ernest; Kalyana-Sundaram, Shanker; Pan, Jincheng; Veeneman, Brendan; Cao, Xuhong; Malik, Rohit; Vats, Pankaj; Wang, Rui; Huang, Stephanie; Zhong, Jinjie; Jing, Xiaojun; Iyer, Matthew; Wu, Yi-Mi; Harms, Paul W.; Lin, Jules; Reddy, Rishindra; Brennan, Christine; Palanisamy, Nallasivam; Chang, Andrew C.; Truini, Anna; Truini, Mauro; Robinson, Dan R.; Beer, David G.; Chinnaiyan, Arul M.

2014-01-01

Lung cancer is emerging as a paradigm for disease molecular subtyping, facilitating targeted therapy based on driving somatic alterations. Here, we perform transcriptome analysis of 153 samples representing lung adenocarcinomas, squamous cell carcinomas, large cell lung cancer, adenoid cystic carcinomas and cell lines. By integrating our data with The Cancer Genome Atlas and published sources, we analyze 753 lung cancer samples for gene fusions and other transcriptomic alterations. We show that a higher number of gene fusions is an independent prognostic factor for poor survival in lung cancer. Our analysis confirms the recently reported CD74-NRG1 fusion and suggests that NRG1, NF1 and Hippo pathway fusions may play important roles in tumors without known driver mutations. In addition, we observe exon skipping events in c-MET, which are attributable to splice site mutations. These classes of genetic aberrations may play a significant role in the genesis of lung cancers lacking known driver mutations. PMID:25531467
Probing the evolution, ecology and physiology of marine protists using transcriptomics.

PubMed

Caron, David A; Alexander, Harriet; Allen, Andrew E; Archibald, John M; Armbrust, E Virginia; Bachy, Charles; Bell, Callum J; Bharti, Arvind; Dyhrman, Sonya T; Guida, Stephanie M; Heidelberg, Karla B; Kaye, Jonathan Z; Metzner, Julia; Smith, Sarah R; Worden, Alexandra Z

2017-01-01

Protists, which are single-celled eukaryotes, critically influence the ecology and chemistry of marine ecosystems, but genome-based studies of these organisms have lagged behind those of other microorganisms. However, recent transcriptomic studies of cultured species, complemented by meta-omics analyses of natural communities, have increased the amount of genetic information available for poorly represented branches on the tree of eukaryotic life. This information is providing insights into the adaptations and interactions between protists and other microorganisms and macroorganisms, but many of the genes sequenced show no similarity to sequences currently available in public databases. A better understanding of these newly discovered genes will lead to a deeper appreciation of the functional diversity and metabolic processes in the ocean. In this Review, we summarize recent developments in our understanding of the ecology, physiology and evolution of protists, derived from transcriptomic studies of cultured strains and natural communities, and discuss how these novel large-scale genetic datasets will be used in the future.
A SAGE based approach to human glomerular endothelium: defining the transcriptome, finding a novel molecule and highlighting endothelial diversity.

PubMed

Sengoelge, Guerkan; Winnicki, Wolfgang; Kupczok, Anne; von Haeseler, Arndt; Schuster, Michael; Pfaller, Walter; Jennings, Paul; Weltermann, Ansgar; Blake, Sophia; Sunder-Plassmann, Gere

2014-08-27

Large scale transcript analysis of human glomerular microvascular endothelial cells (HGMEC) has never been accomplished. We designed this study to define the transcriptome of HGMEC and facilitate a better characterization of these endothelial cells with unique features. Serial analysis of gene expression (SAGE) was used for its unbiased approach to quantitative acquisition of transcripts. We generated a HGMEC SAGE library consisting of 68,987 transcript tags. Then taking advantage of large public databases and advanced bioinformatics we compared the HGMEC SAGE library with a SAGE library of non-cultured ex vivo human glomeruli (44,334 tags) which contained endothelial cells. The 823 tags common to both which would have the potential to be expressed in vivo were subsequently checked against 822,008 tags from 16 non-glomerular endothelial SAGE libraries. This resulted in 268 transcript tags differentially overexpressed in HGMEC compared to non-glomerular endothelia. These tags were filtered using a set of criteria: never before shown in kidney or any type of endothelial cell, absent in all nephron regions except the glomerulus, more highly expressed than statistically expected in HGMEC. Neurogranin, a direct target of thyroid hormone action which had been thought to be brain specific and never shown in endothelial cells before, fulfilled these criteria. Its expression in glomerular endothelium in vitro and in vivo was then verified by real-time-PCR, sequencing and immunohistochemistry. Our results represent an extensive molecular characterization of HGMEC beyond a mere database, underline the endothelial heterogeneity, and propose neurogranin as a potential link in the kidney-thyroid axis.
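The tag-filtering logic described in the abstract above (keep tags shared by the cultured and ex vivo libraries, then compare them against pooled non-glomerular endothelial libraries) can be sketched in Python as follows. This is only an illustration under stated assumptions: the fold-change cutoff `min_fold`, the tags-per-million scaling and all function names are hypothetical, and the study itself relied on statistical criteria that the abstract does not detail.

```python
def tags_per_million(counts):
    """Scale raw tag counts of one library to tags per million."""
    total = sum(counts.values())
    return {tag: 1e6 * n / total for tag, n in counts.items()}


def enriched_shared_tags(cultured, ex_vivo, reference, min_fold=2.0):
    """Return tags present in both the cultured and ex vivo libraries
    whose abundance in the cultured library is at least `min_fold`
    times that in the reference library (assumed cutoff)."""
    shared = set(cultured) & set(ex_vivo)
    cultured_tpm = tags_per_million(cultured)
    reference_tpm = tags_per_million(reference)
    kept = []
    for tag in sorted(shared):
        ref = reference_tpm.get(tag, 0.0)
        # Tags absent from the reference count as enriched.
        if ref == 0.0 or cultured_tpm[tag] / ref >= min_fold:
            kept.append(tag)
    return kept


# Toy libraries (hypothetical tag sequences and counts)
cultured = {"AAAA": 50, "CCCC": 30, "GGGG": 20}
ex_vivo = {"AAAA": 5, "CCCC": 7, "TTTT": 3}
reference = {"AAAA": 10, "CCCC": 80, "GGGG": 10}

result = enriched_shared_tags(cultured, ex_vivo, reference)
print(result)
```

Scaling to tags per million before comparing matters because the libraries differ greatly in size (68,987 versus 822,008 tags in the study), so raw counts are not directly comparable.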
Large-Scale SRM Screen of Urothelial Bladder Cancer Candidate Biomarkers in Urine.

PubMed

Duriez, Elodie; Masselon, Christophe D; Mesmin, Cédric; Court, Magali; Demeure, Kevin; Allory, Yves; Malats, Núria; Matondo, Mariette; Radvanyi, François; Garin, Jérôme; Domon, Bruno

2017-04-07

Urothelial bladder cancer is a condition associated with high recurrence and substantial morbidity and mortality. Noninvasive urinary tests that would detect bladder cancer and tumor recurrence are required to significantly improve patient care. Over the past decade, numerous bladder cancer candidate biomarkers have been identified in the context of extensive proteomics or transcriptomics studies. To translate these findings into clinically useful biomarkers, the systematic evaluation of these candidates remains the bottleneck. Such evaluation involves large-scale quantitative LC-SRM (liquid chromatography-selected reaction monitoring) measurements, targeting hundreds of signature peptides by monitoring thousands of transitions in a single analysis. The design of highly multiplexed SRM analyses is driven by several factors: throughput, robustness, selectivity and sensitivity. Because of the complexity of the samples to be analyzed, some measurements (transitions) can be interfered with by coeluting isobaric species, resulting in biased or inconsistent estimated peptide/protein levels. Thus the assessment of the quality of SRM data is critical to allow flagging of these inconsistent data. We describe an efficient and robust method to process large SRM data sets, including the processing of the raw data, the detection of low-quality measurements, the normalization of the signals for each protein, and the estimation of protein levels. Using this methodology, a variety of proteins previously associated with bladder cancer have been assessed through the analysis of urine samples from a large cohort of cancer patients and corresponding controls in an effort to establish a priority list of the most promising candidates to guide subsequent clinical validation studies.
The Physcomitrella patens gene atlas project: large-scale RNA-seq based expression data.

PubMed

Perroud, Pierre-François; Haas, Fabian B; Hiss, Manuel; Ullrich, Kristian K; Alboresi, Alessandro; Amirebrahimi, Mojgan; Barry, Kerrie; Bassi, Roberto; Bonhomme, Sandrine; Chen, Haodong; Coates, Juliet C; Fujita, Tomomichi; Guyon-Debast, Anouchka; Lang, Daniel; Lin, Junyan; Lipzen, Anna; Nogué, Fabien; Oliver, Melvin J; Ponce de León, Inés; Quatrano, Ralph S; Rameau, Catherine; Reiss, Bernd; Reski, Ralf; Ricca, Mariana; Saidi, Younousse; Sun, Ning; Szövényi, Péter; Sreedasyam, Avinash; Grimwood, Jane; Stacey, Gary; Schmutz, Jeremy; Rensing, Stefan A

2018-07-01

High-throughput RNA sequencing (RNA-seq) has recently become the method of choice to define and analyze transcriptomes. For the model moss Physcomitrella patens, although this method has been used to help analyze specific perturbations, no overall reference dataset has yet been established. In the framework of the Gene Atlas project, the Joint Genome Institute selected P. patens as a flagship genome, opening the way to generate the first comprehensive transcriptome dataset for this moss. The first round of sequencing described here is composed of 99 independent libraries spanning 34 different developmental stages and conditions. Upon dataset quality control and processing through read mapping, 28 509 of the 34 361 v3.3 gene models (83%) were detected to be expressed across the samples. Differentially expressed genes (DEGs) were calculated across the dataset to permit perturbation comparisons between conditions. The analysis of the three most distinct and abundant P. patens growth stages - protonema, gametophore and sporophyte - allowed us to define both general transcriptional patterns and stage-specific transcripts. As an example of variation of physico-chemical growth conditions, we detail here the impact of ammonium supplementation under standard growth conditions on the protonemal transcriptome. Finally, the cooperative nature of this project allowed us to analyze inter-laboratory variation, as 13 different laboratories around the world provided samples. We compare differences in the replication of experiments in a single laboratory and between different laboratories. © 2018 The Authors The Plant Journal © 2018 John Wiley & Sons Ltd.
Leveraging CyVerse Resources for De Novo Comparative Transcriptomics of Underserved (Non-model) Organisms

PubMed Central

Joyce, Blake L.; Haug-Baltzell, Asher K.; Hulvey, Jonathan P.; McCarthy, Fiona; Devisetty, Upendra Kumar; Lyons, Eric

2017-01-01

This workflow allows novice researchers to leverage advanced computational resources such as cloud computing to carry out pairwise comparative transcriptomics. It also serves as a primer for biologists to develop data scientist computational skills, e.g. executing bash commands, visualization and management of large data sets. All command line code and further explanations of each command or step can be found on the wiki (https://wiki.cyverse.org/wiki/x/dgGtAQ). The Discovery Environment and Atmosphere platforms are connected through the CyVerse Data Store. As such, once the initial raw sequencing data has been uploaded there is no further need to transfer large data files over an Internet connection, minimizing the amount of time needed to conduct analyses. This protocol is designed to analyze only two experimental treatments or conditions. Differential gene expression analysis is conducted through pairwise comparisons, and will not be suitable to test multiple factors. This workflow is also designed to be manual rather than automated. Each step must be executed and investigated by the user, yielding a better understanding of data and analytical outputs, and therefore better results for the user. Once complete, this protocol will yield de novo assembled transcriptome(s) for underserved (non-model) organisms without the need to map to previously assembled reference genomes (which are usually not available for underserved organisms). These de novo transcriptomes are further used in pairwise differential gene expression analysis to investigate genes differing between two experimental conditions. Differentially expressed genes are then functionally annotated to understand the genetic response organisms have to experimental conditions. In total, the data derived from this protocol is used to test hypotheses about biological responses of underserved organisms. PMID:28518075
An interactive web application for the dissemination of human systems immunology data.

PubMed

Speake, Cate; Presnell, Scott; Domico, Kelly; Zeitner, Brad; Bjork, Anna; Anderson, David; Mason, Michael J; Whalen, Elizabeth; Vargas, Olivia; Popov, Dimitry; Rinchai, Darawan; Jourde-Chiche, Noemie; Chiche, Laurent; Quinn, Charlie; Chaussabel, Damien

2015-06-19

Systems immunology approaches have proven invaluable in translational research settings. The current rate at which large-scale datasets are generated presents unique challenges and opportunities. Mining aggregates of these datasets could accelerate the pace of discovery, but new solutions are needed to integrate the heterogeneous data types with the contextual information that is necessary for interpretation. In addition, enabling tools and technologies facilitating investigators' interaction with large-scale datasets must be developed in order to promote insight and foster knowledge discovery. State of the art application programming was employed to develop an interactive web application for browsing and visualizing large and complex datasets. A collection of human immune transcriptome datasets was loaded alongside contextual information about the samples. We provide a resource enabling interactive query and navigation of transcriptome datasets relevant to human immunology research. Detailed information about studies and samples is displayed dynamically; if desired, the associated data can be downloaded. Custom interactive visualizations of the data can be shared via email or social media. This application can be used to browse context-rich systems-scale data within and across systems immunology studies. This resource is publicly available online at [Gene Expression Browser Landing Page ( https://gxb.benaroyaresearch.org/dm3/landing.gsp )]. The source code is also available openly [Gene Expression Browser Source Code ( https://github.com/BenaroyaResearch/gxbrowser )]. We have developed a data browsing and visualization application capable of navigating increasingly large and complex datasets generated in the context of immunological studies. This intuitive tool ensures that, whether taken individually or as a whole, such datasets generated at great effort and expense remain interpretable and a ready source of insight for years to come.

Differential Response to Heat Stress in Outer and Inner Onion Bulb Scales.

PubMed

Galsurker, Ortal; Doron-Faigenboim, Adi; Teper-Bamnolker, Paula; Daus, Avinoam; Lers, Amnon; Eshel, Dani

2018-05-18

Brown protective skin formation in onion bulbs can be induced by rapid postharvest heat treatment. Onions that were peeled to different depths and were exposed to heat stress showed that only the outer scale formed dry brown skin, whereas the inner scales maintained high water content and did not change color. Our results reveal that browning of the outer scale during heat treatment is due to an enzymatic process that is associated with high levels of oxidation components, such as peroxidase and quercetin glucoside. De novo transcriptome analysis revealed differential molecular responses of the outer and inner scales to the heat stress. Genes involved in lipid metabolism, oxidation pathways and cell-wall modification were highly expressed in the outer scale during heating. Defense-response-related genes such as those encoding heat-shock proteins, antioxidative stress defense or production of osmoprotectant metabolites were mostly induced in the inner scale in response to the heat exposure. These transcriptomic data led to a conceptual model that suggests sequential processes for browning development and desiccation of the outer scales versus processes associated with defense response and heat tolerance in the inner scale. Thus, the observed physiological differences between the outer and inner scales are supported by the identified molecular differences.
TRAM (Transcriptome Mapper): database-driven creation and analysis of transcriptome maps from multiple sources

PubMed Central

2011-01-01

Background: Several tools have been developed to perform global gene expression profile data analysis, to search for specific chromosomal regions whose features meet defined criteria as well as to study neighbouring gene expression. However, most of these tools are tailored for a specific use in a particular context (e.g. they are species-specific, or limited to a particular data format) and they typically accept only gene lists as input. Results: TRAM (Transcriptome Mapper) is a new general tool that allows the simple generation and analysis of quantitative transcriptome maps, starting from any source listing gene expression values for a given gene set (e.g. expression microarrays), implemented as a relational database. It includes a parser able to assign univocal and updated gene symbols to gene identifiers from different data sources. Moreover, TRAM is able to perform intra-sample and inter-sample data normalization, including an original variant of quantile normalization (scaled quantile), useful to normalize data from platforms with highly different numbers of investigated genes. When in 'Map' mode, the software generates a quantitative representation of the transcriptome of a sample (or of a pool of samples) and identifies whether segments of defined lengths are over/under-expressed compared to the desired threshold. When in 'Cluster' mode, the software searches for a set of over/under-expressed consecutive genes. Statistical significance for all results is calculated with respect to genes localized on the same chromosome or to all genome genes. Transcriptome maps, showing differential expression between two sample groups, relative to two different biological conditions, may be easily generated. We present the results of a biological model test, based on a meta-analysis comparison between a sample pool of human CD34+ hematopoietic progenitor cells and a sample pool of megakaryocytic cells. Biologically relevant chromosomal segments and gene clusters with differential expression during the differentiation toward megakaryocyte were identified. Conclusions: TRAM is designed to create, and statistically analyze, quantitative transcriptome maps, based on gene expression data from multiple sources. The release includes a FileMaker Pro database management runtime application and is freely available at http://apollo11.isto.unibo.it/software/, along with preconfigured implementations for mapping of human, mouse and zebrafish transcriptomes. PMID:21333005
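For readers unfamiliar with the inter-sample normalization mentioned in the TRAM abstract, the following Python sketch shows standard quantile normalization, which forces every sample to share the same value distribution. It is not TRAM's "scaled quantile" variant, whose details the abstract does not give; the sketch assumes that all samples cover the same number of genes, and the function name is hypothetical.

```python
def quantile_normalize(samples):
    """Standard quantile normalization.

    samples: list of equal-length lists, one list of expression
    values per sample, with genes in the same order in each list.
    Ties are broken by gene order, a simplification of this sketch.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must cover the same number of genes")
    sorted_samples = [sorted(s) for s in samples]
    # Mean of the values that share each rank across samples.
    rank_means = [
        sum(col[i] for col in sorted_samples) / len(samples) for i in range(n)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy expression values for two samples and three genes (hypothetical)
sample_a = [5.0, 2.0, 3.0]
sample_b = [4.0, 1.0, 6.0]
norm_a, norm_b = quantile_normalize([sample_a, sample_b])
print(norm_a, norm_b)
```

Because the standard method requires equal-length samples, it cannot be applied directly to platforms that measure very different numbers of genes, which is the situation the abstract says the scaled variant was designed for.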
Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing

PubMed Central

Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li

2010-01-01

Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resolution and sensitivity afforded by RNA-Seq, we were able to identify a substantial number of novel transcripts, new exons, untranslated regions, alternative upstream initiation codons and upstream open reading frames, which provide remarkable insight into the A. oryzae transcriptome. We were also able to assess the alternative mRNA isoforms in A. oryzae and found a large number of genes undergoing alternative splicing. Many genes and pathways that might be involved in higher levels of protein production in solid-state culture than in liquid culture were identified by comparing gene expression levels between different cultures. Our analysis indicated that the transcriptome of A. oryzae is much more complex than previously anticipated, and these results may provide a blueprint for further study of the A. oryzae transcriptome. PMID:20392818
Analyses of transcriptome sequences reveal multiple ancient large-scale duplication events in the ancestor of Sphagnopsida (Bryophyta)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Devos, Nicolas; Szövényi, Péter; Weston, David J.

The goal of this study was to investigate whether there has been a whole-genome duplication (WGD) in the ancestry of Sphagnum (peatmoss) or the class Sphagnopsida, and to determine whether the timing of any such duplication(s) and patterns of paralog retention could help explain the rapid radiation and current ecological dominance of peatmosses.
Analyses of transcriptome sequences reveal multiple ancient large-scale duplication events in the ancestor of Sphagnopsida (Bryophyta)

DOE PAGES

Devos, Nicolas; Szövényi, Péter; Weston, David J.; ...

2016-02-22

The goal of this study was to investigate whether there has been a whole-genome duplication (WGD) in the ancestry of Sphagnum (peatmoss) or the class Sphagnopsida, and to determine whether the timing of any such duplication(s) and patterns of paralog retention could help explain the rapid radiation and current ecological dominance of peatmosses.
The large-scale investigation of gene expression in Leymus chinensis stigmas provides a valuable resource for understanding the mechanisms of poaceae self-incompatibility.

PubMed

Zhou, Qingyuan; Jia, Junting; Huang, Xing; Yan, Xueqing; Cheng, Liqin; Chen, Shuangyan; Li, Xiaoxia; Peng, Xianjun; Liu, Gongshe

2014-05-26

Many Poaceae species show a gametophytic self-incompatibility (GSI) system, which is controlled by at least two independent and multiallelic loci, S and Z. Until now, the gene products of S and Z have remained unknown. The stigmas of SI grasses discriminate between pollen grains that land on their surface, supporting compatible pollen tube growth and penetration into the stigma while recognizing incompatible pollen and inhibiting its pollination behavior. Leymus chinensis (Trin.) Tzvel. (sheepgrass) is a Poaceae SI species. A comprehensive analysis of the sheepgrass stigma transcriptome may provide valuable information for understanding the mechanism of pollen-stigma interactions and grass SI. The transcript abundance profiles of mature stigmas, mature ovaries and leaves were examined using high-throughput next generation sequencing technology. A comparative transcriptomic analysis of these tissues identified 1,025 specifically or preferentially expressed genes in sheepgrass stigmas. These genes contained a significant proportion of genes predicted to function in cell-cell communication and signal transduction. We identified 111 putative transcription factor (TF) genes, of which the most abundant groups were MYB, C2H2, C3H, FAR1 and MADS. Comparative analysis of the sheepgrass, rice and Arabidopsis stigma-specific or preferential datasets showed broad similarities and some differences in the proportion of genes in the Gene Ontology (GO) functional categories. Potential SI candidate genes identified in other grasses were also detected in the sheepgrass stigma-specific or preferential dataset. Quantitative real-time PCR experiments validated the expression pattern of stigma preferential genes including homologous grass SI candidate genes. This study represents the first large-scale investigation of gene expression in the stigmas of an SI grass species. We uncovered many notable genes that are potentially involved in pollen-stigma interactions and SI mechanisms, including genes encoding receptor-like protein kinases (RLK), CBL (calcineurin B-like proteins) interacting protein kinases, calcium-dependent protein kinase, expansins, pectinesterase, peroxidases and various transcription factors. The availability of a pool of stigma-specific or preferential genes for L. chinensis offers an opportunity to elucidate the mechanisms of SI in Poaceae.
Understanding and utilising mammalian venom via a platypus venom transcriptome.

PubMed

Whittington, Camilla M; Koh, Jennifer M S; Warren, Wesley C; Papenfuss, Anthony T; Torres, Allan M; Kuchel, Philip W; Belov, Katherine

2009-03-06

Only five mammalian species are known to be venomous, and while a large amount of research has been carried out on reptile venom, mammalian venom has been poorly studied to date. Here we describe the status of current research into the venom of the platypus, a semi-aquatic egg-laying Australian mammal, and discuss our approach to platypus venom transcriptomics. We propose that such construction and analysis of mammalian venom transcriptomes from small samples of venom gland, in tandem with proteomics studies, will allow the identification of the full range of mammalian venom components. Functional studies and pharmacological evaluation of the identified toxins will then lay the foundations for the future development of novel biomedical substances. A large range of useful molecules have already been identified in snake venom, and many of these are currently in use in human medicine. It is therefore hoped that this basic research to identify the constituents of platypus venom will eventually yield novel drugs and new targets for painkillers.
Determining the optimal number of independent components for reproducible transcriptomic data analysis.

PubMed

Kairov, Ulykbek; Cantini, Laura; Greco, Alessandro; Molkenov, Askhat; Czerwinska, Urszula; Barillot, Emmanuel; Zinovyev, Andrei

2017-09-11

Independent Component Analysis (ICA) is a method that models gene expression data as an action of a set of statistically independent hidden factors. The output of ICA depends on a fundamental parameter: the number of components (factors) to compute. The optimal choice of this parameter, related to determining the effective data dimension, remains an open question in the application of blind source separation techniques to transcriptomic data. Here we address the question of optimizing the number of statistically independent components in the analysis of transcriptomic data for reproducibility of the components in multiple runs of ICA (within the same or within varying effective dimensions) and in multiple independent datasets. To this end, we introduce ranking of independent components based on their stability in multiple ICA computation runs and define a distinguished number of components (Most Stable Transcriptome Dimension, MSTD) corresponding to the point of the qualitative change of the stability profile. Based on a large body of data, we demonstrate that a sufficient number of dimensions is required for biological interpretability of the ICA decomposition and that the most stable components with ranks below MSTD have more chances to be reproduced in independent studies compared to the less stable ones. At the same time, we show that a transcriptomics dataset can be reduced to a relatively high number of dimensions without losing the interpretability of ICA, even though higher dimensions give rise to components driven by small gene sets. We suggest a protocol of ICA application to transcriptomics data with a possibility of prioritizing components with respect to their reproducibility that strengthens the biological interpretation. Computing too few components (much less than MSTD) is not optimal for interpretability of the results. The components ranked within MSTD range have more chances to be reproduced in independent studies.
De Novo Characterization of the Spleen Transcriptome of the Large Yellow Croaker (Pseudosciaena crocea) and Analysis of the Immune Relevant Genes and Pathways Involved in the Antiviral Response

PubMed Central

Ding, Yang; Ao, Jingqun; Hu, Songnian; Chen, Xinhua

2014-01-01

The large yellow croaker (Pseudosciaena crocea) is an economically important marine fish in China. To understand the molecular basis for antiviral defense in this species, we used Illumina paired-end sequencing to characterize the spleen transcriptome of polyriboinosinic:polyribocytidylic acid [poly(I:C)]-induced large yellow croakers. The library produced 56,355,728 reads, which were assembled into 108,237 contigs. As a result, 15,192 unigenes were found in this transcriptome. Gene ontology analysis showed that 4,759 genes were involved in three major functional categories: biological process, cellular component, and molecular function. We further ascertained that numerous consensus sequences were homologous to known immune-relevant genes. Kyoto Encyclopedia of Genes and Genomes orthology mapping annotated 5,389 unigenes and identified numerous immune-relevant pathways. These immune-relevant genes and pathways revealed major antiviral immunity effectors, including but not limited to: pattern recognition receptors, adaptors and signal transducers, the interferons and interferon-stimulated genes, inflammatory cytokines and receptors, complement components, and B-cell and T-cell antigen activation molecules. Moreover, real-time polymerase chain reaction (PCR) analysis showed that the expression of some genes of the Toll-like receptor signaling pathway, RIG-I-like receptor signaling pathway, Janus kinase-Signal Transducer and Activator of Transcription (JAK-STAT) signaling pathway, and T-cell receptor (TCR) signaling pathway changed after poly(I:C) induction, suggesting that these signaling pathways may be regulated by poly(I:C), a viral mimic. Overall, the antivirus-related genes and signaling pathways that were identified in response to poly(I:C) challenge provide valuable leads for further investigation of the antiviral defense mechanism in the large yellow croaker. PMID:24820969
The prediction of a pathogenesis-related secretome of Puccinia helianthi through high-throughput transcriptome analysis.

PubMed

Jing, Lan; Guo, Dandan; Hu, Wenjie; Niu, Xiaofan

2017-03-11

Many plant pathogen secretory proteins are known to be elicitors or pathogenic factors, which play an important role in the host-pathogen interaction process. Bioinformatics approaches make possible the large-scale prediction and analysis of secretory proteins from the Puccinia helianthi transcriptome. The internet-based software SignalP v4.1, TargetP v1.01, Big-PI predictor, TMHMM v2.0 and ProtComp v9.0 were utilized to predict the signal peptides and the signal peptide-dependent secreted proteins among the 35,286 ORFs of the P. helianthi transcriptome. 908 ORFs (accounting for 2.6% of the total proteins) were identified as putative secretory proteins containing signal peptides. The length of the majority of proteins ranged from 51 to 300 amino acids (aa), while the signal peptides were from 18 to 20 aa long. Signal peptidase I (SpI) cleavage sites were found in 463 of these putative secretory signal peptides. 55 proteins contained the lipoprotein signal peptide recognition site of signal peptidase II (SpII). Out of 908 secretory proteins, 581 (63.8%) have functions related to signal recognition and transduction, metabolism, transport and catabolism. Additionally, 143 putative secretory proteins were categorized into 27 functional groups based on Gene Ontology terms, including 14 groups in biological process, seven in cellular component, and six in molecular function. Gene ontology analysis of the secretory proteins revealed an enrichment of hydrolase activity. Pathway associations were established for 82 (9.0%) secretory proteins. A number of cell wall degrading enzymes and three homologous proteins specific to Phytophthora sojae effectors were also identified, which may be involved in the pathogenicity of the sunflower rust pathogen. This investigation proposes a new approach for identifying elicitors and pathogenic factors. The eventual identification and characterization of 908 extracellularly secreted proteins will advance our understanding of the molecular mechanisms of interactions between sunflower and rust pathogen and will enhance our ability to intervene in disease states.
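The abstract names the prediction tools but not the rules used to combine their outputs. As a rough illustration only, the sketch below shows one common way such per-ORF predictions might be intersected to shortlist secreted proteins. The record fields, the `is_putative_secreted` rule, and the helper names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class OrfPrediction:
    """Hypothetical per-ORF summary of predictor outputs (field names are assumptions)."""
    orf_id: str
    has_signal_peptide: bool  # e.g. from a SignalP-style predictor
    tm_helices: int           # e.g. from a TMHMM-style predictor
    gpi_anchored: bool        # e.g. from a Big-PI-style predictor
    location: str             # e.g. from a TargetP/ProtComp-style predictor


def is_putative_secreted(p: OrfPrediction) -> bool:
    # Assumed rule: signal peptide present, no transmembrane helix,
    # no GPI anchor, and a predicted extracellular location.
    return (
        p.has_signal_peptide
        and p.tm_helices == 0
        and not p.gpi_anchored
        and p.location == "secreted"
    )


def secretome(predictions):
    """Return the IDs of ORFs that pass every filter."""
    return [p.orf_id for p in predictions if is_putative_secreted(p)]


def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100.0 * part / total, 1)
```

With the counts reported in the abstract, `percent(908, 35286)` reproduces the stated 2.6% share of ORFs predicted to be secretory.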
From cacti to carnivores: Improved phylotranscriptomic sampling and hierarchical homology inference provide further insight into the evolution of Caryophyllales.

PubMed

Walker, Joseph F; Yang, Ya; Feng, Tao; Timoneda, Alfonso; Mikenas, Jessica; Hutchison, Vera; Edwards, Caroline; Wang, Ning; Ahluwalia, Sonia; Olivieri, Julia; Walker-Hale, Nathanael; Majure, Lucas C; Puente, Raúl; Kadereit, Gudrun; Lauterbach, Maximilian; Eggli, Urs; Flores-Olvera, Hilda; Ochoterena, Helga; Brockington, Samuel F; Moore, Michael J; Smith, Stephen A

2018-03-01

The Caryophyllales contain ~12,500 species and are known for their cosmopolitan distribution, convergence of trait evolution, and extreme adaptations. Some relationships within the Caryophyllales, like those of many large plant clades, remain unclear, and phylogenetic studies often recover alternative hypotheses. We explore the utility of broad and dense transcriptome sampling across the order for resolving evolutionary relationships in Caryophyllales. We generated 84 transcriptomes and combined these with 224 publicly available transcriptomes to perform a phylogenomic analysis of Caryophyllales. To overcome the computational challenge of ortholog detection in such a large data set, we developed an approach for clustering gene families that allowed us to analyze >300 transcriptomes and genomes. We then inferred the species relationships using multiple methods and performed gene-tree conflict analyses. Our phylogenetic analyses resolved many clades with strong support, but also showed significant gene-tree discordance. This discordance is not only a common feature of phylogenomic studies, but also represents an opportunity to understand processes that have structured phylogenies. We also found taxon sampling influences species-tree inference, highlighting the importance of more focused studies with additional taxon sampling. Transcriptomes are useful both for species-tree inference and for uncovering evolutionary complexity within lineages. Through analyses of gene-tree conflict and multiple methods of species-tree inference, we demonstrate that phylogenomic data can provide unparalleled insight into the evolutionary history of Caryophyllales. We also discuss a method for overcoming computational challenges associated with homolog clustering in large data sets. © 2018 The Authors. American Journal of Botany is published by Wiley Periodicals, Inc. on behalf of the Botanical Society of America.
pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

PubMed

Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

2013-08-01

With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichment analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.
Laser assisted microdissection, an efficient technique to understand tissue specific gene expression patterns and functional genomics in plants.

PubMed

Gautam, Vibhav; Sarkar, Ananda K

2015-04-01

Laser assisted microdissection (LAM) is an advanced technology used to perform tissue or cell-specific expression profiling of genes and proteins, owing to its ability to isolate the desired tissue or cell type from a heterogeneous population. Due to the specificity and high efficiency acquired during its pioneering use in medical science, the LAM technique has quickly been adopted in many areas of biological research. Today, it has become a potent tool to address a wide range of questions in diverse fields of plant biology. Beginning with comparative transcriptome analysis of different tissues such as reproductive parts, meristems, lateral organs, roots etc., LAM has also been extensively used in plant-pathogen interaction studies, proteomics, and metabolomics. In combination with next generation sequencing and proteomics analysis, LAM has opened up promising opportunities in the area of large-scale functional studies in plants. Ever since the advent of this technique, significant improvements have been achieved in terms of its instrumentation and method, which have made LAM a more efficient tool applicable in wider research areas. Here, we discuss the advancement of the LAM technique with special emphasis on its methodology and highlight its scope in modern research areas of plant biology. Although we put emphasis on the use of LAM in transcriptome studies, which is its most common application, we also discuss its recent application and scope in proteome and metabolome studies.
Large-scale transcriptome characterization and mass discovery of SNPs in globe artichoke and its related taxa.

PubMed

Scaglione, Davide; Lanteri, Sergio; Acquadro, Alberto; Lai, Zhao; Knapp, Steven J; Rieseberg, Loren; Portis, Ezio

2012-10-01

Cynara cardunculus (2n = 2× = 34) is a member of the Asteraceae family that contributes significantly to the agricultural economy of the Mediterranean basin. The species includes two cultivated varieties, globe artichoke and cardoon, which are grown mainly for food. Cynara cardunculus is an orphan crop species whose genome/transcriptome has been relatively unexplored, especially in comparison to other Asteraceae crops. Hence, there is a significant need to improve its genomic resources through the identification of novel genes and sequence-based markers, to design new breeding schemes aimed at increasing quality and crop productivity. We report the outcome of cDNA sequencing and assembly for eleven accessions of C. cardunculus. Sequencing of three mapping parental genotypes using Roche 454-Titanium technology generated 1.7 × 10⁶ reads, which were assembled into 38,726 reference transcripts covering 32 Mbp. Putative enzyme-encoding genes were annotated using the KEGG-database. Transcription factors and candidate resistance genes were surveyed as well. Paired-end sequencing was done for cDNA libraries of eight other representative C. cardunculus accessions on an Illumina Genome Analyzer IIx, generating 46 × 10⁶ reads. Alignment of the IGA and 454 reads to reference transcripts led to the identification of 195,400 SNPs with a Bayesian probability exceeding 95%; a validation rate of 90% was obtained by Sanger-sequencing of a subset of contigs. These results demonstrate that the integration of data from different NGS platforms enables large-scale transcriptome characterization, along with massive SNP discovery. This information will contribute to the dissection of key agricultural traits in C. cardunculus and facilitate the implementation of marker-assisted selection programs. © 2012 The Authors. Plant Biotechnology Journal © 2012 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Herbivory-induced changes in the small-RNA transcriptome and phytohormone signaling in Nicotiana attenuata

PubMed Central

Pandey, Shree P.; Shahi, Priyanka; Gase, Klaus; Baldwin, Ian T.

2008-01-01

Phytohormones mediate the perception of insect-specific signals and the elicitation of defenses during insect attack. Large-scale changes in a plant's transcriptome ensue, but how these changes are regulated remains unknown. Silencing of RNA-directed RNA polymerase 1 (RdR1) makes Nicotiana attenuata highly susceptible to insect herbivores, suggesting that defense elicitation is under the direct control of small-RNAs (smRNAs). Using 454-sequencing, we characterized N. attenuata's smRNA transcriptome before and after insect-specific elicitation in wild-type (WT) and RdR1-silenced (irRdR1) plants. We predicted the targets of N. attenuata smRNAs in the genes related to phytohormone signaling (jasmonic acid, JA-Ile, and ethylene) known to mediate resistance responses, and we measured the elicited dynamics of phytohormone biosynthetic transcripts and phytohormone levels in time-course experiments with field- and glasshouse-grown plants. RdR1 silencing severely altered the induced transcript accumulation of 8 of the 10 genes, reduced JA, and enhanced ethylene levels after elicitation. Adding JA completely restored the insect resistance of irRdR1 plants. irRdR1 plants had photosynthetic rates, growth, and reproductive output indistinguishable from that of WT plants, suggesting unaltered primary metabolism. We conclude that the susceptibility of irRdR1 plants to herbivores is due to altered phytohormone signaling and that smRNAs play a central role in coordinating the large-scale transcriptional changes that occur after herbivore attack. Given the diversity of smRNAs that are elicited after insect attack and the recent demonstration of the ability of ingested smRNAs to silence transcript accumulation in lepidopteran larvae midguts, the smRNA responses of plants may also function as direct defenses. PMID:18339806
Collection and processing of lymph nodes from large animals for RNA analysis: preparing for lymph node transcriptomic studies of large animal species

USDA-ARS's Scientific Manuscript database

Large animals (both livestock and wildlife) serve as important reservoirs of zoonotic pathogens, including Brucella, Salmonella, and E. coli, as well as useful models for the study of pathogenesis and/or spread of the bacteria in non-murine hosts. With the key function of lymph nodes in the host imm...
Analyses of transcriptome sequences reveal multiple ancient large-scale duplication events in the ancestor of Sphagnopsida (Bryophyta).

PubMed

Devos, Nicolas; Szövényi, Péter; Weston, David J; Rothfels, Carl J; Johnson, Matthew G; Shaw, A Jonathan

2016-07-01

The goal of this research was to investigate whether there has been a whole-genome duplication (WGD) in the ancestry of Sphagnum (peatmoss) or the class Sphagnopsida, and to determine if the timing of any such duplication(s) and patterns of paralog retention could help explain the rapid radiation and current ecological dominance of peatmosses. RNA sequencing (RNA-seq) data were generated for nine taxa in Sphagnopsida (Bryophyta). Analyses of frequency plots for synonymous substitutions per synonymous site (Ks) between paralogous gene pairs and reconciliation of 578 gene trees were conducted to assess evidence of large-scale or genome-wide duplication events in each transcriptome. Both Ks frequency plots and gene tree-based analyses indicate multiple duplication events in the history of the Sphagnopsida. The most recent WGD event predates divergence of Sphagnum from the two other genera of Sphagnopsida. Duplicate retention is highly variable across species, which might be best explained by local adaptation. Our analyses indicate that the last WGD could have been an important factor underlying the diversification of peatmosses and facilitated their rise to ecological dominance in peatlands. The timing of the duplication events and their significance in the evolutionary history of peat mosses are discussed. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
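Peaks in the distribution of Ks values between paralog pairs are commonly read as signatures of past large-scale duplications. The sketch below illustrates the general idea of a Ks frequency plot by binning Ks values and reporting bins that are local maxima. The bin width, the Ks cutoff, and the function names are illustrative assumptions, not the settings or code used in the study.

```python
def ks_histogram(ks_values, bin_width=0.1, max_ks=2.0):
    """Count Ks values per bin over [0, max_ks); values outside the range are ignored."""
    n_bins = int(round(max_ks / bin_width))
    counts = [0] * n_bins
    for ks in ks_values:
        if 0 <= ks < max_ks:
            # Clamp the index in case of floating-point edge effects near max_ks.
            index = min(int(ks / bin_width), n_bins - 1)
            counts[index] += 1
    return counts


def local_peaks(counts):
    """Indices of bins strictly higher than both neighbouring bins."""
    return [
        i
        for i in range(1, len(counts) - 1)
        if counts[i] > counts[i - 1] and counts[i] > counts[i + 1]
    ]
```

In practice, published analyses usually fit mixture models to the Ks distribution and apply saturation cutoffs rather than relying on raw bin maxima, so this should be read only as a picture of what a "peak" in such a plot means.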
In silico mining and PCR-based approaches to transcription factor discovery in non-model plants: gene discovery of the WRKY transcription factors in conifers.

PubMed

Liu, Jun-Jun; Xiang, Yu

2011-01-01

WRKY transcription factors are key regulators of numerous biological processes in plant growth and development, as well as plant responses to abiotic and biotic stresses. Research on biological functions of plant WRKY genes has focused in the past on model plant species or species with largely characterized transcriptomes. However, a variety of non-model plants, such as forest conifers, are essential as feed, biofuel, and wood or for sustainable ecosystems. Identification of WRKY genes in these non-model plants is equally important for understanding the evolutionary and function-adaptive processes of this transcription factor family. Because of limited genomic information, the rarity of regulatory gene mRNAs in transcriptomes, and the sequence divergence to model organism genes, identification of transcription factors in non-model plants using methods similar to those generally used for model plants is difficult. This chapter describes a gene family discovery strategy for identification of WRKY transcription factors in conifers by a combination of in silico-based prediction and PCR-based experimental approaches. Compared to traditional cDNA library screening or EST sequencing at transcriptome scales, this integrated gene discovery strategy provides fast, simple, reliable, and specific methods to unveil the WRKY gene family at both genome and transcriptome levels in non-model plants.
Identification of the pheromone biosynthesis genes from the sex pheromone gland transcriptome of the diamondback moth, Plutella xylostella.

PubMed

Chen, Da-Song; Dai, Jian-Qing; Han, Shi-Chou

2017-11-24

The diamondback moth (DBM) is estimated to impose increasing costs on the global agricultural economy as the global area of Brassica vegetable crops and oilseed rape increases. Sex pheromone traps have been valuable tools in Integrated Pest Management for many years and provide an effective approach for DBM population monitoring and control. The ratio of two major sex pheromone compounds shows geographical variation. However, limited information on DBM pheromone biosynthesis hampers our understanding of the diversity in the ratio of pheromone compounds. Here, we constructed a transcriptomic library from the DBM pheromone gland and identified genes putatively involved in fatty acid biosynthesis, pheromone functional group transfer, and β-oxidation. In addition, odorant binding protein, chemosensory protein and pheromone binding protein genes encoded in the pheromone gland transcriptome suggest that female DBM moths may receive odors or pheromone compounds via their pheromone gland and ovipositor system. Tissue expression profiles further revealed that two ALR, three DES and one FAR5 genes were pheromone gland tissue biased, while some chemoreception genes were expressed extensively in PG, pupa, antenna and leg tissues. Finally, the candidate genes from large-scale transcriptome information may be useful for characterizing a presumed biosynthetic pathway of the DBM sex pheromone.

PASTA: splice junction identification from RNA-Sequencing data

PubMed Central

2013-01-01

Background: Next generation transcriptome sequencing (RNA-Seq) is emerging as a powerful experimental tool for the study of alternative splicing and its regulation, but requires ad-hoc analysis methods and tools. PASTA (Patterned Alignments for Splicing and Transcriptome Analysis) is a splice junction detection algorithm specifically designed for RNA-Seq data, relying on a highly accurate alignment strategy and on a combination of heuristic and statistical methods to identify exon-intron junctions with high accuracy. Results: Comparisons against TopHat and other splice junction prediction software on real and simulated datasets show that PASTA exhibits high specificity and sensitivity, especially at lower coverage levels. Moreover, PASTA is highly configurable and flexible, and can therefore be applied in a wide range of analysis scenarios: it is able to handle both single-end and paired-end reads, it does not rely on the presence of canonical splicing signals, and it uses organism-specific regression models to accurately identify junctions. Conclusions: PASTA is a highly efficient and sensitive tool to identify splicing junctions from RNA-Seq data. Compared to similar programs, it has the ability to identify a higher number of real splicing junctions, and provides highly annotated output files containing detailed information about their location and characteristics. Accurate junction data in turn facilitates the reconstruction of the splicing isoforms and the analysis of their expression levels, which will be performed by the remaining modules of the PASTA pipeline, still under development. Use of PASTA can therefore enable the large-scale investigation of transcription and alternative splicing. PMID:23557086
Computational analysis of conserved RNA secondary structure in transcriptomes and genomes.

PubMed

Eddy, Sean R

2014-01-01

Transcriptomics experiments and computational predictions both enable systematic discovery of new functional RNAs. However, many putative noncoding transcripts arise instead from artifacts and biological noise, and current computational prediction methods have high false positive rates. I discuss prospects for improving computational methods for analyzing and identifying functional RNAs, with a focus on detecting signatures of conserved RNA secondary structure. An interesting new front is the application of chemical and enzymatic experiments that probe RNA structure on a transcriptome-wide scale. I review several proposed approaches for incorporating structure probing data into the computational prediction of RNA secondary structure. Using probabilistic inference formalisms, I show how all these approaches can be unified in a well-principled framework, which in turn allows RNA probing data to be easily integrated into a wide range of analyses that depend on RNA secondary structure inference. Such analyses include homology search and genome-wide detection of new structural RNAs.
Transcriptomic profiling as a screening tool to detect trenbolone treatment in beef cattle.

PubMed

Pegolo, S; Cannizzo, F T; Biolatti, B; Castagnaro, M; Bargelloni, L

2014-06-01

The effects of steroid hormone implants containing trenbolone alone (Finaplix-H), combined with 17β-oestradiol (17β-E; Revalor-H), or with 17β-E and dexamethasone (Revalor-H plus dexamethasone per os) on the bovine muscle transcriptome were examined by DNA-microarray. Overall, large sets of genes were shown to be modulated by the different growth promoters (GPs) and the regulated pathways and biological processes were mostly shared among the treatment groups. Using the Prediction Analysis of Microarray program, GP-treated animals were accurately identified by a small number of predictive genes. A meta-analysis approach was also carried out for the Revalor group to potentially increase the robustness of class prediction analysis. After data pre-processing, a high level of accuracy (90%) was obtained in the classification of samples, using 105 predictive gene markers. Transcriptomics could thus help in the identification of indirect biomarkers for anabolic treatment in beef cattle to be applied for the screening of muscle samples collected after slaughtering. Copyright © 2014 Elsevier Ltd. All rights reserved.
Computational Selection of Transcriptomics Experiments Improves Guilt-by-Association Analyses

PubMed Central

Bhat, Prajwal; Yang, Haixuan; Bögre, László; Devoto, Alessandra; Paccanaro, Alberto

2012-01-01

The Guilt-by-Association (GBA) principle, according to which genes with similar expression profiles are functionally associated, is widely applied for functional analyses using large heterogeneous collections of transcriptomics data. However, the use of such large collections could hamper GBA functional analysis for genes whose expression is condition specific. In these cases a smaller set of condition related experiments should instead be used, but identifying such functionally relevant experiments from large collections based on literature knowledge alone is an impractical task. We begin this paper by analyzing, both from a mathematical and a biological point of view, why only condition specific experiments should be used in GBA functional analysis. We are able to show that this phenomenon is independent of the functional categorization scheme and of the organisms being analyzed. We then present a semi-supervised algorithm that can select functionally relevant experiments from large collections of transcriptomics experiments. Our algorithm is able to select experiments relevant to a given GO term, MIPS FunCat term or even KEGG pathways. We extensively test our algorithm on large dataset collections for yeast and Arabidopsis. We demonstrate that: using the selected experiments there is a statistically significant improvement in correlation between genes in the functional category of interest; the selected experiments improve GBA-based gene function prediction; the effectiveness of the selected experiments increases with annotation specificity; our algorithm can be successfully applied to GBA-based pathway reconstruction. Importantly, the set of experiments selected by the algorithm reflects the existing literature knowledge about the experiments. [A MATLAB implementation of the algorithm and all the data used in this paper can be downloaded from the paper website: http://www.paccanarolab.org/papers/CorrGene/]. PMID:22879875
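The effect described above, where a condition-specific co-expression signal is diluted when all experiments are pooled, can be illustrated with a toy calculation. The sketch below computes the Pearson correlation between two expression profiles over all experiments and over a chosen subset. It is not the paper's semi-supervised selection algorithm; the function names and the example data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pearson_on_subset(x, y, experiment_indices):
    """Correlation restricted to the experiments listed in `experiment_indices`."""
    sub_x = [x[i] for i in experiment_indices]
    sub_y = [y[i] for i in experiment_indices]
    return pearson(sub_x, sub_y)


# Toy profiles: the two genes co-vary only in experiments 0-3.
gene_a = [1, 2, 3, 4, 5, 5, 5, 5]
gene_b = [2, 4, 6, 8, 3, 7, 2, 6]

r_all = pearson(gene_a, gene_b)
r_selected = pearson_on_subset(gene_a, gene_b, [0, 1, 2, 3])
```

In this made-up example the correlation over the four condition-related experiments is much higher than the correlation over all eight, which is the kind of gain the experiment-selection step is meant to recover.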
Identification of Novel Pax8 Targets in FRTL-5 Thyroid Cells by Gene Silencing and Expression Microarray Analysis

PubMed Central

Di Palma, Tina; Conti, Anna; de Cristofaro, Tiziana; Scala, Serena; Nitsch, Lucio; Zannini, Mariastella

2011-01-01

Background: The differentiation program of thyroid follicular cells (TFCs), by far the most abundant cell population of the thyroid gland, relies on the interplay between sequence-specific transcription factors and transcriptional coregulators with the basal transcriptional machinery of the cell. However, the molecular mechanisms leading to the fully differentiated thyrocyte are still the object of intense study. The transcription factor Pax8, a member of the Paired-box gene family, has been demonstrated to be a critical regulator required for proper development and differentiation of thyroid follicular cells. Although Pax8 is well characterized with respect to its role in regulating genes involved in thyroid differentiation, genomics approaches aiming at the identification of additional Pax8 targets are lacking and the biological pathways controlled by this transcription factor are largely unknown. Methodology/Principal Findings: To identify unique downstream targets of Pax8, we investigated the genome-wide effect of Pax8 silencing by comparing the transcriptome of silenced versus normal differentiated FRTL-5 thyroid cells. In total, 2815 genes were found to be modulated, either induced or repressed, 72 h after Pax8 RNAi. Genes previously reported to be regulated by Pax8 in FRTL-5 cells were confirmed. In addition, novel target genes involved in functional processes such as DNA replication, anion transport, kinase activity, apoptosis and cellular processes were identified. Transcriptome analysis highlighted that Pax8 is a key molecule for thyroid morphogenesis and differentiation. Conclusions/Significance: This is the first large-scale study aimed at the identification of new genes regulated by Pax8, a master regulator of thyroid development and differentiation. The biological pathways and target genes controlled by Pax8 will have considerable importance to understand thyroid disease progression as well as to set up novel therapeutic strategies. PMID:21966443
Large-scale phylogenomic analysis resolves a backbone phylogeny in ferns.

PubMed

Shen, Hui; Jin, Dongmei; Shu, Jiang-Ping; Zhou, Xi-Le; Lei, Ming; Wei, Ran; Shang, Hui; Wei, Hong-Jin; Zhang, Rui; Liu, Li; Gu, Yu-Feng; Zhang, Xian-Chun; Yan, Yue-Hong

2018-02-01

Ferns, which originated about 360 million years ago, are the sister group of seed plants. Despite the remarkable progress in our understanding of fern phylogeny, with conflicting molecular evidence and different morphological interpretations, relationships among major fern lineages remain controversial. With the aim of obtaining a robust fern phylogeny, we carried out a large-scale phylogenomic analysis using high-quality transcriptome sequencing data, which covered 69 fern species from 38 families and 11 orders. Both coalescent-based and concatenation-based methods were applied to both nucleotide and amino acid sequences in species tree estimation. The resulting topologies are largely congruent with each other, except for the placement of Angiopteris fokiensis, Cheiropleuria bicuspis, Diplaziopsis brunoniana, Matteuccia struthiopteris, Elaphoglossum mcclurei, and Tectaria subpedata. Our result confirmed that Equisetales is sister to the rest of ferns, and Dennstaedtiaceae is sister to eupolypods. Moreover, our result strongly supported some relationships different from the current view of fern phylogeny, including that Marattiaceae may be sister to the monophyletic clade of Psilotaceae and Ophioglossaceae; that Gleicheniaceae and Hymenophyllaceae form a monophyletic clade sister to Dipteridaceae; and that Aspleniaceae is sister to the rest of the groups in eupolypods II. These results were interpreted with morphological traits, especially sporangia characters, and a new evolutionary route of sporangial annulus in ferns was suggested. This backbone phylogeny in ferns sets a foundation for further studies in biology and evolution in ferns, and therefore in plants. © The Authors 2017. Published by Oxford University Press.
Large-scale phylogenomic analysis resolves a backbone phylogeny in ferns

PubMed Central

Shen, Hui; Jin, Dongmei; Shu, Jiang-Ping; Zhou, Xi-Le; Lei, Ming; Wei, Ran; Shang, Hui; Wei, Hong-Jin; Zhang, Rui; Liu, Li; Gu, Yu-Feng; Zhang, Xian-Chun; Yan, Yue-Hong

2018-01-01

Background: Ferns, which originated about 360 million years ago, are the sister group of seed plants. Despite the remarkable progress in our understanding of fern phylogeny, with conflicting molecular evidence and different morphological interpretations, relationships among major fern lineages remain controversial. Results: With the aim of obtaining a robust fern phylogeny, we carried out a large-scale phylogenomic analysis using high-quality transcriptome sequencing data, which covered 69 fern species from 38 families and 11 orders. Both coalescent-based and concatenation-based methods were applied to both nucleotide and amino acid sequences in species tree estimation. The resulting topologies are largely congruent with each other, except for the placement of Angiopteris fokiensis, Cheiropleuria bicuspis, Diplaziopsis brunoniana, Matteuccia struthiopteris, Elaphoglossum mcclurei, and Tectaria subpedata. Conclusions: Our result confirmed that Equisetales is sister to the rest of ferns, and Dennstaedtiaceae is sister to eupolypods. Moreover, our result strongly supported some relationships different from the current view of fern phylogeny, including that Marattiaceae may be sister to the monophyletic clade of Psilotaceae and Ophioglossaceae; that Gleicheniaceae and Hymenophyllaceae form a monophyletic clade sister to Dipteridaceae; and that Aspleniaceae is sister to the rest of the groups in eupolypods II. These results were interpreted with morphological traits, especially sporangia characters, and a new evolutionary route of sporangial annulus in ferns was suggested. This backbone phylogeny in ferns sets a foundation for further studies in biology and evolution in ferns, and therefore in plants. PMID:29186447
Transcriptomics of coping strategies in free-swimming Lepeophtheirus salmonis (Copepoda) larvae responding to abiotic stress.

PubMed

Sutherland, Ben J G; Jantzen, Stuart G; Yasuike, Motoshige; Sanderson, Dan S; Koop, Ben F; Jones, Simon R M

2012-12-01

The salmon louse Lepeophtheirus salmonis is a marine ectoparasite of wild and farmed salmon in the Northern Hemisphere. Infections of farmed salmon are of economic and ecological concern. Nauplius and copepodid salmon lice larvae are free-swimming and disperse in the water column until they encounter a host. In this study, we characterized the sublethal stress responses of L. salmonis copepodid larvae by applying a 38K oligonucleotide microarray to profile transcriptomes following 24 h exposures to suboptimal salinity (30-10 parts per thousand (‰)) or temperature (16-4 °C) environments. Hyposalinity exposure resulted in large-scale gene expression changes relative to those elicited by a thermal gradient. Subsequently, transcriptome responses to a more finely resolved salinity gradient between 30 ‰ and 25 ‰ were profiled. Minimal changes occurred at 29 ‰ or 28 ‰, a threshold of response was identified at 27 ‰, and the largest response was at 25 ‰. Differentially expressed genes were clustered by pattern of expression, and clusters were characterized by functional enrichment analysis. Results indicate larval copepods adopt two distinct coping strategies in response to short-term hyposaline stress: a primary response using molecular chaperones and catabolic processes at 27 ‰; and a secondary response up-regulating ion pumps, transporters, a different suite of chaperones and apoptosis-related transcripts at 26 ‰ and 25 ‰. The results further our understanding of the tolerances of L. salmonis copepodids to salinity and temperature gradients and may assist in the development of salmon louse management strategies. © 2012 Blackwell Publishing Ltd.
Combined Large-Scale Phenotyping and Transcriptomics in Maize Reveals a Robust Growth Regulatory Network

PubMed Central

Herman, Dorota; Slabbinck, Bram; Pè, Mario Enrico

2016-01-01

Leaves are vital organs for biomass and seed production because of their role in the generation of metabolic energy and organic compounds. A better understanding of the molecular networks underlying leaf development is crucial to sustain global requirements for food and renewable energy. Here, we combined transcriptome profiling of proliferative leaf tissue with in-depth phenotyping of the fourth leaf at later stages of development in 197 recombinant inbred lines of two different maize (Zea mays) populations. Previously, correlation analysis in a classical biparental mapping population identified 1,740 genes correlated with at least one of 14 traits. Here, we extended these results with data from a multiparent advanced generation intercross population. As expected, the phenotypic variability was found to be larger in the latter population than in the biparental population, although general conclusions on the correlations among the traits are comparable. Data integration from the two diverse populations allowed us to identify a set of 226 genes that are robustly associated with diverse leaf traits. This set of genes is enriched for transcriptional regulators and genes involved in protein synthesis and cell wall metabolism. In order to investigate the molecular network context of the candidate gene set, we integrated our data with publicly available functional genomics data and identified a growth regulatory network of 185 genes. Our results illustrate the power of combining in-depth phenotyping with transcriptomics in mapping populations to dissect the genetic control of complex traits and present a set of candidate genes for use in biomass improvement. PMID:26754667
Transcriptome Analysis of Green Peach Aphid (Myzus persicae): Insight into Developmental Regulation and Inter-Species Divergence

PubMed Central

Ji, Rui; Wang, Yujun; Cheng, Yanbin; Zhang, Meiping; Zhang, Hong-Bin; Zhu, Li; Fang, Jichao; Zhu-Salzman, Keyan

2016-01-01

Green peach aphid (Myzus persicae) and pea aphid (Acyrthosiphon pisum) are two phylogenetically closely related agricultural pests. While pea aphid is restricted to Fabaceae, green peach aphid feeds on hundreds of plant species from more than 40 families. Transcriptome comparison could shed light on the genetic factors underlying the difference in host range between the two species. Furthermore, a large-scale study contrasting gene expression between immature nymphs and fully developed adult aphids would fill a previous knowledge gap. Here, we obtained transcriptomic sequences of green peach aphid nymphs and adults, respectively, using Illumina sequencing technology. A total of 2244 genes were found to be differentially expressed between the two developmental stages, many of which were associated with detoxification, hormone production, cuticle formation, metabolism, food digestion, and absorption. When searched against publicly available pea aphid mRNA sequences, 13,752 unigenes were found to have no homologous counterparts. Interestingly, many of these unigenes that could be annotated in other databases were involved in the “xenobiotics biodegradation and metabolism” pathway, suggesting the two aphids differ in their adaptation to secondary metabolites of host plants. Conversely, 3989 orthologous gene pairs between the two species were subjected to calculations of synonymous and nonsynonymous substitutions, and 148 of the genes potentially evolved in response to positive selection. Some of these genes were predicted to be associated with insect-plant interactions. Our study has revealed certain molecular events related to aphid development, and provided some insight into biological variations in two aphid species, possibly as a result of host plant adaptation. PMID:27812361
Combined Large-Scale Phenotyping and Transcriptomics in Maize Reveals a Robust Growth Regulatory Network.

PubMed

Baute, Joke; Herman, Dorota; Coppens, Frederik; De Block, Jolien; Slabbinck, Bram; Dell'Acqua, Matteo; Pè, Mario Enrico; Maere, Steven; Nelissen, Hilde; Inzé, Dirk

2016-03-01

Leaves are vital organs for biomass and seed production because of their role in the generation of metabolic energy and organic compounds. A better understanding of the molecular networks underlying leaf development is crucial to sustain global requirements for food and renewable energy. Here, we combined transcriptome profiling of proliferative leaf tissue with in-depth phenotyping of the fourth leaf at later stages of development in 197 recombinant inbred lines of two different maize (Zea mays) populations. Previously, correlation analysis in a classical biparental mapping population identified 1,740 genes correlated with at least one of 14 traits. Here, we extended these results with data from a multiparent advanced generation intercross population. As expected, the phenotypic variability was found to be larger in the latter population than in the biparental population, although general conclusions on the correlations among the traits are comparable. Data integration from the two diverse populations allowed us to identify a set of 226 genes that are robustly associated with diverse leaf traits. This set of genes is enriched for transcriptional regulators and genes involved in protein synthesis and cell wall metabolism. In order to investigate the molecular network context of the candidate gene set, we integrated our data with publicly available functional genomics data and identified a growth regulatory network of 185 genes. Our results illustrate the power of combining in-depth phenotyping with transcriptomics in mapping populations to dissect the genetic control of complex traits and present a set of candidate genes for use in biomass improvement. © 2016 American Society of Plant Biologists. All Rights Reserved.
PIVOT: platform for interactive analysis and visualization of transcriptomics data.

PubMed

Zhu, Qin; Fisher, Stephen A; Dueck, Hannah; Middleton, Sarah; Khaladkar, Mugdha; Kim, Junhyong

2018-01-05

Many R packages have been developed for transcriptome analysis but their use often requires familiarity with R and integrating results of different packages requires scripts to wrangle the datatypes. Furthermore, exploratory data analyses often generate multiple derived datasets such as data subsets or data transformations, which can be difficult to track. Here we present PIVOT, an R-based platform that wraps open source transcriptome analysis packages with a uniform user interface and graphical data management that allows non-programmers to interactively explore transcriptomics data. PIVOT supports more than 40 popular open source packages for transcriptome analysis and provides an extensive set of tools for statistical data manipulations. A graph-based visual interface is used to represent the links between derived datasets, allowing easy tracking of data versions. PIVOT further supports automatic report generation, publication-quality plots, and program/data state saving, such that all analysis can be saved, shared and reproduced. PIVOT will allow researchers with broad background to easily access sophisticated transcriptome analysis tools and interactively explore transcriptome datasets.
TCW: Transcriptome Computational Workbench

PubMed Central

Soderlund, Carol; Nelson, William; Willer, Mark; Gang, David R.

2013-01-01

Background The analysis of transcriptome data involves many steps and various programs, along with organization of large amounts of data and results. Without a methodical approach for storage, analysis and query, the resulting ad hoc analysis can lead to human error, loss of data and results, inefficient use of time, and lack of verifiability, repeatability, and extensibility. Methodology The Transcriptome Computational Workbench (TCW) provides Java graphical interfaces for methodical analysis for both single and comparative transcriptome data without the use of a reference genome (e.g. for non-model organisms). The singleTCW interface steps the user through importing transcript sequences (e.g. Illumina) or assembling long sequences (e.g. Sanger, 454, transcripts), annotating the sequences, and performing differential expression analysis using published statistical programs in R. The data, metadata, and results are stored in a MySQL database. The multiTCW interface builds a comparison database by importing sequence and annotation from one or more single TCW databases, executes the ESTscan program to translate the sequences into proteins, and then incorporates one or more clusterings, where the clustering options are to execute the orthoMCL program, compute transitive closure, or import clusters. Both singleTCW and multiTCW allow extensive query and display of the results, where singleTCW displays the alignment of annotation hits to transcript sequences, and multiTCW displays multiple transcript alignments with MUSCLE or pairwise alignments. The query programs can be executed on the desktop for fastest analysis, or from the web for sharing the results. Conclusion It is now affordable to buy a multi-processor machine, and easy to install Java and MySQL. By simply downloading the TCW, the user can interactively analyze, query and view their data. The TCW allows in-depth data mining of the results, which can lead to a better understanding of the transcriptome. 
TCW is freely available from www.agcol.arizona.edu/software/tcw. PMID:23874959
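One of the clustering options named in the abstract above is computing the transitive closure of pairwise relationships. The following is a minimal sketch of that general idea, assuming the input is a list of pairwise sequence hits; it is not TCW's own code (TCW is a Java application), and the function name and sequence IDs are hypothetical.

```python
def transitive_closure_clusters(pairs):
    """Group IDs so that any two IDs linked by a chain of pairwise hits share a cluster."""
    parent = {}

    def find(x):
        # Return the representative of x's cluster, registering x on first sight.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for x in list(parent):
        clusters.setdefault(find(x), set()).add(x)
    # Sort members and clusters so the result is deterministic.
    return sorted((sorted(members) for members in clusters.values()), key=lambda c: c[0])


# Hypothetical pairwise hits between transcripts of three datasets (A, B, C).
hits = [("tA1", "tB7"), ("tB7", "tC3"), ("tA9", "tB2")]
clusters = transitive_closure_clusters(hits)
print(clusters)
```

Here tA1 and tC3 end up in the same cluster even though they were never directly paired, because both are linked to tB7.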
TCW: transcriptome computational workbench.

PubMed

Soderlund, Carol; Nelson, William; Willer, Mark; Gang, David R

2013-01-01

The analysis of transcriptome data involves many steps and various programs, along with organization of large amounts of data and results. Without a methodical approach for storage, analysis and query, the resulting ad hoc analysis can lead to human error, loss of data and results, inefficient use of time, and lack of verifiability, repeatability, and extensibility. The Transcriptome Computational Workbench (TCW) provides Java graphical interfaces for methodical analysis for both single and comparative transcriptome data without the use of a reference genome (e.g. for non-model organisms). The singleTCW interface steps the user through importing transcript sequences (e.g. Illumina) or assembling long sequences (e.g. Sanger, 454, transcripts), annotating the sequences, and performing differential expression analysis using published statistical programs in R. The data, metadata, and results are stored in a MySQL database. The multiTCW interface builds a comparison database by importing sequence and annotation from one or more single TCW databases, executes the ESTscan program to translate the sequences into proteins, and then incorporates one or more clusterings, where the clustering options are to execute the orthoMCL program, compute transitive closure, or import clusters. Both singleTCW and multiTCW allow extensive query and display of the results, where singleTCW displays the alignment of annotation hits to transcript sequences, and multiTCW displays multiple transcript alignments with MUSCLE or pairwise alignments. The query programs can be executed on the desktop for fastest analysis, or from the web for sharing the results. It is now affordable to buy a multi-processor machine, and easy to install Java and MySQL. By simply downloading the TCW, the user can interactively analyze, query and view their data. The TCW allows in-depth data mining of the results, which can lead to a better understanding of the transcriptome. 
TCW is freely available from www.agcol.arizona.edu/software/tcw.
Trinity | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

Trinity Cancer Transcriptome Analysis Toolkit (CTAT) including de novo transcriptome assembly with downstream support for expression analysis and focused analyses on cancer transcriptomes, incorporating mutation and fusion transcript discovery, and single cell analysis.
Gene expression analysis of induced pluripotent stem cells from aneuploid chromosomal syndromes

PubMed Central

2013-01-01

Background Human aneuploidy is the leading cause of early pregnancy loss, mental retardation, and multiple congenital anomalies. Due to the high mortality associated with aneuploidy, the pathophysiological mechanisms of aneuploidy syndrome remain largely unknown. Previous studies focused mostly on whether dosage compensation occurs, and the next generation transcriptomics sequencing technology RNA-seq is expected to eventually uncover the mechanisms of gene expression regulation and the related pathological phenotypes in human aneuploidy. Results Using next generation transcriptomics sequencing technology RNA-seq, we profiled the transcriptomes of four human aneuploid induced pluripotent stem cell (iPSC) lines generated from monosomy X (Turner syndrome), trisomy 8 (Warkany syndrome 2), trisomy 13 (Patau syndrome), and partial trisomy 11:22 (Emanuel syndrome) as well as two umbilical cord matrix iPSC lines as euploid controls to examine how phenotypic abnormalities develop with aberrant karyotype. A total of 466 M (50-bp) reads were obtained from the six iPSC lines, and over 13,000 mRNAs were identified by gene annotation. Global analysis of gene expression profiles and functional analysis of differentially expressed (DE) genes were implemented. Over 5000 DE genes were identified between each of the aneuploid iPSC lines and the euploid iPSCs, while 9 enriched KEGG pathways overlapped among the four aneuploid samples. Conclusions Our results demonstrate that the extra or missing chromosome has extensive effects on the whole transcriptome. Functional analysis of differentially expressed genes reveals that the genes most affected in aneuploid individuals are related to central nervous system development and tumorigenesis. PMID:24564826
Early transcriptomic response to Fe supply in Fe-deficient tomato plants is strongly influenced by the nature of the chelating agent.

PubMed

Zamboni, Anita; Zanin, Laura; Tomasi, Nicola; Avesani, Linda; Pinton, Roberto; Varanini, Zeno; Cesco, Stefano

2016-01-07

It is well known that in the rhizosphere soluble Fe sources available for plants are mainly represented by a mixture of complexes between the micronutrient and organic ligands such as carboxylates and phytosiderophores (PS) released by roots, as well as fractions of humified organic matter. The use by roots of these three natural Fe sources (Fe-citrate, Fe-PS and Fe complexed to water-extractable humic substances, Fe-WEHS) has already been studied at the physiological level, but knowledge of the transcriptomic aspects is still lacking. The (59)Fe concentration recorded after 24 h in tissues of Fe-deficient tomato plants supplied with (59)Fe complexed to WEHS reached values about 2 times higher than those measured in response to the supply with Fe-citrate and Fe-PS. However, after 1 h no differences among the three Fe-chelates were observed considering the (59)Fe concentration and the root Fe(III) reduction activity. A large-scale transcriptional analysis of root tissue after 1 h of Fe supply showed that Fe-WEHS modulated only two transcripts, leaving the transcriptome substantially identical to that of Fe-deficient plants. On the other hand, Fe-citrate and Fe-PS affected 728 and 408 transcripts, respectively, of which 289 showed a similar transcriptional behaviour in response to both Fe sources. The root transcriptional response to the Fe supply depends on the nature of chelating agents (WEHS, citrate and PS). The supply of Fe-citrate and Fe-PS showed not only a fast back regulation of molecular mechanisms modulated by Fe deficiency but also specific responses due to the uptake of the chelating molecule. Plants fed with Fe-WEHS did not show relevant changes in the root transcriptome with respect to the Fe-deficient plants, indicating that roots did not sense the restored cellular Fe accumulation.
Genomic structural differences between cattle and river buffalo identified through a combination of genomic and transcriptomic analysis

USDA-ARS's Scientific Manuscript database

Water buffalo (Bubalus bubalis L.) is an important livestock species worldwide. Like many other livestock species, water buffalo lacks high quality and continuous reference genome assembly required for fine-scale comparative genomics studies. In this work, we present a dataset, which characterizes g...
A curated compendium of monocyte transcriptome datasets of relevance to human monocyte immunobiology research

PubMed Central

Rinchai, Darawan; Boughorbel, Sabri; Presnell, Scott; Quinn, Charlie; Chaussabel, Damien

2016-01-01

Systems-scale profiling approaches have become widely used in translational research settings. The resulting accumulation of large-scale datasets in public repositories represents a critical opportunity to promote insight and foster knowledge discovery. However, resources that can serve as an interface between biomedical researchers and such vast and heterogeneous dataset collections are needed in order to fulfill this potential. Recently, we have developed an interactive data browsing and visualization web application, the Gene Expression Browser (GXB). This tool can be used to overlay deep molecular phenotyping data with rich contextual information about analytes, samples and studies along with ancillary clinical or immunological profiling data. In this note, we describe a curated compendium of 93 public datasets generated in the context of human monocyte immunological studies, representing a total of 4,516 transcriptome profiles. Datasets were uploaded to an instance of GXB along with study description and sample annotations. Study samples were arranged in different groups. Ranked gene lists were generated based on relevant group comparisons. This resource is publicly available online at http://monocyte.gxbsidra.org/dm3/landing.gsp. PMID:27158452
Comparative Transcriptome Analysis Identifies Putative Genes Involved in the Biosynthesis of Xanthanolides in Xanthium strumarium L.

PubMed

Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng

2016-01-01

Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides.

Removing Batch Effects from Longitudinal Gene Expression - Quantile Normalization Plus ComBat as Best Approach for Microarray Transcriptome Data

PubMed Central

Müller, Christian; Schillert, Arne; Röthemeier, Caroline; Trégouët, David-Alexandre; Proust, Carole; Binder, Harald; Pfeiffer, Norbert; Beutel, Manfred; Lackner, Karl J.; Schnabel, Renate B.; Tiret, Laurence; Wild, Philipp S.; Blankenberg, Stefan

2016-01-01

Technical variation plays an important role in microarray-based gene expression studies, and batch effects explain a large proportion of this noise. It is therefore mandatory to eliminate technical variation while maintaining biological variability. Several strategies have been proposed for the removal of batch effects, although they have not been evaluated in large-scale longitudinal gene expression data. In this study, we aimed at identifying a suitable method for batch effect removal in a large study of microarray-based longitudinal gene expression. Monocytic gene expression was measured in 1092 participants of the Gutenberg Health Study at baseline and 5-year follow up. Replicates of selected samples were measured at both time points to identify technical variability. Deming regression, Passing-Bablok regression, linear mixed models, non-linear models as well as ReplicateRUV and ComBat were applied to eliminate batch effects between replicates. In a second step, quantile normalization prior to batch effect correction was performed for each method. Technical variation between batches was evaluated by principal component analysis. Associations between body mass index and transcriptomes were calculated before and after batch removal. Results from association analyses were compared to evaluate maintenance of biological variability. Quantile normalization, separately performed in each batch, combined with ComBat successfully reduced batch effects and maintained biological variability. ReplicateRUV performed perfectly in the replicate data subset of the study, but failed when applied to all samples. All other methods did not substantially reduce batch effects in the replicate data subset. Quantile normalization plus ComBat appears to be a valuable approach for batch correction in longitudinal gene expression data. PMID:27272489
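The first step of the approach favoured in the abstract above, quantile normalization performed within a single batch, can be sketched as follows. This is an illustrative, dependency-free version under stated assumptions: samples are given as equal-length lists of expression values, ties are broken by input order, and the ComBat step is not reproduced. The function and variable names are hypothetical and do not come from the study.

```python
def quantile_normalize(samples):
    """Give every sample in one batch the same empirical distribution.

    samples: list of equal-length lists (one list per sample, one value per gene).
    Returns new lists; the inputs are not modified.
    """
    n_genes = len(samples[0])
    if any(len(s) != n_genes for s in samples):
        raise ValueError("all samples must cover the same genes")

    # Reference distribution: mean across samples of the values at each rank.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_genes)
    ]

    normalized = []
    for s in samples:
        # Gene indices ordered from lowest to highest expression in this sample.
        order = sorted(range(n_genes), key=lambda g: s[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]  # replace each value by the reference value of its rank
        normalized.append(out)
    return normalized


# Toy batch: two samples, three genes (made-up values).
batch = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(batch)
print(result)
```

In a workflow like the one described, a function of this kind would be applied to each batch separately before a batch-correction method such as ComBat is run on the combined data.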
The opportunities and challenges of large-scale molecular approaches to songbird neurobiology

PubMed Central

Mello, C.V.; Clayton, D.F.

2014-01-01

High-throughput methods for analyzing genome structure and function are having a large impact in songbird neurobiology. Methods include genome sequencing and annotation, comparative genomics, DNA microarrays and transcriptomics, and the development of a brain atlas of gene expression. Key emerging findings include the identification of complex transcriptional programs active during singing, the robust brain expression of non-coding RNAs, evidence of profound variations in gene expression across brain regions, and the identification of molecular specializations within song production and learning circuits. Current challenges include the statistical analysis of large datasets, effective genome curations, the efficient localization of gene expression changes to specific neuronal circuits and cells, and the dissection of behavioral and environmental factors that influence brain gene expression. The field requires efficient methods for comparisons with organisms like chicken, which offer important anatomical, functional and behavioral contrasts. As sequencing costs plummet, opportunities emerge for comparative approaches that may help reveal evolutionary transitions contributing to vocal learning, social behavior and other properties that make songbirds such compelling research subjects. PMID:25280907
SCPortalen: human and mouse single-cell centric database

PubMed Central

Noguchi, Shuhei; Böttcher, Michael; Hasegawa, Akira; Kouno, Tsukasa; Kato, Sachi; Tada, Yuhki; Ura, Hiroki; Abe, Kuniya; Shin, Jay W; Plessy, Charles; Carninci, Piero

2018-01-01

Published single-cell datasets are rich resources for investigators who want to address questions not originally asked by the creators of the datasets. The single-cell datasets might be obtained by different protocols and diverse analysis strategies. The main challenge in utilizing such single-cell data is how to make the various large-scale datasets comparable and reusable in a different context. To address this issue, we developed the single-cell centric database ‘SCPortalen’ (http://single-cell.clst.riken.jp/). The current version of the database covers human and mouse single-cell transcriptomics datasets that are publicly available from the INSDC sites. The original metadata was manually curated and single-cell samples were annotated with standard ontology terms. Following that, common quality assessment procedures were conducted to check the quality of the raw sequence. Furthermore, primary data processing of the raw data followed by advanced analyses and interpretation have been performed from scratch using our pipeline. In addition to the transcriptomics data, SCPortalen provides access to single-cell image files whenever available. The target users of SCPortalen are all researchers interested in specific cell types or population heterogeneity. Through the web interface of SCPortalen, users can easily search, explore and download the single-cell datasets of interest. PMID:29045713
Transcriptome sequences resolve deep relationships of the grape family.

PubMed

Wen, Jun; Xiong, Zhiqiang; Nie, Ze-Long; Mao, Likai; Zhu, Yabing; Kan, Xian-Zhao; Ickert-Bond, Stefanie M; Gerrath, Jean; Zimmer, Elizabeth A; Fang, Xiao-Dong

2013-01-01

Previous phylogenetic studies of the grape family (Vitaceae) yielded poorly resolved deep relationships, thus impeding our understanding of the evolution of the family. Next-generation sequencing now offers access to protein coding sequences very easily, quickly and cost-effectively. To improve upon earlier work, we extracted 417 orthologous single-copy nuclear genes from the transcriptomes of 15 species of the Vitaceae, covering its phylogenetic diversity. The resulting transcriptome phylogeny provides robust support for the deep relationships, showing the phylogenetic utility of transcriptome data for plants over a time scale at least since the mid-Cretaceous. The pros and cons of transcriptome data for phylogenetic inference in plants are also evaluated.
Transcriptome Analysis of Thapsia laciniata Rouy Provides Insights into Terpenoid Biosynthesis and Diversity in Apiaceae

PubMed Central

Drew, Damian Paul; Dueholm, Bjørn; Weitzel, Corinna; Zhang, Ye; Sensen, Christoph W.; Simonsen, Henrik Toft

2013-01-01

Thapsia laciniata Rouy (Apiaceae) produces irregular and regular sesquiterpenoids with thapsane and guaiene carbon skeletons, as found in other Apiaceae species. A transcriptomic analysis utilizing Illumina next-generation sequencing enabled the identification of novel genes involved in the biosynthesis of terpenoids in Thapsia. From 66.78 million HQ paired-end reads obtained from T. laciniata roots, 64.58 million were assembled into 76,565 contigs (N50: 1261 bp). Seventeen contigs were annotated as terpene synthases and five of these were predicted to be sesquiterpene synthases. Of the 67 contigs annotated as cytochromes P450, 18 of these are part of the CYP71 clade that primarily performs hydroxylations of specialized metabolites. Three contigs annotated as aldehyde dehydrogenases grouped phylogenetically with the characterized ALDH1 from Artemisia annua and three contigs annotated as alcohol dehydrogenases grouped with the recently described ADH1 from A. annua. ALDH1 and ADH1 were characterized as part of the artemisinin biosynthesis. We have produced a comprehensive EST dataset for T. laciniata roots, which contains a large sample of the T. laciniata transcriptome. These transcriptome data provide the foundation for future research into the molecular basis for terpenoid biosynthesis in Thapsia and on the evolution of terpenoids in Apiaceae. PMID:23698765
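The abstract above reports an N50 of 1261 bp for the assembled contigs. As a short illustration of how that statistic is commonly defined (the contig length at which contigs of that length or longer hold at least half of all assembled bases), here is a minimal sketch. The toy lengths are made up and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")


# Toy assembly with five contigs (hypothetical lengths, 1500 bp in total).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # the 500 bp and 400 bp contigs together cover 900 of 1500 bp
```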
Multiplexed transcriptome analysis to detect ALK, ROS1 and RET rearrangements in lung cancer

PubMed Central

Rogers, Toni-Maree; Arnau, Gisela Mir; Ryland, Georgina L.; Huang, Stephen; Lira, Maruja E.; Emmanuel, Yvette; Perez, Omar D.; Irwin, Darryl; Fellowes, Andrew P.; Wong, Stephen Q.; Fox, Stephen B.

2017-01-01

ALK, ROS1 and RET gene fusions are important predictive biomarkers for tyrosine kinase inhibitors in lung cancer. Currently, the gold standard method for gene fusion detection is Fluorescence In Situ Hybridization (FISH) and while highly sensitive and specific, it is also labour intensive, subjective in analysis, and unable to screen large numbers of gene fusions. Recent developments in high-throughput transcriptome-based methods may provide a suitable alternative to FISH as they are compatible with multiplexing and diagnostic workflows. However, the concordance between these different methods compared with FISH has not been evaluated. In this study we compared the results from three transcriptome-based platforms (Nanostring Elements, Agena LungFusion panel and ThermoFisher NGS fusion panel) to those obtained from ALK, ROS1 and RET FISH on 51 clinical specimens. Overall agreement of results ranged from 86–96% depending on the platform used. While all platforms were highly sensitive, both the Agena panel and ThermoFisher NGS fusion panel reported minor fusions that were not detectable by FISH. Our proof-of-principle study illustrates that transcriptome-based analyses are sensitive and robust methods for detecting actionable gene fusions in lung cancer and could provide a robust alternative to FISH testing in the diagnostic setting. PMID:28181564
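The abstract above summarizes platform performance as overall agreement with FISH. A simple way to compute such a figure is the share of specimens on which two methods give the same call. The sketch below assumes that definition, which the abstract does not spell out. The calls are invented for illustration and are not the study's data.

```python
def overall_agreement(platform_calls, fish_calls):
    """Percentage of specimens on which a platform and FISH give the same fusion call."""
    if len(platform_calls) != len(fish_calls) or not fish_calls:
        raise ValueError("need two non-empty call lists of equal length")
    concordant = sum(p == f for p, f in zip(platform_calls, fish_calls))
    return 100.0 * concordant / len(fish_calls)


# Hypothetical calls for five specimens (True = fusion detected).
fish = [True, True, False, False, False]
panel = [True, False, False, False, False]
agreement = overall_agreement(panel, fish)
print(agreement)  # 4 of the 5 calls match
```

Overall agreement alone does not separate missed fusions from extra calls, so sensitivity and specificity would normally be reported alongside it.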
Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.

PubMed

Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong

2014-05-01

We characterized the mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. Blast searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions, respectively. Out of 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed the cp genome of C. sinensis as the closest neighbor of mango. We found 51 short repeats in the mango cp genome that are supposed to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.
Mining genes involved in insecticide resistance of Liposcelis bostrychophila Badonnel by transcriptome and expression profile analysis.

PubMed

Dou, Wei; Shen, Guang-Mao; Niu, Jin-Zhi; Ding, Tian-Bo; Wei, Dan-Dan; Wang, Jin-Jun

2013-01-01

Recent studies indicate that infestations of psocids pose a new risk for global food security. Among the psocids species, Liposcelis bostrychophila Badonnel has gained recognition in importance because of its parthenogenic reproduction, rapid adaptation, and increased worldwide distribution. To date, the molecular data available for L. bostrychophila is largely limited to genes identified through homology. Also, no transcriptome data relevant to psocids infection is available. In this study, we generated de novo assembly of L. bostrychophila transcriptome performed through the short read sequencing technology (Illumina). In a single run, we obtained more than 51 million sequencing reads that were assembled into 60,012 unigenes (mean size = 711 bp) by Trinity. The transcriptome sequences from different developmental stages of L. bostrychophila including egg, nymph and adult were annotated with non-redundant (Nr) protein database, gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). The analysis revealed three major enzyme families involved in insecticide metabolism as differentially expressed in the L. bostrychophila transcriptome. A total of 49 P450-, 31 GST- and 21 CES-specific genes representing the three enzyme families were identified. Besides, 16 transcripts were identified to contain target site sequences of resistance genes. Furthermore, we profiled gene expression patterns upon insecticide (malathion and deltamethrin) exposure using the tag-based digital gene expression (DGE) method. The L. bostrychophila transcriptome and DGE data provide gene expression data that would further our understanding of molecular mechanisms in psocids. In particular, the findings of this investigation will facilitate identification of genes involved in insecticide resistance and designing of new compounds for control of psocids.
Mining Genes Involved in Insecticide Resistance of Liposcelis bostrychophila Badonnel by Transcriptome and Expression Profile Analysis

PubMed Central

Dou, Wei; Shen, Guang-Mao; Niu, Jin-Zhi; Ding, Tian-Bo; Wei, Dan-Dan; Wang, Jin-Jun

2013-01-01

Background Recent studies indicate that infestations of psocids pose a new risk for global food security. Among the psocids species, Liposcelis bostrychophila Badonnel has gained recognition in importance because of its parthenogenic reproduction, rapid adaptation, and increased worldwide distribution. To date, the molecular data available for L. bostrychophila is largely limited to genes identified through homology. Also, no transcriptome data relevant to psocids infection is available. Methodology and Principal Findings In this study, we generated de novo assembly of L. bostrychophila transcriptome performed through the short read sequencing technology (Illumina). In a single run, we obtained more than 51 million sequencing reads that were assembled into 60,012 unigenes (mean size = 711 bp) by Trinity. The transcriptome sequences from different developmental stages of L. bostrychophila including egg, nymph and adult were annotated with non-redundant (Nr) protein database, gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). The analysis revealed three major enzyme families involved in insecticide metabolism as differentially expressed in the L. bostrychophila transcriptome. A total of 49 P450-, 31 GST- and 21 CES-specific genes representing the three enzyme families were identified. Besides, 16 transcripts were identified to contain target site sequences of resistance genes. Furthermore, we profiled gene expression patterns upon insecticide (malathion and deltamethrin) exposure using the tag-based digital gene expression (DGE) method. Conclusion The L. bostrychophila transcriptome and DGE data provide gene expression data that would further our understanding of molecular mechanisms in psocids. In particular, the findings of this investigation will facilitate identification of genes involved in insecticide resistance and designing of new compounds for control of psocids. PMID:24278202
Transcriptome Analysis of Fat Bodies from Two Brown Planthopper (Nilaparvata lugens) Populations with Different Virulence Levels in Rice

PubMed Central

Chen, Hongdan; Lai, Wenxiang; Fu, Qiang; Lou, Yonggen

2014-01-01

Background The brown planthopper (BPH), Nilaparvata lugens (Stål), one of the most serious rice insect pests in Asia, can quickly overcome rice resistance by evolving new virulent populations. The insect fat body plays essential roles in the life cycles of insects and in plant-insect interactions. However, whether differences in fat body transcriptomes exist between insect populations with different virulence levels and whether the transcriptomic differences are related to insect virulence remain largely unknown. Methodology/Principal Findings In this study, we performed transcriptome-wide analyses on the fat bodies of two BPH populations with different virulence levels in rice. The populations were derived from rice variety TN1 (TN1 population) and Mudgo (M population). In total, 33,776 and 32,332 unigenes from the fat bodies of TN1 and M populations, respectively, were generated using Illumina technology. Gene ontology annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology classifications indicated that genes related to metabolism and immunity were significantly active in the fat bodies. In addition, a total of 339 unigenes showed homology to genes of yeast-like symbionts (YLSs) from 12 genera and endosymbiotic bacteria Wolbachia. A comparative analysis of the two transcriptomes generated 7,860 differentially expressed genes. GO annotations and enrichment analysis of KEGG pathways indicated these differentially expressed transcripts might be involved in metabolism and immunity. Finally, 105 differentially expressed genes from YLSs and Wolbachia were identified, genes which might be associated with the formation of different virulent populations. Conclusions/Significance This study was the first to compare the fat-body transcriptomes of two BPH populations having different virulence traits and to find genes that may be related to this difference. 
Our findings provide a molecular resource for future investigations of fat bodies and will be useful in examining the interactions between the fat body and virulence variation in the BPH. PMID:24533099
Transcriptome analysis of fat bodies from two brown planthopper (Nilaparvata lugens) populations with different virulence levels in rice.

PubMed

Yu, Haixin; Ji, Rui; Ye, Wenfeng; Chen, Hongdan; Lai, Wenxiang; Fu, Qiang; Lou, Yonggen

2014-01-01

The brown planthopper (BPH), Nilaparvata lugens (Stål), one of the most serious rice insect pests in Asia, can quickly overcome rice resistance by evolving new virulent populations. The insect fat body plays essential roles in the life cycles of insects and in plant-insect interactions. However, whether differences in fat body transcriptomes exist between insect populations with different virulence levels and whether the transcriptomic differences are related to insect virulence remain largely unknown. In this study, we performed transcriptome-wide analyses on the fat bodies of two BPH populations with different virulence levels in rice. The populations were derived from rice variety TN1 (TN1 population) and Mudgo (M population). In total, 33,776 and 32,332 unigenes from the fat bodies of TN1 and M populations, respectively, were generated using Illumina technology. Gene ontology annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology classifications indicated that genes related to metabolism and immunity were significantly active in the fat bodies. In addition, a total of 339 unigenes showed homology to genes of yeast-like symbionts (YLSs) from 12 genera and endosymbiotic bacteria Wolbachia. A comparative analysis of the two transcriptomes generated 7,860 differentially expressed genes. GO annotations and enrichment analysis of KEGG pathways indicated these differentially expressed transcripts might be involved in metabolism and immunity. Finally, 105 differentially expressed genes from YLSs and Wolbachia were identified, genes which might be associated with the formation of different virulent populations. This study was the first to compare the fat-body transcriptomes of two BPH populations having different virulence traits and to find genes that may be related to this difference. 
Our findings provide a molecular resource for future investigations of fat bodies and will be useful in examining the interactions between the fat body and virulence variation in the BPH.
Unravelling the complexity of microRNA-mediated gene regulation in black pepper (Piper nigrum L.) using high-throughput small RNA profiling.

PubMed

Asha, Srinivasan; Sreekumar, Sweda; Soniya, E V

2016-01-01

Analysis of high-throughput small RNA deep sequencing data, in combination with black pepper transcriptome sequences revealed microRNA-mediated gene regulation in black pepper ( Piper nigrum L.). Black pepper is an important spice crop and its berries are used worldwide as a natural food additive that contributes unique flavour to foods. In the present study to characterize microRNAs from black pepper, we generated a small RNA library from black pepper leaf and sequenced it by Illumina high-throughput sequencing technology. MicroRNAs belonging to a total of 303 conserved miRNA families were identified from the sRNAome data. Subsequent analysis from recently sequenced black pepper transcriptome confirmed precursor sequences of 50 conserved miRNAs and four potential novel miRNA candidates. Stem-loop qRT-PCR experiments demonstrated differential expression of eight conserved miRNAs in black pepper. Computational analysis of targets of the miRNAs showed 223 potential black pepper unigene targets that encode diverse transcription factors and enzymes involved in plant development, disease resistance, metabolic and signalling pathways. RLM-RACE experiments further mapped miRNA-mediated cleavage at five of the mRNA targets. In addition, miRNA isoforms corresponding to 18 miRNA families were also identified from black pepper. This study presents the first large-scale identification of microRNAs from black pepper and provides the foundation for the future studies of miRNA-mediated gene regulation of stress responses and diverse metabolic processes in black pepper.
Integration of a constraint-based metabolic model of Brassica napus developing seeds with 13C-metabolic flux analysis

PubMed Central

Hay, Jordan O.; Shi, Hai; Heinzel, Nicolas; Hebbelmann, Inga; Rolletschek, Hardy; Schwender, Jorg

2014-01-01

The use of large-scale or genome-scale metabolic reconstructions for modeling and simulation of plant metabolism and integration of those models with large-scale omics and experimental flux data is becoming increasingly important in plant metabolic research. Here we report an updated version of bna572, a bottom-up reconstruction of oilseed rape (Brassica napus L.; Brassicaceae) developing seeds with emphasis on representation of biomass-component biosynthesis. New features include additional seed-relevant pathways for isoprenoid, sterol, phenylpropanoid, flavonoid, and choline biosynthesis. Being now based on standardized data formats and procedures for model reconstruction, bna572+ is available as a COBRA-compliant Systems Biology Markup Language (SBML) model and conforms to the Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM) standards for annotation of external data resources. Bna572+ contains 966 genes, 671 reactions, and 666 metabolites distributed among 11 subcellular compartments. It is referenced to the Arabidopsis thaliana genome, with gene-protein-reaction (GPR) associations resolving subcellular localization. Detailed mass and charge balancing and confidence scoring were applied to all reactions. Using B. napus seed specific transcriptome data, expression was verified for 78% of bna572+ genes and 97% of reactions. Alongside bna572+ we also present a revised carbon centric model for 13C-Metabolic Flux Analysis (13C-MFA) with all its reactions being referenced to bna572+ based on linear projections. By integration of flux ratio constraints obtained from 13C-MFA and by elimination of infinite flux bounds around thermodynamically infeasible loops based on COBRA loopless methods, we demonstrate improvements in predictive power of Flux Variability Analysis (FVA). Using this combined approach we characterize the difference in metabolic flux of developing seeds of two B. napus genotypes contrasting in starch and oil content. 
PMID:25566296
Profiling the venom gland transcriptomes of Costa Rican snakes by 454 pyrosequencing

PubMed Central

2011-01-01

Background A long term research goal of venomics, of applied importance for improving current antivenom therapy, but also for drug discovery, is to understand the pharmacological potential of venoms. Individually or combined, proteomic and transcriptomic studies have demonstrated their feasibility to explore in depth the molecular diversity of venoms. In the absence of genome sequence, transcriptomes represent also valuable searchable databases for proteomic projects. Results The venom gland transcriptomes of 8 Costa Rican taxa from 5 genera (Crotalus, Bothrops, Atropoides, Cerrophidion, and Bothriechis) of pitvipers were investigated using high-throughput 454 pyrosequencing. 100,394 out of 330,010 masked reads produced significant hits in the available databases. 5,165,220 nucleotides (8.27%) were masked by RepeatMasker, the vast majority of which corresponded to class I (retroelements) and class II (DNA transposons) mobile elements. BLAST hits included 79,991 matches to entries of the taxonomic suborder Serpentes, of which 62,433 displayed similarity to documented venom proteins. Strong discrepancies between the transcriptome-computed and the proteome-gathered toxin compositions were obvious at first sight. Although the reasons underlying this discrepancy are elusive, since no clear trend within or between species is apparent, the data indicate that individual mRNA species may be translationally controlled in a species-dependent manner. The minimum number of genes from each toxin family transcribed into the venom gland transcriptome of each species was calculated from multiple alignments of reads matched to a full-length reference sequence of each toxin family. Reads encoding ORF regions of Kazal-type inhibitor-like proteins were uniquely found in Bothriechis schlegelii and B. lateralis transcriptomes, suggesting a genus-specific recruitment event during the early-Middle Miocene. A transcriptome-based cladogram supports the large divergence between A. mexicanus and A. picadoi, and a closer kinship between A. mexicanus and C. godmani. Conclusions Our comparative next-generation sequencing (NGS) analysis reveals taxon-specific trends governing the formulation of the venom arsenal. Knowledge of the venom proteome provides hints on the translation efficiency of toxin-coding transcripts, contributing thereby to a more accurate interpretation of the transcriptome. The application of NGS to the analysis of snake venom transcriptomes may represent the tool for opening the door to systems venomics. PMID:21605378
Transcriptomic immune response of Tenebrio molitor pupae to parasitization by Scleroderma guani.

PubMed

Zhu, Jia-Ying; Yang, Pu; Zhang, Zhong; Wu, Guo-Xing; Yang, Bin

2013-01-01

Host and parasitoid interaction is one of the most fascinating relationships of insects, and it is currently receiving increasing interest. Understanding the mechanisms evolved by parasitoids to evade or suppress the host immune system is important for dissecting this interaction, but these mechanisms remain poorly known. In order to gain insight into the immune response of Tenebrio molitor to parasitization by Scleroderma guani, the transcriptome of T. molitor pupae was sequenced with a focus on immune-related genes, and non-parasitized and parasitized T. molitor pupae were analyzed by digital gene expression (DGE) analysis with special emphasis on parasitoid-induced immune-related genes using Illumina sequencing. In a single run, 264,698 raw reads were obtained. De novo assembly generated 71,514 unigenes with a mean length of 424 bp. Of those unigenes, 37,373 (52.26%) showed similarity to known proteins in the NCBI nr database. Through in-depth analysis of the transcriptome data, 430 unigenes related to immunity were identified. DGE analysis revealed that parasitization by S. guani had considerable impacts on the transcriptome profile of T. molitor pupae, as indicated by the significant up- or down-regulation of 3,431 parasitism-responsive transcripts. The expression of a total of 74 unigenes involved in the immune response of T. molitor was significantly altered after parasitization. The obtained T. molitor transcriptome, in addition to establishing a fundamental resource for further research on functional genomics, has allowed the discovery of a large group of immune genes that might provide a meaningful framework to better understand the immune response in this species and other beetles. The DGE profiling data provide comprehensive T. molitor immune gene expression information at the transcriptional level following parasitization, and shed valuable light on the molecular understanding of the host-parasitoid interaction.
Transcriptomic analysis of apple fruit ripening and texture attributes

USDA-ARS's Scientific Manuscript database

Molecular events regulating cultivar-specific apple fruit ripening and sensory quality are largely unknown. Such knowledge is essential for genomic-assisted apple breeding and postharvest quality management. The ripening behavior and texture attributes of two apple cultivars, ‘Pink Lady’ and ‘Honey...
The bench scientist's guide to RNA-Seq analysis

USDA-ARS's Scientific Manuscript database

RNA sequencing (RNA-Seq) is emerging as a highly accurate method to quantify transcript abundance. However, analyses of the large data sets obtained by sequencing the entire transcriptome of organisms have generally been performed by bioinformatic specialists. Here we outline a methods strategy desi...
Transcriptome Sequencing and Developmental Regulation of Gene Expression in Anopheles aquasalis

PubMed Central

Silva, Maria C. P.; Lopes, Adriana R.; Barros, Michele S.; Sá-Nunes, Anderson; Kojin, Bianca B.; Carvalho, Eneas; Suesdek, Lincoln; Silva-Neto, Mário Alberto C.; James, Anthony A.; Capurro, Margareth L.

2014-01-01

Background Anopheles aquasalis is a major malaria vector in coastal areas of South and Central America where it breeds preferentially in brackish water. This species is very susceptible to Plasmodium vivax and it has been already incriminated as responsible vector in malaria outbreaks. There has been no high-throughput investigation into the sequencing of An. aquasalis genes, transcripts and proteins despite its epidemiological relevance. Here we describe the sequencing, assembly and annotation of the An. aquasalis transcriptome. Methodology/Principal Findings A total of 419 thousand cDNA sequence reads, encompassing 164 million nucleotides, were assembled in 7544 contigs of ≥2 sequences, and 1999 singletons. The majority of the An. aquasalis transcripts encode proteins with their closest counterparts in another neotropical malaria vector, An. darlingi. Several analyses in different protein databases were used to annotate and predict the putative functions of the deduced An. aquasalis proteins. Larval and adult-specific transcripts were represented by 121 and 424 contig sequences, respectively. Fifty-one transcripts were only detected in blood-fed females. The data also reveal a list of transcripts up- or down-regulated in adult females after a blood meal. Transcripts associated with immunity, signaling networks and blood feeding and digestion are discussed. Conclusions/Significance This study represents the first large-scale effort to sequence the transcriptome of An. aquasalis. It provides valuable information that will facilitate studies on the biology of this species and may lead to novel strategies to reduce malaria transmission on the South American continent. The An. aquasalis transcriptome is accessible at http://exon.niaid.nih.gov/transcriptome/An_aquasalis/Anaquexcel.xlsx. PMID:25033462
mRNA-Seq Analysis of the Pseudoperonospora cubensis Transcriptome During Cucumber (Cucumis sativus L.) Infection

PubMed Central

Hamilton, John P.; Vaillancourt, Brieanne; Buell, C. Robin; Day, Brad

2012-01-01

Pseudoperonospora cubensis, an oomycete, is the causal agent of cucurbit downy mildew, and is responsible for significant losses on cucurbit crops worldwide. While other oomycete plant pathogens have been extensively studied at the molecular level, Ps. cubensis and the molecular basis of its interaction with cucurbit hosts has not been well examined. Here, we present the first large-scale global gene expression analysis of Ps. cubensis infection of a susceptible Cucumis sativus cultivar, ‘Vlaspik’, and identification of genes with putative roles in infection, growth, and pathogenicity. Using high throughput whole transcriptome sequencing, we captured differential expression of 2383 Ps. cubensis genes in sporangia and at 1, 2, 3, 4, 6, and 8 days post-inoculation (dpi). Additionally, comparison of Ps. cubensis expression profiles with expression profiles from an infection time course of the oomycete pathogen Phytophthora infestans on Solanum tuberosum revealed similarities in expression patterns of 1,576–6,806 orthologous genes suggesting a substantial degree of overlap in molecular events in virulence between the biotrophic Ps. cubensis and the hemi-biotrophic P. infestans. Co-expression analyses identified distinct modules of Ps. cubensis genes that were representative of early, intermediate, and late infection stages. Collectively, these expression data have advanced our understanding of key molecular and genetic events in the virulence of Ps. cubensis and thus, provides a foundation for identifying mechanism(s) by which to engineer or effect resistance in the host. PMID:22545137
Differential transcriptome analysis supports Rhodnius montenegrensis and Rhodnius robustus (Hemiptera, Reduviidae, Triatominae) as distinct species.

PubMed

de Carvalho, Danila Blanco; Congrains, Carlos; Chahad-Ehlers, Samira; Pinotti, Heloisa; Brito, Reinaldo Alves de; da Rosa, João Aristeu

2017-01-01

Chagas disease is one of the main parasitic diseases found in Latin America and it is estimated that between six and seven million people are infected worldwide. Its etiologic agent, the protozoan Trypanosoma cruzi, is transmitted by triatomines, some of which from the genus Rhodnius. Twenty species are currently recognized in this genus, including some closely related species with low levels of morphological differentiation, such as Rhodnius montenegrensis and Rhodnius robustus. In order to investigate genetic differences between these two species, we generated large-scale RNA-sequencing data (consisting of four RNA-seq libraries) from the heads and salivary glands of males of R. montenegrensis and R. robustus. Transcriptome assemblies produced for each species resulted in 64,952 contigs for R. montenegrensis and 70,894 contigs for R. robustus, with N50 of approximately 2,100 for both species. SNP calling based on the more complete R. robustus assembly revealed 3,055 fixed interspecific differences and 216 transcripts with high levels of divergence which contained only fixed differences between the two species. A gene ontology enrichment analysis revealed that these highly differentiated transcripts were enriched for eight GO terms related to AP-2 adaptor complex, as well as other interesting genes that could be involved in their differentiation. The results show that R. montenegrensis and R. robustus have a substantial quantity of fixed interspecific polymorphisms, which suggests a high degree of genetic divergence between the two species and likely corroborates the species status of R. montenegrensis.

Identification of Immunity-Related Genes in Dialeurodes citri against Entomopathogenic Fungus Lecanicillium attenuatum by RNA-Seq Analysis.

PubMed

Yu, Shijiang; Ding, Lili; Luo, Ren; Li, Xiaojiao; Yang, Juan; Liu, Haoqiang; Cong, Lin; Ran, Chun

2016-01-01

Dialeurodes citri is a major pest in citrus producing areas, and large-scale outbreaks have occurred increasingly often in recent years. Lecanicillium attenuatum is an important entomopathogenic fungus that can parasitize and kill D. citri. We separated the fungus from corpses of D. citri larvae. However, the sound immune defense system of pests makes infection by an entomopathogenic fungus difficult. Here we used RNA sequencing technology (RNA-Seq) to build a transcriptome database for D. citri and performed digital gene expression profiling to screen genes that act in the immune defense of D. citri larvae infected with a pathogenic fungus. De novo assembly generated 84,733 unigenes with mean length of 772 nt. All unigenes were searched against GO, Nr, Swiss-Prot, COG, and KEGG databases and a total of 28,190 (33.3%) unigenes were annotated. We identified 129 immunity-related unigenes in transcriptome database that were related to pattern recognition receptors, information transduction factors and response factors. From the digital gene expression profile, we identified 441 unigenes that were differentially expressed in D. citri infected with L. attenuatum. Through calculated Log2Ratio values, we identified genes for which fold changes in expression were obvious, including cuticle protein, vitellogenin, cathepsin, prophenoloxidase, clip-domain serine protease, lysozyme, and others. Subsequent quantitative real-time polymerase chain reaction analysis verified the results. The identified genes may serve as target genes for microbial control of D. citri.
Identification of Immunity-Related Genes in Dialeurodes citri against Entomopathogenic Fungus Lecanicillium attenuatum by RNA-Seq Analysis

PubMed Central

Yu, Shijiang; Ding, Lili; Luo, Ren; Li, Xiaojiao; Yang, Juan; Liu, Haoqiang; Cong, Lin; Ran, Chun

2016-01-01

Dialeurodes citri is a major pest in citrus producing areas, and large-scale outbreaks have occurred increasingly often in recent years. Lecanicillium attenuatum is an important entomopathogenic fungus that can parasitize and kill D. citri. We separated the fungus from corpses of D. citri larvae. However, the sound immune defense system of pests makes infection by an entomopathogenic fungus difficult. Here we used RNA sequencing technology (RNA-Seq) to build a transcriptome database for D. citri and performed digital gene expression profiling to screen genes that act in the immune defense of D. citri larvae infected with a pathogenic fungus. De novo assembly generated 84,733 unigenes with mean length of 772 nt. All unigenes were searched against GO, Nr, Swiss-Prot, COG, and KEGG databases and a total of 28,190 (33.3%) unigenes were annotated. We identified 129 immunity-related unigenes in transcriptome database that were related to pattern recognition receptors, information transduction factors and response factors. From the digital gene expression profile, we identified 441 unigenes that were differentially expressed in D. citri infected with L. attenuatum. Through calculated Log2Ratio values, we identified genes for which fold changes in expression were obvious, including cuticle protein, vitellogenin, cathepsin, prophenoloxidase, clip-domain serine protease, lysozyme, and others. Subsequent quantitative real-time polymerase chain reaction analysis verified the results. The identified genes may serve as target genes for microbial control of D. citri. PMID:27644092
Discovering Functions of Unannotated Genes from a Transcriptome Survey of Wild Fungal Isolates

PubMed Central

Ellison, Christopher E.; Kowbel, David; Glass, N. Louise; Taylor, John W.

2014-01-01

ABSTRACT Most fungal genomes are poorly annotated, and many fungal traits of industrial and biomedical relevance are not well suited to classical genetic screens. Assigning genes to phenotypes on a genomic scale thus remains an urgent need in the field. We developed an approach to infer gene function from expression profiles of wild fungal isolates, and we applied our strategy to the filamentous fungus Neurospora crassa. Using transcriptome measurements in 70 strains from two well-defined clades of this microbe, we first identified 2,247 cases in which the expression of an unannotated gene rose and fell across N. crassa strains in parallel with the expression of well-characterized genes. We then used image analysis of hyphal morphologies, quantitative growth assays, and expression profiling to test the functions of four genes predicted from our population analyses. The results revealed two factors that influenced regulation of metabolism of nonpreferred carbon and nitrogen sources, a gene that governed hyphal architecture, and a gene that mediated amino acid starvation resistance. These findings validate the power of our population-transcriptomic approach for inference of novel gene function, and we suggest that this strategy will be of broad utility for genome-scale annotation in many fungal systems. PMID:24692637
A conserved BDNF, glutamate- and GABA-enriched gene module related to human depression identified by coexpression meta-analysis and DNA variant genome-wide association studies.

PubMed

Chang, Lun-Ching; Jamain, Stephane; Lin, Chien-Wei; Rujescu, Dan; Tseng, George C; Sibille, Etienne

2014-01-01

Large scale gene expression (transcriptome) analysis and genome-wide association studies (GWAS) for single nucleotide polymorphisms have generated a considerable amount of gene- and disease-related information, but heterogeneity and various sources of noise have limited the discovery of disease mechanisms. As systematic dataset integration is becoming essential, we developed methods and performed meta-clustering of gene coexpression links in 11 transcriptome studies from postmortem brains of human subjects with major depressive disorder (MDD) and non-psychiatric control subjects. We next sought enrichment in the top 50 meta-analyzed coexpression modules for genes otherwise identified by GWAS for various sets of disorders. One coexpression module of 88 genes was consistently and significantly associated with GWAS for MDD, other neuropsychiatric disorders and brain functions, and for medical illnesses with elevated clinical risk of depression, but not for other diseases. In support of the superior discriminative power of this novel approach, we observed no significant enrichment for GWAS-related genes in coexpression modules extracted from single studies or in meta-modules using gene expression data from non-psychiatric control subjects. Genes in the identified module encode proteins implicated in neuronal signaling and structure, including glutamate metabotropic receptors (GRM1, GRM7), GABA receptors (GABRA2, GABRA4), and neurotrophic and development-related proteins [BDNF, reelin (RELN), Ephrin receptors (EPHA3, EPHA5)]. These results are consistent with the current understanding of molecular mechanisms of MDD and provide a set of putative interacting molecular partners, potentially reflecting components of a functional module across cells and biological pathways that are synchronously recruited in MDD, other brain disorders and MDD-related illnesses. 
Collectively, this study demonstrates the importance of integrating transcriptome data, gene coexpression modules and GWAS results for providing novel and complementary approaches to investigate the molecular pathology of MDD and other complex brain disorders.
Analysis of the Olive Fruit Fly Bactrocera oleae Transcriptome and Phylogenetic Classification of the Major Detoxification Gene Families

PubMed Central

Rombauts, Stephane; Chrisargiris, Antonis; Van Leeuwen, Thomas; Vontas, John

2013-01-01

The olive fruit fly Bactrocera oleae has a unique ability to cope with olive flesh, and is the most destructive pest of olives worldwide. Its control has been largely based on the use of chemical insecticides; however, resistance to several insecticides has evolved. The study of detoxification mechanisms, which allow the olive fruit fly to defend against insecticides and/or phytotoxins possibly present in the mesocarp, has been hampered by the lack of genomic information in this species. In the NCBI database fewer than 1,000 nucleotide sequences have been deposited, with fewer than 10 detoxification gene homologues in total. We used 454 pyrosequencing to produce, for the first time, a large transcriptome dataset for B. oleae. A total of 482,790 reads were assembled into 14,204 contigs. More than 60% of those contigs (8,630) were larger than 500 base pairs, and almost half of them matched genes of the order Diptera. Analysis of the Gene Ontology (GO) distribution of unique contigs suggests that, compared to other insects, the assembly is broadly representative of the B. oleae transcriptome. Furthermore, the transcriptome was found to contain 55 P450-, 43 GST-, 15 CCE- and 18 ABC transporter genes. Several of those detoxification genes may putatively be involved in the ability of the olive fruit fly to deal with xenobiotics, such as plant phytotoxins and insecticides. In summary, our study has generated new data and genomic resources, which will substantially facilitate molecular studies in B. oleae, including elucidation of detoxification mechanisms of xenobiotics, as well as other important aspects of olive fruit fly biology. PMID:23824998
Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric

2010-03-23

Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele present in at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from the assembly of the full catfish EST dataset will greatly benefit the catfish introgression breeding program and whole genome association studies.
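The high-quality SNP criterion stated in this abstract (a contig with at least four sequences, and the minor allele present in at least two of them) can be illustrated with a short filter. This is only a minimal sketch under stated assumptions, not the consortium's actual pipeline: the function name, the parameter names, and the input format (a list of base calls at one contig position, one call per EST) are hypothetical, and the minor allele is taken to be the second most frequent base.

```python
from collections import Counter


def is_high_quality_snp(alleles, min_depth=4, min_minor=2):
    """Return True if a putative SNP position passes the depth and minor-allele test.

    alleles: base calls at one contig position, one per aligned EST
             (hypothetical input format, assumed for illustration).
    min_depth: minimum number of sequences covering the position.
    min_minor: minimum number of sequences carrying the minor allele.
    """
    if len(alleles) < min_depth:
        return False  # too few sequences in the contig at this position
    counts = Counter(alleles).most_common()
    if len(counts) < 2:
        return False  # only one allele observed, so not a SNP
    minor_count = counts[1][1]  # count of the second most frequent allele
    return minor_count >= min_minor


print(is_high_quality_snp(list("AAGG")))  # depth 4, minor allele seen twice
print(is_high_quality_snp(list("AAAG")))  # depth 4, minor allele seen once
```

Requiring the minor allele in at least two sequences is a common way to reduce the chance that a single sequencing error is called as a SNP.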
[Progress in omics research of Aspergillus niger].

PubMed

Sui, Yufei; Ouyang, Liming; Lu, Hongzhong; Zhuang, Yingping; Zhang, Siliang

2016-08-25

Aspergillus niger, an important industrial fermentation strain, is widely applied in the production of organic acids and industrial enzymes. With the development of diverse omics technologies, genome, transcriptome, proteome and metabolome data for A. niger are increasing continuously, heralding an era of big data for research on the A. niger fermentation process. Data analysis ranging from single omics, through the comparison of multiple omics, to the integration of multi-omics based on the genome-scale metabolic network model has greatly extended the systematic understanding of the efficient production mechanism of A. niger. It also provides possibilities for the rational global optimization of strain performance by genetic modification and process regulation. We reviewed and summarized progress in omics research of A. niger, and proposed the development direction of omics research on this cell factory.
Multiple plant hormones and cell wall metabolism regulate apple fruit maturation patterns and texture attributes

USDA-ARS's Scientific Manuscript database

Molecular events regulating apple fruit ripening and sensory quality are largely unknown. Such knowledge is essential for genomic-assisted apple breeding and postharvest quality management. In this study, a parallel transcriptome profile analysis, scanning electron microscopic (SEM) examination and...
Revealing Less Derived Nature of Cartilaginous Fish Genomes with Their Evolutionary Time Scale Inferred with Nuclear Genes

PubMed Central

Renz, Adina J.; Meyer, Axel; Kuraku, Shigehiro

2013-01-01

Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on the Bayesian inference resulted in the mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a lower substitution rate with a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon. PMID:23825540
Revealing less derived nature of cartilaginous fish genomes with their evolutionary time scale inferred with nuclear genes.

PubMed

Renz, Adina J; Meyer, Axel; Kuraku, Shigehiro

2013-01-01

Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on the Bayesian inference resulted in the mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a lower substitution rate with a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon.
Discovery of genes related to insecticide resistance in Bactrocera dorsalis by functional genomic analysis of a de novo assembled transcriptome.

PubMed

Hsu, Ju-Chun; Chien, Ting-Ying; Hu, Chia-Cheng; Chen, Mei-Ju May; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S; Chen, Chien-Yu

2012-01-01

Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST-and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. 
Identified sequence motifs were also analyzed to characterize putative polypeptide translational products and associate them with specific genes and protein functions.
Transcriptome Sequences Resolve Deep Relationships of the Grape Family

PubMed Central

Wen, Jun; Xiong, Zhiqiang; Nie, Ze-Long; Mao, Likai; Zhu, Yabing; Kan, Xian-Zhao; Ickert-Bond, Stefanie M.; Gerrath, Jean; Zimmer, Elizabeth A.; Fang, Xiao-Dong

2013-01-01

Previous phylogenetic studies of the grape family (Vitaceae) yielded poorly resolved deep relationships, thus impeding our understanding of the evolution of the family. Next-generation sequencing now offers access to protein coding sequences very easily, quickly and cost-effectively. To improve upon earlier work, we extracted 417 orthologous single-copy nuclear genes from the transcriptomes of 15 species of the Vitaceae, covering its phylogenetic diversity. The resulting transcriptome phylogeny provides robust support for the deep relationships, showing the phylogenetic utility of transcriptome data for plants over a time scale at least since the mid-Cretaceous. The pros and cons of transcriptome data for phylogenetic inference in plants are also evaluated. PMID:24069307
Transcriptome discovery in non-model wild fish species for the development of quantitative transcript abundance assays

USGS Publications Warehouse

Hahn, Cassidy M.; Iwanowicz, Luke R.; Cornman, Robert S.; Mazik, Patricia M.; Blazer, Vicki S.

2016-01-01

Environmental studies increasingly identify the presence of both contaminants of emerging concern (CECs) and legacy contaminants in aquatic environments; however, the biological effects of these compounds on resident fishes remain largely unknown. High throughput methodologies were employed to establish partial transcriptomes for three wild-caught, non-model fish species: smallmouth bass (Micropterus dolomieu), white sucker (Catostomus commersonii) and brown bullhead (Ameiurus nebulosus). Sequences from these transcriptome databases were utilized in the development of a custom nCounter CodeSet that allowed for direct multiplexed measurement of 50 transcript abundance endpoints in liver tissue. Sequence information was also utilized in the development of quantitative real-time PCR (qPCR) primers. Cross-species hybridization allowed the smallmouth bass nCounter CodeSet to be used for quantitative transcript abundance analysis of an additional non-model species, largemouth bass (Micropterus salmoides). We validated the nCounter analysis data system with qPCR for a subset of genes and confirmed concordant results. Changes in transcript abundance biomarkers between sexes and seasons were evaluated to provide baseline data on transcript modulation for each species of interest.
Overexpression of HvIcy6 in Barley Enhances Resistance against Tetranychus urticae and Entails Partial Transcriptomic Reprogramming.

PubMed

Santamaria, M Estrella; Diaz-Mendoza, Mercedes; Perez-Herguedas, David; Hensel, Goetz; Kumlehn, Jochen; Diaz, Isabel; Martinez, Manuel

2018-03-01

Cystatins have been widely used for pest control against phytophagous species. However, cystatins have not been commonly overexpressed in their cognate plant species to test their pesticide capacity. Since the inhibitory role of barley HvCPI-6 cystatin against the phytophagous mite Tetranychus urticae has been previously demonstrated, the purpose of our study was to determine if barley transgenic lines overexpressing their own HvIcy6 gene were more resistant to infestation by this phytophagous mite. In addition, a transcriptomic analysis was done to find differentially expressed genes between wild-type and transformed barley plants. Barley plants overexpressing the HvIcy6 cystatin gene remained less susceptible to T. urticae attack when compared to wild-type plants, with a significantly smaller damaged foliar area and a lower presence of the mite. Transcriptomic analysis revealed a certain reprogramming of cellular metabolism and a lower expression of several genes related to photosynthetic activity. Therefore, although caution should be taken to rule out potential deleterious pleiotropic effects, cystatins may be used as transgenes with impact on agricultural crops by conferring enhanced levels of resistance to phytophagous pests.
Transcriptomic Analysis of the Adaptation of Listeria monocytogenes to Lagoon and Soil Matrices Associated with a Piggery Environment: Comparison of Expression Profiles

PubMed Central

Vivant, Anne-Laure; Desneux, Jeremy; Pourcher, Anne-Marie; Piveteau, Pascal

2017-01-01

Understanding how Listeria monocytogenes, the causative agent of listeriosis, adapts to the environment is crucial. Adaptation to new matrices requires regulation of gene expression. To determine how the pathogen adapts to lagoon effluent and soil, two matrices where L. monocytogenes has been isolated, we compared the transcriptomes of L. monocytogenes CIP 110868 20 min and 24 h after its transfer to effluent and soil extract. Results showed major variations in the transcriptome of L. monocytogenes in the lagoon effluent but only minor modifications in the soil. In both the lagoon effluent and in the soil, genes involved in mobility and chemotaxis and in the transport of carbohydrates were the most frequently represented in the set of genes with higher transcript levels, and genes with phage-related functions were the most represented in the set of genes with lower transcript levels. A modification of the cell envelope was only found in the lagoon environment. Finally, the differential analysis included a large proportion of regulators, regulons, and ncRNAs. PMID:29018416
Transcriptomic Analysis of the Adaptation of Listeria monocytogenes to Lagoon and Soil Matrices Associated with a Piggery Environment: Comparison of Expression Profiles.

PubMed

Vivant, Anne-Laure; Desneux, Jeremy; Pourcher, Anne-Marie; Piveteau, Pascal

2017-01-01

Understanding how Listeria monocytogenes, the causative agent of listeriosis, adapts to the environment is crucial. Adaptation to new matrices requires regulation of gene expression. To determine how the pathogen adapts to lagoon effluent and soil, two matrices where L. monocytogenes has been isolated, we compared the transcriptomes of L. monocytogenes CIP 110868 20 min and 24 h after its transfer to effluent and soil extract. Results showed major variations in the transcriptome of L. monocytogenes in the lagoon effluent but only minor modifications in the soil. In both the lagoon effluent and in the soil, genes involved in mobility and chemotaxis and in the transport of carbohydrates were the most frequently represented in the set of genes with higher transcript levels, and genes with phage-related functions were the most represented in the set of genes with lower transcript levels. A modification of the cell envelope was only found in the lagoon environment. Finally, the differential analysis included a large proportion of regulators, regulons, and ncRNAs.
Metabolic modeling helps interpret transcriptomic changes during malaria.

PubMed

Tang, Yan; Gupta, Anuj; Garimalla, Swetha; Galinski, Mary R; Styczynski, Mark P; Fonseca, Luis L; Voit, Eberhard O

2018-06-01

Disease represents a specific case of malfunctioning within a complex system. Whereas it is often feasible to observe and possibly treat the symptoms of a disease, it is much more challenging to identify and characterize its molecular root causes. Even in infectious diseases that are caused by a known parasite, it is often impossible to pinpoint exactly which molecular profiles of components or processes are directly or indirectly altered. However, a deep understanding of such profiles is a prerequisite for rational, efficacious treatments. Modern omics methodologies are permitting large-scale scans of some molecular profiles, but these scans often yield results that are not intuitive and difficult to interpret. For instance, the comparison of healthy and diseased transcriptome profiles may point to certain sets of involved genes, but a host of post-transcriptional processes and regulatory mechanisms renders predictions regarding metabolic or physiological consequences of the observed changes in gene expression unreliable. Here we present proof of concept that dynamic models of metabolic pathway systems may offer a tool for interpreting transcriptomic profiles measured during disease. We illustrate this strategy with the interpretation of expression data of genes coding for enzymes associated with purine metabolism. These data were obtained during infections of rhesus macaques (Macaca mulatta) with the malaria parasite Plasmodium cynomolgi or P. coatneyi. The model-based interpretation reveals clear patterns of flux redistribution within the purine pathway that are consistent between the two malaria pathogens and are even reflected in data from humans infected with P. falciparum. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang. Copyright © 2017 Elsevier B.V. All rights reserved.
Transcriptome sequencing and whole genome expression profiling of chrysanthemum under dehydration stress

PubMed Central

2013-01-01

Background: Chrysanthemum is one of the most important ornamental crops in the world and drought stress seriously limits its production and distribution. In order to generate a functional genomics resource and obtain a deeper understanding of the molecular mechanisms regarding chrysanthemum responses to dehydration stress, we performed large-scale transcriptome sequencing of chrysanthemum plants under dehydration stress using the Illumina sequencing technology. Results: Two cDNA libraries constructed from mRNAs of control and dehydration-treated seedlings were sequenced by Illumina technology. A total of more than 100 million reads were generated and de novo assembled into 98,180 unique transcripts which were further extensively annotated by comparing their sequences to different protein databases. Biochemical pathways were predicted from these transcript sequences. Furthermore, we performed gene expression profiling analysis upon dehydration treatment in chrysanthemum and identified 8,558 dehydration-responsive unique transcripts, including 307 transcription factors and 229 protein kinases and many well-known stress responsive genes. Gene ontology (GO) term enrichment and biochemical pathway analyses showed that dehydration stress caused changes in hormone response, secondary and amino acid metabolism, and light and photoperiod response. These findings suggest that drought tolerance of chrysanthemum plants may be related to the regulation of hormone biosynthesis and signaling, reduction of oxidative damage, stabilization of cell proteins and structures, and maintenance of energy and carbon supply. Conclusions: Our transcriptome sequences can provide a valuable resource for chrysanthemum breeding and research and novel insights into chrysanthemum responses to dehydration stress and offer candidate genes or markers that can be used to guide future studies attempting to breed drought tolerant chrysanthemum cultivars. PMID:24074255
A joint analysis of transcriptomic and metabolomic data uncovers enhanced enzyme-metabolite coupling in breast cancer

NASA Astrophysics Data System (ADS)

Auslander, Noam; Yizhak, Keren; Weinstock, Adam; Budhu, Anuradha; Tang, Wei; Wang, Xin Wei; Ambs, Stefan; Ruppin, Eytan

2016-07-01

Disrupted regulation of cellular processes is considered one of the hallmarks of cancer. We analyze metabolomic and transcriptomic profiles jointly collected from breast cancer and hepatocellular carcinoma patients to explore the associations between the expression of metabolic enzymes and the levels of the metabolites participating in the reactions they catalyze. Surprisingly, both breast cancer and hepatocellular tumors exhibit an increase in their gene-metabolites associations compared to noncancerous adjacent tissues. Following, we build predictors of metabolite levels from the expression of the enzyme genes catalyzing them. Applying these predictors to a large cohort of breast cancer samples we find that depleted levels of key cancer-related metabolites including glucose, glycine, serine and acetate are significantly associated with improved patient survival. Thus, we show that the levels of a wide range of metabolites in breast cancer can be successfully predicted from the transcriptome, going beyond the limited set of those measured.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Schwender, Jorg; Konig, Christina; Klapperstuck, Matthias

An attempt has been made to define the extent to which metabolic flux in central plant metabolism is reflected by changes in the transcriptome and metabolome, based on an analysis of in vitro cultured immature embryos of two oilseed rape (Brassica napus) accessions which contrast for seed lipid accumulation. Metabolic flux analysis (MFA) was used to constrain a flux balance metabolic model which included 671 biochemical and transport reactions within the central metabolism. This highly confident flux information was eventually used for comparative analysis of flux vs. transcript (metabolite). Metabolite profiling succeeded in identifying 79 intermediates within the central metabolism, some of which differed quantitatively between the two accessions and displayed a significant shift corresponding to flux. An RNA-Seq based transcriptome analysis revealed a large number of genes which were differentially transcribed in the two accessions, including some enzymes/proteins active in major metabolic pathways. With a few exceptions, differential activity in the major pathways (glycolysis, TCA cycle, amino acid, and fatty acid synthesis) was not reflected in contrasting abundances of the relevant transcripts. The conclusion was that transcript abundance on its own cannot be used to infer metabolic activity/fluxes in central plant metabolism. Lastly, this limitation needs to be borne in mind in evaluating transcriptome data and designing metabolic engineering experiments.

E-Flux2 and SPOT: Validated Methods for Inferring Intracellular Metabolic Flux Distributions from Transcriptomic Data.

PubMed

Kim, Min Kyung; Lane, Anatoliy; Kelley, James J; Lun, Desmond S

2016-01-01

Several methods have been developed to predict system-wide and condition-specific intracellular metabolic fluxes by integrating transcriptomic data with genome-scale metabolic models. While powerful in many settings, existing methods have several shortcomings, and it is unclear which method has the best accuracy in general because of limited validation against experimentally measured intracellular fluxes. We present a general optimization strategy for inferring intracellular metabolic flux distributions from transcriptomic data coupled with genome-scale metabolic reconstructions. It consists of two different template models called DC (determined carbon source model) and AC (all possible carbon sources model) and two different new methods called E-Flux2 (E-Flux method combined with minimization of l2 norm) and SPOT (Simplified Pearson cOrrelation with Transcriptomic data), which can be chosen and combined depending on the availability of knowledge on carbon source or objective function. This enables us to simulate a broad range of experimental conditions. We examined E. coli and S. cerevisiae as representative prokaryotic and eukaryotic microorganisms respectively. The predictive accuracy of our algorithm was validated by calculating the uncentered Pearson correlation between predicted fluxes and measured fluxes. To this end, we compiled 20 experimental conditions (11 in E. coli and 9 in S. cerevisiae), of transcriptome measurements coupled with corresponding central carbon metabolism intracellular flux measurements determined by 13C metabolic flux analysis (13C-MFA), which is the largest dataset assembled to date for the purpose of validating inference methods for predicting intracellular fluxes. In both organisms, our method achieves an average correlation coefficient ranging from 0.59 to 0.87, outperforming a representative sample of competing methods. 
Easy-to-use implementations of E-Flux2 and SPOT are available as part of the open-source package MOST (http://most.ccib.rutgers.edu/). Our method represents a significant advance over existing methods for inferring intracellular metabolic flux from transcriptomic data. It not only achieves higher accuracy, but it also combines into a single method a number of other desirable characteristics including applicability to a wide range of experimental conditions, production of a unique solution, fast running time, and the availability of a user-friendly implementation.
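The validation metric named in this abstract, the uncentered Pearson correlation between predicted and measured fluxes, is the dot product of the two flux vectors divided by the product of their Euclidean norms (no mean subtraction). The following is a minimal sketch of that standard formula only; it is not taken from the MOST package, and the function name and the example flux values are hypothetical.

```python
import math


def uncentered_pearson(predicted, measured):
    """Uncentered Pearson correlation: sum(x*y) / (sqrt(sum(x^2)) * sqrt(sum(y^2))).

    predicted, measured: equal-length sequences of flux values
    (illustrative inputs, not data from the study).
    """
    if len(predicted) != len(measured):
        raise ValueError("flux vectors must have the same length")
    dot = sum(p * m for p, m in zip(predicted, measured))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_m = math.sqrt(sum(m * m for m in measured))
    if norm_p == 0 or norm_m == 0:
        raise ValueError("correlation is undefined for an all-zero vector")
    return dot / (norm_p * norm_m)


# Hypothetical predicted and measured fluxes for three reactions.
print(uncentered_pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Because the means are not subtracted, predictions that are proportional to the measurements score 1 regardless of overall scale, while a constant offset between the two vectors does affect the score.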
A Comparative Analysis of Industrial Escherichia coli K–12 and B Strains in High-Glucose Batch Cultivations on Process-, Transcriptome- and Proteome Level

PubMed Central

Marisch, Karoline; Bayer, Karl; Scharl, Theresa; Mairhofer, Juergen; Krempl, Peter M.; Hummel, Karin; Razzazi-Fazeli, Ebrahim; Striedner, Gerald

2013-01-01

Escherichia coli K–12 and B strains are among the most frequently used bacterial hosts for production of recombinant proteins on an industrial scale. To improve existing processes and to accelerate bioprocess development, we performed a detailed host analysis. We investigated the different behaviors of the E. coli production strains BL21, RV308, and HMS174 in response to high-glucose concentrations. Tightly controlled cultivations were conducted under defined environmental conditions for the in-depth analysis of physiological behavior. In addition to acquisition of standard process parameters, we also used DNA microarray analysis and differential gel electrophoresis (EttanTM DIGE). Batch cultivations showed different yields of the distinct strains for cell dry mass and growth rate, which were highest for BL21. In addition, production of acetate, triggered by excess glucose supply, was much higher for the K–12 strains compared to the B strain. Analysis of transcriptome data showed significant alteration in 347 of 3882 genes common among all three hosts. These differentially expressed genes included, for example, those involved in transport, iron acquisition, and motility. The investigation of proteome patterns additionally revealed a high number of differentially expressed proteins among the investigated hosts. The subsequently selected 38 spots included proteins involved in transport and motility. The results of this comprehensive analysis delivered a full genomic picture of the three investigated strains. Differentially expressed groups for targeted host modification were identified like glucose transport or iron acquisition, enabling potential optimization of strains to improve yield and process quality. Dissimilar growth profiles of the strains confirm different genotypes. Furthermore, distinct transcriptome patterns support differential regulation at the genome level. 
The identified proteins showed high agreement with the transcriptome data and suggest similar regulation within a host at both levels for the identified groups. Such host attributes need to be considered in future process design and operation. PMID:23950949
A comparative analysis of industrial Escherichia coli K-12 and B strains in high-glucose batch cultivations on process-, transcriptome- and proteome level.

PubMed

Marisch, Karoline; Bayer, Karl; Scharl, Theresa; Mairhofer, Juergen; Krempl, Peter M; Hummel, Karin; Razzazi-Fazeli, Ebrahim; Striedner, Gerald

2013-01-01

Escherichia coli K-12 and B strains are among the most frequently used bacterial hosts for production of recombinant proteins on an industrial scale. To improve existing processes and to accelerate bioprocess development, we performed a detailed host analysis. We investigated the different behaviors of the E. coli production strains BL21, RV308, and HMS174 in response to high-glucose concentrations. Tightly controlled cultivations were conducted under defined environmental conditions for the in-depth analysis of physiological behavior. In addition to acquisition of standard process parameters, we also used DNA microarray analysis and differential gel electrophoresis (Ettan(TM) DIGE). Batch cultivations showed different yields of the distinct strains for cell dry mass and growth rate, which were highest for BL21. In addition, production of acetate, triggered by excess glucose supply, was much higher for the K-12 strains compared to the B strain. Analysis of transcriptome data showed significant alteration in 347 of 3882 genes common among all three hosts. These differentially expressed genes included, for example, those involved in transport, iron acquisition, and motility. The investigation of proteome patterns additionally revealed a high number of differentially expressed proteins among the investigated hosts. The subsequently selected 38 spots included proteins involved in transport and motility. The results of this comprehensive analysis delivered a full genomic picture of the three investigated strains. Differentially expressed groups for targeted host modification were identified like glucose transport or iron acquisition, enabling potential optimization of strains to improve yield and process quality. Dissimilar growth profiles of the strains confirm different genotypes. Furthermore, distinct transcriptome patterns support differential regulation at the genome level. 
The identified proteins showed high agreement with the transcriptome data and suggest similar regulation within a host at both levels for the identified groups. Such host attributes need to be considered in future process design and operation.
Comparative Transcriptome Analysis Identifies Putative Genes Involved in the Biosynthesis of Xanthanolides in Xanthium strumarium L.

PubMed Central

Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng

2016-01-01

Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides. PMID:27625674
Transcriptomics-based strain optimization tool for designing secondary metabolite overproducing strains of Streptomyces coelicolor.

PubMed

Kim, Minsuk; Yi, Jeong Sang; Lakshmanan, Meiyappan; Lee, Dong-Yup; Kim, Byung-Gee

2016-03-01

In silico model-driven analysis using a genome-scale model of metabolism (GEM) has been recognized as a promising method for microbial strain improvement. However, most current GEM-based strain design algorithms based on flux balance analysis (FBA) rely heavily on steady-state and optimality assumptions without considering any regulatory information. Thus, their practical usage is quite limited, especially for secondary metabolite overproduction. In this study, we developed a transcriptomics-based strain optimization tool (tSOT) to overcome such limitations by integrating transcriptomic data into a GEM. Initially, we evaluated existing algorithms for integrating transcriptomic data into a GEM using a Streptomyces coelicolor dataset, and identified iMAT as the best, and indeed the only suitable, algorithm for characterizing the secondary metabolism of S. coelicolor. Subsequently, we developed the tSOT platform, in which iMAT is adopted to predict the reaction states, and successfully demonstrated its applicability to secondary metabolite overproduction by designing a strain of S. coelicolor that overproduces actinorhodin (ACT), a polyketide antibiotic. Mutants overexpressing tSOT targets such as ribulose 5-phosphate 3-epimerase and NADP-dependent malic enzyme showed 2- and 1.8-fold increases in ACT production, thereby validating the tSOT prediction. It is expected that tSOT can be used for solving other metabolic engineering problems that could not be addressed by current strain design algorithms, especially secondary metabolite overproduction. © 2015 Wiley Periodicals, Inc.
Gene network-based analysis identifies two potential subtypes of small intestinal neuroendocrine tumors.

PubMed

Kidd, Mark; Modlin, Irvin M; Drozdov, Ignat

2014-07-15

Tumor transcriptomes contain information of critical value to understanding the different capacities of a cell at both a physiological and pathological level. In terms of clinical relevance, they provide information regarding the cellular "toolbox", e.g., pathways associated with malignancy and metastasis or drug dependency. Exploration of this resource can therefore be leveraged as a translational tool to better manage and assess neoplastic behavior. The availability of public genome-wide expression datasets provides an opportunity to reassess neuroendocrine tumors at a more fundamental level. We hypothesized that stringent analysis of expression profiles as well as regulatory networks of the neoplastic cell would provide novel information that facilitates further delineation of the genomic basis of small intestinal neuroendocrine tumors. We re-analyzed two publicly available small intestinal tumor transcriptomes using stringent quality control parameters and network-based approaches and validated expression of core secretory regulatory elements, e.g., CPE, PCSK1, and secretogranins, including genes involved in depolarization, e.g., SCN3A, as well as transcription factors associated with neurodevelopment (NKX2-2, NeuroD1, INSM1) and glucose homeostasis (APLP1). The candidate metastasis-associated transcription factor, ST18, was highly expressed (>14-fold, p < 0.004). Genes previously associated with neoplasia, CEBPA and SDHD, were decreased in expression (-1.5 to -2, p < 0.02). Genomic interrogation indicated that intestinal tumors may consist of two different subtypes, serotonin-producing neoplasms and serotonin/substance P/tachykinin lesions. QPCR validation in an independent dataset (n = 13 neuroendocrine tumors) confirmed up-regulated expression of 87% of genes (13/15).
An integrated cellular transcriptomic analysis of small intestinal neuroendocrine tumors identified that they are regulated at a developmental level, have key activation of hypoxic pathways (a known regulator of malignant stem cell phenotypes) as well as activation of genes involved in apoptosis and proliferation. Further refinement of these analyses by RNAseq studies of large-scale databases will enable definition of individual master regulators and facilitate the development of novel tissue and blood-based tools to better understand, diagnose, and treat tumors.
Transcriptome analysis of Stagonospora nodorum: gene models, effectors, metabolism and pantothenate dispensability

USDA-ARS's Scientific Manuscript database

The wheat pathogen Stagonospora nodorum, causal organism of the wheat disease Stagonospora nodorum blotch, has emerged as a model for the Dothideomycetes, a large fungal taxon that includes many important plant pathogens. The initial annotation of the genome assembly included 16 586 nuclear gene mod...
DTWscore: differential expression and cell clustering analysis for time-series single-cell RNA-seq data.

PubMed

Wang, Zhuo; Jin, Shuilin; Liu, Guiyou; Zhang, Xiurui; Wang, Nan; Wu, Deliang; Hu, Yang; Zhang, Chiping; Jiang, Qinghua; Xu, Li; Wang, Yadong

2017-05-23

The development of single-cell RNA sequencing has enabled profound discoveries in biology, ranging from the dissection of the composition of complex tissues to the identification of novel cell types and dynamics in some specialized cellular environments. However, the large-scale generation of single-cell RNA-seq (scRNA-seq) data collected at multiple time points remains a challenge to the effective measurement of gene expression patterns in transcriptome analysis. We present an algorithm based on the Dynamic Time Warping score (DTWscore) combined with time-series data that enables the detection of gene expression changes across scRNA-seq samples and the recovery of potential cell types from complex mixtures of multiple cell types. The DTWscore successfully classifies cells of different types using the most highly variable genes from time-series scRNA-seq data. The study was confined to methods that are implemented and available within the R framework. Sample datasets and R packages are available at https://github.com/xiaoxiaoxier/DTWscore .
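The abstract does not spell out how the DTWscore itself is computed, so the following is only a minimal sketch of plain dynamic time warping (DTW), the alignment technique the score is named after. The function name `dtw_distance`, the absolute-difference local cost, and the toy expression series are illustrative assumptions and are not taken from the DTWscore package.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    a, b: sequences of numbers (e.g. one gene's expression over time
    in two groups of cells). Local cost is |x - y| (an assumption).
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a only, step in b only, step in both
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical time courses: the second is the first with one time point repeated.
series_a = [1.0, 2.0, 3.0]
series_b = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(series_a, series_b))
```

Two series that differ only by a stretch along the time axis align at zero cost, which is the property that makes DTW attractive for time courses sampled at uneven rates.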
Transcriptome and ultrastructural changes in dystrophic Epidermolysis bullosa resemble skin aging

PubMed Central

Trost, Andrea; Weber, Manuela; Klausegger, Alfred; Gruber, Christina; Bruckner, Daniela; Reitsamer, Herbert A.; Bauer, Johann W.; Breitenbach, Michael

2015-01-01

The aging process of skin has been investigated recently with respect to mitochondrial function and oxidative stress. We have here observed striking phenotypic and clinical similarity between skin aging and recessive dystrophic Epidermolysis bullosa (RDEB), which is caused by recessive mutations in the gene coding for collagen VII, COL7A1. Ultrastructural changes, defects in wound healing, and inflammation markers are in part shared with aged skin. We have here compared the skin transcriptomes of young adults suffering from RDEB with those of sex‐ and age‐matched healthy probands. In parallel we have compared the skin transcriptome of healthy young adults with that of elderly healthy donors. Quite surprisingly, there was a large overlap of the two gene lists that concerned a limited number of functional protein families. Most prominent among the proteins found are a number of proteins of the cornified envelope or proteins mechanistically involved in cornification and other skin proteins. Further, the overlap list contains a large number of genes with a known role in inflammation. We are documenting some of the most prominent ultrastructural and protein changes by immunofluorescence analysis of skin sections from patients, old individuals, and healthy controls. PMID:26143532
Transcriptome and ultrastructural changes in dystrophic Epidermolysis bullosa resemble skin aging.

PubMed

Breitenbach, Jenny S; Rinnerthaler, Mark; Trost, Andrea; Weber, Manuela; Klausegger, Alfred; Gruber, Christina; Bruckner, Daniela; Reitsamer, Herbert A; Bauer, Johann W; Breitenbach, Michael

2015-06-01

The aging process of skin has been investigated recently with respect to mitochondrial function and oxidative stress. We have here observed striking phenotypic and clinical similarity between skin aging and recessive dystrophic Epidermolysis bullosa (RDEB), which is caused by recessive mutations in the gene coding for collagen VII, COL7A1. Ultrastructural changes, defects in wound healing, and inflammation markers are in part shared with aged skin. We have here compared the skin transcriptomes of young adults suffering from RDEB with those of sex- and age-matched healthy probands. In parallel we have compared the skin transcriptome of healthy young adults with that of elderly healthy donors. Quite surprisingly, there was a large overlap of the two gene lists that concerned a limited number of functional protein families. Most prominent among the proteins found are a number of proteins of the cornified envelope or proteins mechanistically involved in cornification and other skin proteins. Further, the overlap list contains a large number of genes with a known role in inflammation. We are documenting some of the most prominent ultrastructural and protein changes by immunofluorescence analysis of skin sections from patients, old individuals, and healthy controls.
Microarray analysis and scale-free gene networks identify candidate regulators in drought-stressed roots of loblolly pine (P. taeda L.)

PubMed Central

2011-01-01

Background: Global transcriptional analysis of loblolly pine (Pinus taeda L.) is challenging due to limited molecular tools. PtGen2, a 26,496 feature cDNA microarray, was fabricated and used to assess drought-induced gene expression in loblolly pine propagule roots. Statistical analysis of differential expression and weighted gene correlation network analysis were used to identify drought-responsive genes and further characterize the molecular basis of drought tolerance in loblolly pine. Results: Microarrays were used to interrogate root cDNA populations obtained from 12 genotype × treatment combinations (four genotypes, three watering regimes). Comparison of drought-stressed roots with roots from the control treatment identified 2445 genes displaying at least a 1.5-fold expression difference (false discovery rate = 0.01). Genes commonly associated with drought response in pine and other plant species, as well as a number of abiotic and biotic stress-related genes, were up-regulated in drought-stressed roots. Only 76 genes were identified as differentially expressed in drought-recovered roots, indicating that the transcript population can return to the pre-drought state within 48 hours. Gene correlation analysis predicts a scale-free network topology and identifies eleven co-expression modules that ranged in size from 34 to 938 members. Network topological parameters identified a number of central nodes (hubs) including those with significant homology (E-values ≤ 2 × 10^-30) to 9-cis-epoxycarotenoid dioxygenase, zeatin O-glucosyltransferase, and ABA-responsive protein. Identified hubs also include genes that have been associated previously with osmotic stress, phytohormones, enzymes that detoxify reactive oxygen species, and several genes of unknown function. Conclusion: PtGen2 was used to evaluate transcriptome responses in loblolly pine and was leveraged to identify 2445 differentially expressed genes responding to severe drought stress in roots.
Many of the genes identified are known to be up-regulated in response to osmotic stress in pine and other plant species and encode proteins involved in both signal transduction and stress tolerance. Gene expression levels returned to control values within a 48-hour recovery period in all but 76 transcripts. Correlation network analysis indicates a scale-free network topology for the pine root transcriptome and identifies central nodes that may serve as drivers of drought-responsive transcriptome dynamics in the roots of loblolly pine. PMID:21609476
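As a rough illustration of how hubs can be ranked in a weighted gene correlation network, the sketch below computes a soft-thresholded adjacency |r|^beta between every pair of genes and sums it per gene to get a connectivity score. This is a simplified sketch, not the authors' pipeline: the function names, the soft-threshold power beta = 6, and the four toy expression profiles are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def connectivity(expr, beta=6):
    """expr: dict mapping gene -> list of expression values across samples.

    Returns gene -> sum over all other genes of |r|^beta, i.e. the weighted
    degree in an unsigned co-expression network. beta=6 is an assumed value.
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            weight = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += weight
            k[h] += weight
    return k


# Hypothetical profiles: g1-g3 are tightly (anti-)correlated, g4 is not.
toy = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1, -1, 1, -1],
}
scores = connectivity(toy)
hubs = sorted(scores, key=scores.get, reverse=True)
print(hubs)
```

Raising |r| to a power keeps strong correlations near 1 while pushing weak ones toward 0, so genes with many strong partners stand out as candidate hubs without a hard correlation cutoff.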
International network of cancer genome projects

PubMed Central

2010-01-01

The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumors from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of over 25,000 cancer genomes at the genomic, epigenomic, and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic influences, define clinically-relevant subtypes for prognosis and therapeutic management, and enable the development of new cancer therapies. PMID:20393554
Resolving Relationships among the Megadiverse Butterflies and Moths with a Novel Pipeline for Anchored Phylogenomics.

PubMed

Breinholt, Jesse W; Earl, Chandra; Lemmon, Alan R; Lemmon, Emily Moriarty; Xiao, Lei; Kawahara, Akito Y

2018-01-01

The advent of next-generation sequencing technology has allowed for the collection of large portions of the genome for phylogenetic analysis. Hybrid enrichment and transcriptomics are two techniques that leverage next-generation sequencing and have shown much promise. However, methods for processing hybrid enrichment data are still limited. We developed a pipeline for anchored hybrid enrichment (AHE) read assembly, orthology determination, contamination screening, and data processing for sequences flanking the target "probe" region. We apply this approach to study the phylogeny of butterflies and moths (Lepidoptera), a megadiverse group of more than 157,000 described species with poorly understood deep-level phylogenetic relationships. We introduce a new, 855 locus AHE kit for Lepidoptera phylogenetics and compare resulting trees to those from transcriptomes. The enrichment kit was designed from existing genomes, transcriptomes, and expressed sequence tags and was used to capture sequence data from 54 species from 23 lepidopteran families. Phylogenies estimated from AHE data were largely congruent with trees generated from transcriptomes, with strong support for relationships at all but the deepest taxonomic levels. We combine AHE and transcriptomic data to generate a new Lepidoptera phylogeny, representing 76 exemplar species in 42 families. The tree provides robust support for many relationships, including those among the seven butterfly families. The addition of AHE data to an existing transcriptomic dataset lowers node support along the Lepidoptera backbone, but firmly places taxa with AHE data on the phylogeny. Combining taxa sequenced for AHE with existing transcriptomes and genomes resulted in a tree with strong support for (Calliduloidea + Gelechioidea + Thyridoidea) + (Papilionoidea + Pyraloidea + Macroheterocera).
To examine the efficacy of AHE at a shallow taxonomic level, phylogenetic analyses were also conducted on a sister group representing a more recent divergence, the Saturniidae and Sphingidae. These analyses utilized sequences from the probe region and data flanking it, nearly doubling the size of the dataset; resulting trees supported new phylogenetic relationships, especially within the Saturniidae and Sphingidae (e.g., Hemarina derived in the latter). We hope that our data processing pipeline, hybrid enrichment gene set, and approach of combining AHE data with transcriptomes will be useful for the broader systematics community. © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
A RNA-Seq Analysis of the Rat Supraoptic Nucleus Transcriptome: Effects of Salt Loading on Gene Expression

PubMed Central

Salinas, Yasmmyn D.; Shi, YiJun; Greenwood, Michael; Hoe, See Ziau; Murphy, David; Gainer, Harold

2015-01-01

Magnocellular neurons (MCNs) in the hypothalamo-neurohypophysial system (HNS) are highly specialized to release large amounts of arginine vasopressin (Avp) or oxytocin (Oxt) into the blood stream and play critical roles in the regulation of body fluid homeostasis. The MCNs are osmosensory neurons and are excited by exposure to hypertonic solutions and inhibited by hypotonic solutions. The MCNs respond to systemic hypertonic and hypotonic stimulation with large changes in the expression of their Avp and Oxt genes, and microarray studies have shown that these osmotic perturbations also cause large changes in global gene expression in the HNS. In this paper, we examine gene expression in the rat supraoptic nucleus (SON) under normosmotic and chronic salt-loading (SL) conditions, for the first time using "new-generation" RNA sequencing (RNA-Seq) methods. We reliably detect 9,709 genes as present in the SON by RNA-Seq, and 552 of these genes were changed in expression as a result of chronic SL. These genes reflect diverse functions, and 42 of these are involved in either transcriptional or translational processes. In addition, we compare the SON transcriptomes resolved by RNA-Seq methods with the SON transcriptomes determined by Affymetrix microarray methods in rats under the same osmotic conditions, and find that there are 6,466 genes present in the SON that are represented in both data sets, although 1,040 of the expressed genes were found only in the microarray data, and 2,762 of the expressed genes are selectively found in the RNA-Seq data and not the microarray data. These data provide the research community a comprehensive view of the transcriptome in the SON under normosmotic conditions and the changes in specific gene expression evoked by salt loading. PMID:25897513
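The RNA-Seq versus microarray comparison described above amounts to a three-way partition of two gene lists: genes called present on both platforms, on RNA-Seq only, and on the microarray only. A minimal sketch of that bookkeeping follows; the function name `compare_platforms` and the gene identifiers are hypothetical placeholders, not data from the study.

```python
def compare_platforms(rnaseq_genes, microarray_genes):
    """Partition detected genes into shared and platform-specific sets."""
    rnaseq, array = set(rnaseq_genes), set(microarray_genes)
    return {
        "both": rnaseq & array,             # detected on both platforms
        "rnaseq_only": rnaseq - array,      # detected by RNA-Seq only
        "microarray_only": array - rnaseq,  # detected by microarray only
    }


# Hypothetical identifiers standing in for real gene IDs.
detected_rnaseq = ["geneA", "geneB", "geneC"]
detected_array = ["geneB", "geneC", "geneD"]
for label, genes in compare_platforms(detected_rnaseq, detected_array).items():
    print(label, len(genes))
```

In practice the two lists must first be mapped to a common identifier scheme, since probe-set IDs and RNA-Seq gene IDs rarely match directly; that mapping step is outside this sketch.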
Bi-directional gene set enrichment and canonical correlation analysis identify key diet-sensitive pathways and biomarkers of metabolic syndrome.

PubMed

Morine, Melissa J; McMonagle, Jolene; Toomey, Sinead; Reynolds, Clare M; Moloney, Aidan P; Gormley, Isobel C; Gaora, Peadar O; Roche, Helen M

2010-10-07

Currently, a number of bioinformatics methods are available to generate appropriate lists of genes from a microarray experiment. While these lists represent an accurate primary analysis of the data, fewer options exist to contextualise those lists. The development and validation of such methods is crucial to the wider application of microarray technology in the clinical setting. Two key challenges in clinical bioinformatics involve appropriate statistical modelling of dynamic transcriptomic changes, and extraction of clinically relevant meaning from very large datasets. Here, we apply an approach to gene set enrichment analysis that allows for detection of bi-directional enrichment within a gene set. Furthermore, we apply canonical correlation analysis and Fisher's exact test, using plasma marker data with known clinical relevance to aid identification of the most important gene and pathway changes in our transcriptomic dataset. After a 28-day dietary intervention with high-CLA beef, a range of plasma markers indicated a marked improvement in the metabolic health of genetically obese mice. Tissue transcriptomic profiles indicated that the effects were most dramatic in liver (1270 genes significantly changed; p < 0.05), followed by muscle (601 genes) and adipose (16 genes). Results from modified GSEA showed that the high-CLA beef diet affected diverse biological processes across the three tissues, and that the majority of pathway changes reached significance only with the bi-directional test. Combining the liver tissue microarray results with plasma marker data revealed 110 CLA-sensitive genes showing strong canonical correlation with one or more plasma markers of metabolic health, and 9 significantly overrepresented pathways among this set; each of these pathways was also significantly changed by the high-CLA diet. 
Closer inspection of two of these pathways--selenoamino acid metabolism and steroid biosynthesis--illustrated clear diet-sensitive changes in constituent genes, as well as strong correlations between gene expression and plasma markers of metabolic syndrome independent of the dietary effect. Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of analysis has the potential to generate novel transcriptome-based biomarkers of disease.
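For readers unfamiliar with how Fisher's exact test is applied to pathway overrepresentation, the one-sided version reduces to the upper tail of a hypergeometric distribution. The sketch below is an assumption-laden illustration: the function name, the argument names, and the example counts are invented here, and the abstract does not state the exact test settings the authors used.

```python
from math import comb


def overrepresentation_p(k, n, K, N):
    """One-sided Fisher's exact (hypergeometric upper-tail) p-value.

    N: genes in the background
    K: background genes annotated to the pathway
    n: genes in the selected list (e.g. diet-sensitive genes)
    k: selected genes that are also in the pathway
    Returns P(X >= k) when n genes are drawn without replacement.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical counts: 10 background genes, 5 in the pathway,
# 5 selected genes, all 5 of which fall in the pathway.
print(overrepresentation_p(k=5, n=5, K=5, N=10))
```

With many pathways tested at once, the resulting p-values would normally be corrected for multiple testing before any pathway is called overrepresented.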
Bi-directional gene set enrichment and canonical correlation analysis identify key diet-sensitive pathways and biomarkers of metabolic syndrome

PubMed Central

2010-01-01

Background: Currently, a number of bioinformatics methods are available to generate appropriate lists of genes from a microarray experiment. While these lists represent an accurate primary analysis of the data, fewer options exist to contextualise those lists. The development and validation of such methods is crucial to the wider application of microarray technology in the clinical setting. Two key challenges in clinical bioinformatics involve appropriate statistical modelling of dynamic transcriptomic changes, and extraction of clinically relevant meaning from very large datasets. Results: Here, we apply an approach to gene set enrichment analysis that allows for detection of bi-directional enrichment within a gene set. Furthermore, we apply canonical correlation analysis and Fisher's exact test, using plasma marker data with known clinical relevance to aid identification of the most important gene and pathway changes in our transcriptomic dataset. After a 28-day dietary intervention with high-CLA beef, a range of plasma markers indicated a marked improvement in the metabolic health of genetically obese mice. Tissue transcriptomic profiles indicated that the effects were most dramatic in liver (1270 genes significantly changed; p < 0.05), followed by muscle (601 genes) and adipose (16 genes). Results from modified GSEA showed that the high-CLA beef diet affected diverse biological processes across the three tissues, and that the majority of pathway changes reached significance only with the bi-directional test. Combining the liver tissue microarray results with plasma marker data revealed 110 CLA-sensitive genes showing strong canonical correlation with one or more plasma markers of metabolic health, and 9 significantly overrepresented pathways among this set; each of these pathways was also significantly changed by the high-CLA diet.
Closer inspection of two of these pathways - selenoamino acid metabolism and steroid biosynthesis - illustrated clear diet-sensitive changes in constituent genes, as well as strong correlations between gene expression and plasma markers of metabolic syndrome independent of the dietary effect. Conclusion: Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of analysis has the potential to generate novel transcriptome-based biomarkers of disease. PMID:20929581
Immune response of the Caribbean sea fan, Gorgonia ventalina, exposed to an Aplanochytrium parasite as revealed by transcriptome sequencing

PubMed Central

Burge, Colleen A.; Mouchka, Morgan E.; Harvell, C. Drew; Roberts, Steven

2013-01-01

Coral reef communities are undergoing marked declines due to a variety of stressors including disease. The sea fan coral, Gorgonia ventalina, is a tractable study system to investigate mechanisms of immunity to a naturally occurring pathogen. Functional studies in Gorgonia ventalina immunity indicate that several key pathways and cellular components are involved in response to natural microbial invaders, although to date the functional and regulatory pathways remain largely un-described. This study used short-read sequencing (Illumina GAIIx) to identify genes involved in the response of G. ventalina to a naturally occurring Aplanochytrium spp. parasite. De novo assembly of the G. ventalina transcriptome yielded 90,230 contigs of which 40,142 were annotated. RNA-Seq analysis revealed 210 differentially expressed genes in sea fans exposed to the Aplanochytrium parasite. Differentially expressed genes involved in immunity include pattern recognition molecules, anti-microbial peptides, and genes involved in wound repair and reactive oxygen species formation. Gene enrichment analysis indicated eight biological processes were enriched representing 36 genes, largely involved with protein translation and energy production. This is the first report using high-throughput sequencing to characterize the host response of a coral to a natural pathogen. Furthermore, we have generated the first transcriptome for a soft (octocoral or non-scleractinian) coral species. Expression analysis revealed genes important in invertebrate innate immune pathways, as well as those whose role is previously un-described in cnidarians. This resource will be valuable in characterizing G. ventalina immune response to infection and co-infection of pathogens in the context of environmental change. PMID:23898300
Transcriptomics as a tool for assessing the scalability of mammalian cell perfusion systems.

PubMed

Jayapal, Karthik P; Goudar, Chetan T

2014-01-01

DNA microarray-based transcriptomics have been used to determine the time course of laboratory and manufacturing-scale perfusion bioreactors in an attempt to characterize cell physiological state at these two bioreactor scales. Given the limited availability of genomic data for baby hamster kidney (BHK) cells, a Chinese hamster ovary (CHO)-based microarray was used following a feasibility assessment of cross-species hybridization. A heat shock experiment was performed using both BHK and CHO cells, and the resulting DNA microarray data were analyzed using the filtering criteria perfect match (PM)/single base mismatch (MM) > 1.5 and PM-MM > 50 to exclude probes with low specificity or sensitivity for cross-species hybridizations. For BHK cells, 8910 probe sets (39%) passed the cutoff criteria, whereas 12,961 probe sets (56%) passed the cutoff criteria for CHO cells. Yet, the data from BHK cells allowed distinct clustering of heat shock and control samples as well as identification of biologically relevant genes as being differentially expressed, indicating the utility of cross-species hybridization. Subsequently, DNA microarray analysis was performed on time course samples from laboratory- and manufacturing-scale perfusion bioreactors that were operated under the same conditions. A majority of the variability (37%) was associated with the first principal component (PC-1). Although PC-1 changed monotonically with culture duration, the trends were very similar in both the laboratory and manufacturing-scale bioreactors. Therefore, despite time-related changes to the cell physiological state, transcriptomic fingerprints were similar across the two bioreactor scales at any given instance in culture. Multiple genes were identified with time-course expression profiles that were very highly correlated (> 0.9) with bioprocess variables of interest.
Although the current incomplete annotation limits the biological interpretation of these observations, their full potential may be realized in due course when richer genomic data become available. By taking a pragmatic approach of transcriptome fingerprinting, we have demonstrated the utility of systems biology to support the comparability of laboratory and manufacturing-scale perfusion systems. Scale-down model qualification is the first step in process characterization and hence is an integral component of robust regulatory filings. Augmenting the current paradigm, which relies primarily on cell culture and product quality information, with gene expression data can help make a substantially stronger case for similarity. With continued advances in systems biology approaches, we expect them to be seamlessly integrated into bioprocess development, which can translate into more robust and high yielding processes that can ultimately reduce cost of care for patients.
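The two probe-filtering cutoffs quoted in the abstract (PM/MM > 1.5 and PM-MM > 50) can be expressed as a single predicate. The sketch below is only an illustration: the function name, the default arguments, and the intensity values are assumptions, and the abstract does not say how per-probe decisions were aggregated to probe sets.

```python
def passes_filter(pm, mm, ratio_cutoff=1.5, diff_cutoff=50.0):
    """Return True if a probe pair meets both cutoffs.

    pm: perfect-match intensity; mm: single-base-mismatch intensity.
    The ratio test PM/MM > cutoff is written as PM > cutoff * MM so that
    MM == 0 does not cause a division error (assumes MM >= 0).
    """
    return pm > ratio_cutoff * mm and (pm - mm) > diff_cutoff


# Hypothetical (PM, MM) intensity pairs.
probes = [(200.0, 100.0), (120.0, 100.0), (60.0, 20.0)]
kept = [p for p in probes if passes_filter(*p)]
print(kept)
```

Requiring both conditions removes probes whose signal is weak in absolute terms as well as probes that barely distinguish the matched from the mismatched target, which is the stated purpose of the filter for cross-species hybridization.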
Novel transcriptome assembly and comparative toxicity pathway analysis in mahi-mahi (Coryphaena hippurus) embryos and larvae exposed to Deepwater Horizon oil

NASA Astrophysics Data System (ADS)

Xu, Elvis Genbo; Mager, Edward M.; Grosell, Martin; Hazard, E. Starr; Hardiman, Gary; Schlenk, Daniel

2017-03-01

The impacts of Deepwater Horizon (DWH) oil on morphology and function during embryonic development have been documented for a number of fish species, including the economically and ecologically important pelagic species, mahi-mahi (Coryphaena hippurus). However, further investigations on molecular events and pathways responsible for developmental toxicity have been largely restricted due to the limited molecular data available for this species. We sought to establish a de novo transcriptomic database from the embryos and larvae of mahi-mahi exposed to high-energy water accommodated fractions (HEWAFs) of two DWH oil types (weathered and source oil), in an effort to advance our understanding of the molecular aspects involved during specific toxicity responses. By high throughput sequencing (HTS), we obtained the first de novo transcriptome of mahi-mahi, with 60,842 assembled transcripts and 30,518 BLAST hits. Among them, 2,345 genes were significantly regulated in 96 hpf larvae after exposure to weathered oil. By comparison with a reference-transcriptome-guided approach on gene ontology and tox-pathways, we confirmed that the novel approach is effective for exploring tox-pathways in non-model species, and also identified a list of co-expressed genes as potential biomarkers. These will provide information for the construction of an Adverse Outcome Pathway, which could be useful in Ecological Risk Assessments.

Expression of a non-coding RNA in ectromelia virus is required for normal plaque formation.

PubMed

Esteban, David J; Upton, Chris; Bartow-McKenney, Casey; Buller, R Mark L; Chen, Nanhai G; Schriewer, Jill; Lefkowitz, Elliot J; Wang, Chunlin

2014-02-01

Poxviruses are dsDNA viruses with large genomes. Many genes in the genome remain uncharacterized, and recent studies have demonstrated that the poxvirus transcriptome includes numerous so-called anomalous transcripts not associated with open reading frames. Here, we characterize the expression and role of an apparently non-coding RNA in orthopoxviruses, which we call viral hairpin RNA (vhRNA). Using a bioinformatics approach, we predicted expression of a transcript not associated with an open reading frame that is likely to form a stem-loop structure due to the presence of a 21 nt palindromic sequence. Expression of the transcript as early as 2 h post-infection was confirmed by northern blot and analysis of publicly available vaccinia virus infected cell transcriptomes. The transcription start site was determined by RACE PCR and transcriptome analysis, and early and late promoter sequences were identified. Finally, to test the function of the transcript we generated an ectromelia virus knockout, which failed to form plaques in cell culture. The important role of the transcript in viral replication was further demonstrated using siRNA. Although the function of the transcript remains unknown, our work contributes to evidence of an increasingly complex poxvirus transcriptome, suggesting that transcripts such as vhRNA not associated with an annotated open reading frame can play an important role in viral replication.
Transcriptome sequencing and de novo analysis of the copepod Calanus sinicus using 454 GS FLX.

PubMed

Ning, Juan; Wang, Minxiao; Li, Chaolun; Sun, Song

2013-01-01

Despite their species abundance and primary economic importance, genomic information about copepods is still limited. In particular, genomic resources are lacking for the copepod Calanus sinicus, which is a dominant species in the coastal waters of East Asia. In this study, we performed de novo transcriptome sequencing to produce a large number of expressed sequence tags for the copepod C. sinicus. Copepodid larvae and adults were used as the basic material for transcriptome sequencing. Using 454 pyrosequencing, a total of 1,470,799 reads were obtained, which were assembled into 56,809 high quality expressed sequence tags. Based on their sequence similarity to known proteins, about 14,000 different genes were identified, including members of all major conserved signaling pathways. Transcripts that were putatively involved with growth, lipid metabolism, molting, and diapause were also identified among these genes. Differentially expressed genes related to several processes were found in C. sinicus copepodid larvae and adults. We detected 284,154 single nucleotide polymorphisms (SNPs) that provide a resource for gene function studies. Our data provide the most comprehensive transcriptome resource available for C. sinicus. This resource allowed us to identify genes associated with primary physiological processes and SNPs in coding regions, which facilitated the quantitative analysis of differential gene expression. These data should provide a foundation for future genetic and genomic studies of this and related species.
Doubled Haploid ‘CUDH2107’ as a Reference for Bulb Onion (Allium cepa L.) Research: Development of a Transcriptome Catalogue and Identification of Transcripts Associated with Male Fertility

PubMed Central

Khosa, Jiffinvir S.; Lee, Robyn; Bräuning, Sophia; Lord, Janice; Pither-Joyce, Meeghan; McCallum, John; Macknight, Richard C.

2016-01-01

Researchers working on model plants have derived great benefit from developing genomic and genetic resources using ‘reference’ genotypes. Onion has a large and highly heterozygous genome, making the sharing of germplasm and analysis of sequencing data complicated. To simplify the discovery and analysis of genes underlying important onion traits, we are promoting the use of the homozygous doubled haploid line ‘CUDH2107’ by the onion research community. In the present investigation, we performed transcriptome sequencing on vegetative and reproductive tissues of CUDH2107 to develop a multi-organ reference transcriptome catalogue. A total of 396 million 100 base pair paired reads was assembled using the Trinity pipeline, resulting in 271,665 transcript contigs. This dataset was analysed for gene ontology and transcripts were classified on the basis of putative biological processes, molecular function and cellular localization. Significant differences were observed in transcript expression profiles between different tissues. To demonstrate the utility of our CUDH2107 transcriptome catalogue for understanding the genetic and molecular basis of various traits, we identified orthologues of rice genes involved in male fertility and flower development. These genes provide an excellent starting point for studying the molecular regulation and the engineering of reproductive traits. PMID:27861615
Development of Transcriptomic Resources for Interrogating the Biosynthesis of Monoterpene Indole Alkaloids in Medicinal Plant Species

PubMed Central

Góngora-Castillo, Elsa; Childs, Kevin L.; Fedewa, Greg; Hamilton, John P.; Liscombe, David K.; Magallanes-Lundback, Maria; Mandadi, Kranthi K.; Nims, Ezekiel; Runguphan, Weerawat; Vaillancourt, Brieanne; Varbanova-Herde, Marina; DellaPenna, Dean; McKnight, Thomas D.; O’Connor, Sarah; Buell, C. Robin

2012-01-01

The natural diversity of plant metabolism has long been a source for human medicines. One group of plant-derived compounds, the monoterpene indole alkaloids (MIAs), includes well-documented therapeutic agents used in the treatment of cancer (vinblastine, vincristine, camptothecin), hypertension (reserpine, ajmalicine), malaria (quinine), and as analgesics (7-hydroxymitragynine). Our understanding of the biochemical pathways that synthesize these commercially relevant compounds is incomplete due in part to a lack of molecular, genetic, and genomic resources for the identification of the genes involved in these specialized metabolic pathways. To address these limitations, we generated large-scale transcriptome sequence and expression profiles for three species of Asterids that produce medicinally important MIAs: Camptotheca acuminata, Catharanthus roseus, and Rauvolfia serpentina. Using next generation sequencing technology, we sampled the transcriptomes of these species across a diverse set of developmental tissues, and in the case of C. roseus, in cultured cells and roots following elicitor treatment. Through an iterative assembly process, we generated robust transcriptome assemblies for all three species with a substantial number of the assembled transcripts being full or near-full length. The majority of transcripts had a related sequence in either UniRef100, the Arabidopsis thaliana predicted proteome, or the Pfam protein domain database; however, we also identified transcripts that lacked similarity with entries in either database and thereby lack a known function. Representation of known genes within the MIA biosynthetic pathway was robust. As a diverse set of tissues and treatments were surveyed, expression abundances of transcripts in the three species could be estimated to reveal transcripts associated with development and response to elicitor treatment. 
Together, these transcriptomes and expression abundance matrices provide a rich resource for understanding plant specialized metabolism and promote the realization of innovative production systems for plant-derived pharmaceuticals. PMID:23300689
De Novo Assembly and Annotation of the Transcriptome of the Agricultural Weed Ipomoea purpurea Uncovers Gene Expression Changes Associated with Herbicide Resistance

PubMed Central

Leslie, Trent; Baucom, Regina S.

2014-01-01

Human-mediated selection can lead to rapid evolution in very short time scales, and the evolution of herbicide resistance in agricultural weeds is an excellent example of this phenomenon. The common morning glory, Ipomoea purpurea, is resistant to the herbicide glyphosate, but genetic investigations of this trait have been hampered by the lack of genomic resources for this species. Here, we present the annotated transcriptome of the common morning glory, Ipomoea purpurea, along with an examination of whole genome expression profiling to assess potential gene expression differences between three artificially selected herbicide resistant lines and three susceptible lines. The assembled Ipomoea transcriptome reported in this work contains 65,459 assembled transcripts, ~28,000 of which were functionally annotated by assignment to Gene Ontology categories. Our RNA-seq survey using this reference transcriptome identified 19 differentially expressed genes associated with resistance—one of which, a cytochrome P450, belongs to a large plant family of genes involved in xenobiotic detoxification. The differentially expressed genes also broadly implicated receptor-like kinases, which were down-regulated in the resistant lines, and other growth and defense genes, which were up-regulated in resistant lines. Interestingly, the target of glyphosate—EPSP synthase—was not overexpressed in the resistant Ipomoea lines as in other glyphosate resistant weeds. Overall, this work identifies potential candidate resistance loci for future investigations and dramatically increases genomic resources for this species. The assembled transcriptome presented herein will also provide a valuable resource to the Ipomoea community, as well as to those interested in utilizing the close relationship between the Convolvulaceae and the Solanaceae for phylogenetic and comparative genomics examinations. PMID:25155274
De novo assembly and annotation of the transcriptome of the agricultural weed Ipomoea purpurea uncovers gene expression changes associated with herbicide resistance.

PubMed

Leslie, Trent; Baucom, Regina S

2014-08-25

Human-mediated selection can lead to rapid evolution in very short time scales, and the evolution of herbicide resistance in agricultural weeds is an excellent example of this phenomenon. The common morning glory, Ipomoea purpurea, is resistant to the herbicide glyphosate, but genetic investigations of this trait have been hampered by the lack of genomic resources for this species. Here, we present the annotated transcriptome of the common morning glory, Ipomoea purpurea, along with an examination of whole genome expression profiling to assess potential gene expression differences between three artificially selected herbicide resistant lines and three susceptible lines. The assembled Ipomoea transcriptome reported in this work contains 65,459 assembled transcripts, ~28,000 of which were functionally annotated by assignment to Gene Ontology categories. Our RNA-seq survey using this reference transcriptome identified 19 differentially expressed genes associated with resistance, one of which, a cytochrome P450, belongs to a large plant family of genes involved in xenobiotic detoxification. The differentially expressed genes also broadly implicated receptor-like kinases, which were down-regulated in the resistant lines, and other growth and defense genes, which were up-regulated in resistant lines. Interestingly, the target of glyphosate, EPSP synthase, was not overexpressed in the resistant Ipomoea lines as in other glyphosate resistant weeds. Overall, this work identifies potential candidate resistance loci for future investigations and dramatically increases genomic resources for this species. The assembled transcriptome presented herein will also provide a valuable resource to the Ipomoea community, as well as to those interested in utilizing the close relationship between the Convolvulaceae and the Solanaceae for phylogenetic and comparative genomics examinations. Copyright © 2014 Leslie and Baucom.
Diurnal Transcriptome and Gene Network Represented through Sparse Modeling in Brachypodium distachyon.

PubMed

Koda, Satoru; Onda, Yoshihiko; Matsui, Hidetoshi; Takahagi, Kotaro; Yamaguchi-Uehara, Yukiko; Shimizu, Minami; Inoue, Komaki; Yoshida, Takuhiro; Sakurai, Tetsuya; Honda, Hiroshi; Eguchi, Shinto; Nishii, Ryuei; Mochida, Keiichi

2017-01-01

We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data make it feasible to infer network sparsity based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, post-transcriptional modification, and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified genes encoding putative transcription factors that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks represent typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.
Metabolic engineering of Escherichia coli for the production of l-valine based on transcriptome analysis and in silico gene knockout simulation

PubMed Central

Park, Jin Hwan; Lee, Kwang Ho; Kim, Tae Yong; Lee, Sang Yup

2007-01-01

The l-valine production strain of Escherichia coli was constructed by rational metabolic engineering and stepwise improvement based on transcriptome analysis and gene knockout simulation of the in silico genome-scale metabolic network. Feedback inhibition of acetohydroxy acid synthase isoenzyme III by l-valine was removed by site-directed mutagenesis, and the native promoter containing the transcriptional attenuator leader regions of the ilvGMEDA and ilvBN operon was replaced with the tac promoter. The ilvA, leuA, and panB genes were deleted to make more precursors available for l-valine biosynthesis. This engineered Val strain harboring a plasmid overexpressing the ilvBN genes produced 1.31 g/liter l-valine. Comparative transcriptome profiling was performed during batch fermentation of the engineered and control strains. Among the down-regulated genes, the lrp and ygaZH genes, which encode a global regulator Lrp and l-valine exporter, respectively, were overexpressed. Amplification of the lrp, ygaZH, and lrp-ygaZH genes led to the enhanced production of l-valine by 21.6%, 47.1%, and 113%, respectively. Further improvement was achieved by using in silico gene knockout simulation, which identified the aceF, mdh, and pfkA genes as knockout targets. The VAMF strain (Val ΔaceF Δmdh ΔpfkA) overexpressing the ilvBN, ilvCED, ygaZH, and lrp genes was able to produce 7.55 g/liter l-valine from 20 g/liter glucose in batch culture, resulting in a high yield of 0.378 g of l-valine per gram of glucose. These results suggest that an industrially competitive strain can be efficiently developed by metabolic engineering based on combined rational modification, transcriptome profiling, and systems-level in silico analysis. PMID:17463081
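The reported yield can be checked with simple arithmetic: titer divided by the glucose supplied. The sketch below is only an illustration; the function name is hypothetical, and it assumes that all 20 g/liter of glucose was consumed, which the abstract does not state explicitly. Decimal arithmetic is used so that rounding to three places is exact.

```python
from decimal import Decimal, ROUND_HALF_UP


def product_yield(titer_g_per_l: str, glucose_g_per_l: str) -> Decimal:
    """Grams of product per gram of glucose, rounded to three decimals.

    Assumption (not from the abstract): the glucose supplied was fully
    consumed, so yield = final titer / initial glucose concentration.
    """
    titer = Decimal(titer_g_per_l)
    glucose = Decimal(glucose_g_per_l)
    if glucose <= 0:
        raise ValueError("glucose concentration must be positive")
    return (titer / glucose).quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)


# Figures from the abstract: 7.55 g/liter l-valine from 20 g/liter glucose.
valine_yield = product_yield("7.55", "20")
print(valine_yield)  # 0.378
```

Under that assumption the result matches the 0.378 g of l-valine per gram of glucose quoted above.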
Molecular approaches to improvement of Jatropha curcas Linn. as a sustainable energy crop.

PubMed

Sudhakar Johnson, T; Eswaran, Nalini; Sujatha, M

2011-09-01

With the increase in crude oil prices, climate change concerns and limited reserves of fossil fuel, attention has been diverted to alternate renewable energy sources such as biofuel and biomass. Among the potential biofuel crops, Jatropha curcas L, a non-domesticated shrub, has been gaining importance as the most promising oilseed, as it does not compete with the edible oil supplies. Economic relevance of J. curcas for biodiesel production has promoted world-wide prospecting of its germplasm for crop improvement and breeding. However, lack of adequate genetic variation and non-availability of improved varieties limited its prospects of being a successful energy crop. In this review, we present the progress made in molecular breeding approaches with particular reference to tissue culture and genetic transformation, genetic diversity assessment using molecular markers, large-scale transcriptome and proteome studies, identification of candidate genes for trait improvement, whole genome sequencing and the current interest by various public and private sector companies in commercial-scale cultivation, which highlights the revival of Jatropha as a sustainable energy crop. The information generated from molecular markers, transcriptome profiling and whole genome sequencing could accelerate the genetic upgradation of J. curcas through molecular breeding.
Comparative transcriptome analysis of lufenuron-resistant and susceptible strains of Spodoptera frugiperda (Lepidoptera: Noctuidae).

PubMed

do Nascimento, Antonio Rogério Bezerra; Fresia, Pablo; Cônsoli, Fernando Luis; Omoto, Celso

2015-11-21

The evolution of insecticide resistance in Spodoptera frugiperda (Lepidoptera: Noctuidae) has resulted in large economic losses and disturbances to the environment and agroecosystems. Resistance to lufenuron, a chitin biosynthesis inhibitor insecticide, was recently documented in Brazilian populations of S. frugiperda. Thus, we utilized large-scale cDNA sequencing (RNA-Seq analysis) to compare the pattern of gene expression between lufenuron-resistant (LUF-R) and susceptible (LUF-S) S. frugiperda larvae in an attempt to identify the molecular basis behind the resistance mechanism(s) of S. frugiperda to this insecticide. A transcriptome was assembled using approximately 19.6 million 100 bp-long single-end reads, which generated 18,506 transcripts with an N50 of 996 bp. A search against the NCBI non-redundant database generated 51.1% (9,457) functionally annotated transcripts. A large portion of the alignments were homologous to insects, with the majority (45%) being similar to sequences of Bombyx mori (Lepidoptera: Bombycidae). Moreover, 10% of the alignments were similar to sequences of various species of Spodoptera (Lepidoptera: Noctuidae), with 3% of them being similar to sequences of S. frugiperda. A comparative analysis of the gene expression between LUF-R and LUF-S S. frugiperda larvae identified 940 differentially expressed transcripts (p ≤ 0.05, t-test; fold change ≥ 4). Six of them were associated with cuticle metabolism. Of those, four were overexpressed in LUF-R larvae. The machinery involved with the detoxification process was represented by 35 differentially expressed transcripts; 24 of them belonging to P450 monooxygenases, four to glutathione-S-transferases, six to carboxylases and one to sulfotransferases. RNA-Seq analysis was validated for a number of selected candidate transcripts by using quantitative real time PCR (qPCR). The gene expression profile of LUF-R larvae of S. frugiperda differs from that of LUF-S larvae. In general, gene expression is much higher in resistant larvae when compared to the susceptible ones, particularly for those genes involved with pathways for xenobiotic detoxification, mainly represented by P450 monooxygenases transcripts. Our data indicate that enzymes involved with the detoxification process, mostly the P450s, are one of the resistance mechanisms employed by the LUF-R S. frugiperda larvae against lufenuron.
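For readers unfamiliar with the assembly statistics quoted above, the following minimal sketch shows the usual definition of N50 and re-derives the annotated fraction. The function name and the toy transcript lengths are hypothetical; the study's actual transcript lengths are not given in the abstract, so the reported N50 of 996 bp cannot be reproduced here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, for illustration only).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)  # 500 + 400 = 900 >= 750, so N50 is 400
print(toy_n50)

# Annotated fraction from the abstract: 9,457 of 18,506 transcripts.
annotated_pct = round(100 * 9457 / 18506, 1)
print(annotated_pct)  # 51.1
```

The second calculation agrees with the 51.1% figure reported in the abstract.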
Complex and extensive post-transcriptional regulation revealed by integrative proteomic and transcriptomic analysis of metabolite stress response in Clostridium acetobutylicum.

PubMed

Venkataramanan, Keerthi P; Min, Lie; Hou, Shuyu; Jones, Shawn W; Ralston, Matthew T; Lee, Kelvin H; Papoutsakis, E Terry

2015-01-01

Clostridium acetobutylicum is a model organism for both clostridial biology and solvent production. The organism is exposed to its own toxic metabolites butyrate and butanol, which trigger an adaptive stress response. Integrative analysis of proteomic and RNAseq data may provide novel insights into post-transcriptional regulation. The identified iTRAQ-based quantitative stress proteome is made up of 616 proteins with a 15 % genome coverage. The differentially expressed proteome correlated poorly with the corresponding differential RNAseq transcriptome. Up to 31 % of the differentially expressed proteins under stress displayed patterns opposite to those of the transcriptome, thus suggesting significant post-transcriptional regulation. The differential proteome of the translation machinery suggests that cells employ a different subset of ribosomal proteins under stress. Several highly upregulated proteins but with low mRNA levels possessed mRNAs with long 5'UTRs and strong RBS scores, thus supporting the argument that regulatory elements on the long 5'UTRs control their translation. For example, the oxidative stress response rubrerythrin was upregulated only at the protein level up to 40-fold without significant mRNA changes. We also identified many leaderless transcripts, several displaying different transcriptional start sites, thus suggesting mRNA-trimming mechanisms under stress. Downregulation of Rho and partner proteins pointed to changes in transcriptional elongation and termination under stress. The integrative proteomic-transcriptomic analysis demonstrated complex expression patterns of a large fraction of the proteome. Such patterns could not have been detected with one or the other omic analyses. Our analysis proposes the involvement of specific molecular mechanisms of post-transcriptional regulation to explain the observed complex stress response.
A-WINGS: an integrated genome database for Pleurocybella porrigens (Angel's wing oyster mushroom, Sugihiratake).

PubMed

Yamamoto, Naoki; Suzuki, Tomohiro; Kobayashi, Masaaki; Dohra, Hideo; Sasaki, Yohei; Hirai, Hirofumi; Yokoyama, Koji; Kawagishi, Hirokazu; Yano, Kentaro

2014-12-03

The angel's wing oyster mushroom (Pleurocybella porrigens, Sugihiratake) is a well-known delicacy. However, its potential risk in acute encephalopathy was recently revealed by a food poisoning incident. To disclose the genes underlying the accident and provide mechanistic insight, we seek to develop an information infrastructure containing omics data. In our previous work, we sequenced the genome and transcriptome using next-generation sequencing techniques. The next step in achieving our goal is to develop a web database to facilitate the efficient mining of large-scale omics data and identification of genes specifically expressed in the mushroom. This paper introduces a web database A-WINGS (http://bioinf.mind.meiji.ac.jp/a-wings/) that provides integrated genomic and transcriptomic information for the angel's wing oyster mushroom. The database contains structure and functional annotations of transcripts and gene expressions. Functional annotations contain information on homologous sequences from NCBI nr and UniProt, Gene Ontology, and KEGG Orthology. Digital gene expression profiles were derived from RNA sequencing (RNA-seq) analysis in the fruiting bodies and mycelia. The omics information stored in the database is freely accessible through interactive and graphical interfaces by search functions that include 'GO TREE VIEW' browsing, keyword searches, and BLAST searches. The A-WINGS database will accelerate omics studies on specific aspects of the angel's wing oyster mushroom and the family Tricholomataceae.
Proteomic insights into floral biology.

PubMed

Li, Xiaobai; Jackson, Aaron; Xie, Ming; Wu, Dianxing; Tsai, Wen-Chieh; Zhang, Sheng

2016-08-01

The flower is the most important biological structure for ensuring angiosperm reproductive success. Not only does the flower contain critical reproductive organs, but the wide variation in morphology, color, and scent has evolved to entice specialized pollinators, and arguably mankind in many cases, to ensure the successful propagation of its species. Recent proteomic approaches have identified protein candidates related to these flower traits, which has shed light on a number of previously unknown mechanisms underlying these traits. This review article provides a comprehensive overview of the latest advances in proteomic research in floral biology according to the order of flower structure, from corolla to male and female reproductive organs. It summarizes mainstream proteomic methods for plant research and recent improvements on two dimensional gel electrophoresis and gel-free workflows for both peptide level and protein level analysis. The recent advances in sequencing technologies provide a new paradigm for the ever-increasing genome and transcriptome information on many organisms. It is now possible to integrate genomic and transcriptomic data with proteomic results for large-scale protein characterization, so that a global understanding of the complex molecular networks in flower biology can be readily achieved. This article is part of a Special Issue entitled: Plant Proteomics--a bridge between fundamental processes and crop production, edited by Dr. Hans-Peter Mock. Copyright © 2016 Elsevier B.V. All rights reserved.
Analysis of the Transcriptomes Downstream of Eyeless and the Hedgehog, Decapentaplegic and Notch Signaling Pathways in Drosophila melanogaster

PubMed Central

Nfonsam, Landry E.; Cano, Carlos; Mudge, Joann; Schilkey, Faye D.; Curtiss, Jennifer

2012-01-01

Tissue-specific transcription factors are thought to cooperate with signaling pathways to promote patterned tissue specification, in part by co-regulating transcription. The Drosophila melanogaster Pax6 homolog Eyeless forms a complex, incompletely understood regulatory network with the Hedgehog, Decapentaplegic and Notch signaling pathways to control eye-specific gene expression. We report a combinatorial approach, including mRNAseq and microarray analyses, to identify targets co-regulated by Eyeless and Hedgehog, Decapentaplegic or Notch. Multiple analyses suggest that the transcriptomes resulting from co-misexpression of Eyeless+signaling factors provide a more complete picture of eye development compared to previous efforts involving Eyeless alone: (1) Principal components analysis and two-way hierarchical clustering revealed that the Eyeless+signaling factor transcriptomes are closer to the eye control transcriptome than when Eyeless is misexpressed alone; (2) more genes are upregulated at least three-fold in response to Eyeless+signaling factors compared to Eyeless alone; (3) based on gene ontology analysis, the genes upregulated in response to Eyeless+signaling factors had a greater diversity of functions compared to Eyeless alone. Through a secondary screen that utilized RNA interference, we show that the predicted gene CG4721 has a role in eye development. CG4721 encodes a neprilysin family metalloprotease that is highly up-regulated in response to Eyeless+Notch, confirming the validity of our approach. Given the similarity between D. melanogaster and vertebrate eye development, the large number of novel genes identified as potential targets of Ey+signaling factors will provide novel insights to our understanding of eye development in D. melanogaster and humans. PMID:22952997
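The "upregulated at least three-fold" criterion mentioned above can be illustrated with a small filter. This is a sketch under stated assumptions, not the authors' pipeline: the gene names and expression values are hypothetical, and the pseudocount added to avoid division by zero is an assumption that does not come from the abstract.

```python
def upregulated_genes(treated, control, min_fold=3.0, pseudocount=1.0):
    """Return sorted gene names whose treated/control ratio is >= min_fold.

    `treated` and `control` map gene name -> expression value. A pseudocount
    (an assumption of this sketch) keeps the ratio defined when the control
    value is zero.
    """
    hits = []
    for gene, treated_value in treated.items():
        control_value = control.get(gene, 0.0)
        fold = (treated_value + pseudocount) / (control_value + pseudocount)
        if fold >= min_fold:
            hits.append(gene)
    return sorted(hits)


# Hypothetical expression values for three placeholder genes.
control = {"geneA": 10.0, "geneB": 50.0, "geneC": 0.0}
treated = {"geneA": 45.0, "geneB": 60.0, "geneC": 5.0}

hits = upregulated_genes(treated, control)
print(hits)  # ['geneA', 'geneC']
```

Comparing the size of such a hit list between conditions (for example, Eyeless alone versus Eyeless plus a signaling factor) is the kind of count the abstract's second point refers to.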
Sympatric ecological speciation meets pyrosequencing: sampling the transcriptome of the apple maggot Rhagoletis pomonella

PubMed Central

2009-01-01

Background: The full power of modern genetics has been applied to the study of speciation in only a small handful of genetic model species - all of which speciated allopatrically. Here we report the first large expressed sequence tag (EST) study of a candidate for ecological sympatric speciation, the apple maggot Rhagoletis pomonella, using massively parallel pyrosequencing on the Roche 454-FLX platform. To maximize transcript diversity we created and sequenced separate libraries from larvae, pupae, adult heads, and headless adult bodies. Results: We obtained 239,531 sequences which assembled into 24,373 contigs. A total of 6810 unique protein coding genes were identified among the contigs and long singletons, corresponding to 48% of all known Drosophila melanogaster protein-coding genes. Their distribution across GO classes suggests that we have obtained a representative sample of the transcriptome. Among these sequences are many candidates for potential R. pomonella "speciation genes" (or "barrier genes") such as those controlling chemosensory and life-history timing processes. Furthermore, we identified important marker loci including more than 40,000 single nucleotide polymorphisms (SNPs) and over 100 microsatellites. An initial search for SNPs at which the apple and hawthorn host races differ suggested at least 75 loci warranting further work. We also determined that developmental expression differences remained even after normalization; transcripts expected to show different expression levels between larvae and pupae in D. melanogaster also did so in R. pomonella. Preliminary comparative analysis of transcript presences and absences revealed evidence of gene loss in Drosophila and gain in the higher dipteran clade Schizophora. Conclusions: These data provide a much needed resource for exploring mechanisms of divergence in this important model for sympatric ecological speciation. Our description of ESTs from a substantial portion of the R. pomonella transcriptome will facilitate future functional studies of candidate genes for olfaction and diapause-related life history timing, and will enable large scale expression studies. Similarly, the identification of new SNP and microsatellite markers will facilitate future population and quantitative genetic studies of divergence between the apple and hawthorn-infesting host races. PMID:20035631
An RNA-seq transcriptome analysis of orthophosphate-deficient white lupin reveals novel insights into phosphorus acclimation in plants

USDA-ARS's Scientific Manuscript database

Phosphorus (P) is one of the most limiting macronutrients in soils for plant growth and development. However, the whole genome molecular mechanisms contributing to plant acclimation to Pi-deficiency remain largely unknown. White lupin (Lupinus albus L.) has evolved unique adaptation systems for gro...
Discrete domains of gene expression in germinal layers distinguish the development of gyrencephaly

PubMed Central

de Juan Romero, Camino; Bruder, Carl; Tomasello, Ugo; Sanz-Anquela, José Miguel; Borrell, Víctor

2015-01-01

Gyrencephalic species develop folds in the cerebral cortex in a stereotypic manner, but the genetic mechanisms underlying this patterning process are unknown. We present a large-scale transcriptomic analysis of individual germinal layers in the developing cortex of the gyrencephalic ferret, comparing between regions prospective of fold and fissure. We find unique transcriptional signatures in each germinal compartment, where thousands of genes are differentially expressed between regions, including ∼80% of genes mutated in human cortical malformations. These regional differences emerge from the existence of discrete domains of gene expression, which occur at multiple locations across the developing cortex of ferret and human, but not the lissencephalic mouse. Complex expression patterns emerge late during development and map the eventual location of folds or fissures. Protomaps of gene expression within germinal layers may contribute to define cortical folds or functional areas, but our findings demonstrate that they distinguish the development of gyrencephalic cortices. PMID:25916825
Medulloblastoma: Tumor Biology and Relevance to Treatment and Prognosis Paradigm.

PubMed

Coluccia, Daniel; Figuereido, Carlyn; Isik, Semra; Smith, Christian; Rutka, James T

2016-05-01

Medulloblastoma is a malignant embryonic brain tumor arising in the posterior fossa and typically occurring in pediatric patients. Current multimodal treatment regimes have significantly improved the survival rates; however, a marked heterogeneity in therapy response is observed, and one third of all patients die within 5 years after diagnosis. Large-scale genetic and transcriptome analysis revealed four medulloblastoma subgroups (WNT, SHH, Group 3, and Group 4) associated with different demographic parameters, tumor manifestation, and clinical behavior. Future treatment protocols will integrate molecular classification schemes to evaluate subgroup-specific intensification or de-escalation of adjuvant therapies aimed to increase tumor control and reduce iatrogenic induced morbidity. Furthermore, the identification of genetic drivers allows assessing target therapies in order to increase the chemotherapeutic armamentarium. This review highlights the biology behind the current classification system and elucidates relevant aspects of the disease influencing forthcoming clinical trials.
Conservation genetics and genomics of amphibians and reptiles.

PubMed

Shaffer, H Bradley; Gidiş, Müge; McCartney-Melstad, Evan; Neal, Kevin M; Oyamaguchi, Hilton M; Tellez, Marisa; Toffelmier, Erin M

2015-01-01

Amphibians and reptiles as a group are often secretive, reach their greatest diversity often in remote tropical regions, and contain some of the most endangered groups of organisms on earth. Particularly in the past decade, genetics and genomics have been instrumental in the conservation biology of these cryptic vertebrates, enabling work ranging from the identification of populations subject to trade and exploitation, to the identification of cryptic lineages harboring critical genetic variation, to the analysis of genes controlling key life history traits. In this review, we highlight some of the most important ways that genetic analyses have brought new insights to the conservation of amphibians and reptiles. Although genomics has only recently emerged as part of this conservation tool kit, several large-scale data sources, including full genomes, expressed sequence tags, and transcriptomes, are providing new opportunities to identify key genes, quantify landscape effects, and manage captive breeding stocks of at-risk species.
The Transcriptome of Nacobbus aberrans Reveals Insights into the Evolution of Sedentary Endoparasitism in Plant-Parasitic Nematodes

PubMed Central

Eves-van den Akker, Sebastian; Lilley, Catherine J.; Danchin, Etienne G. J.; Rancurel, Corinne; Cock, Peter J. A.; Urwin, Peter E.; Jones, John T.

2014-01-01

Within the phylum Nematoda, plant-parasitism is hypothesized to have arisen independently on at least four occasions. The most economically damaging plant-parasitic nematode species, and consequently the most widely studied, are those that feed as they migrate destructively through host roots causing necrotic lesions (migratory endoparasites) and those that modify host root tissue to create a nutrient sink from which they feed (sedentary endoparasites). The false root-knot nematode Nacobbus aberrans is the only known species to have both migratory endoparasitic and sedentary endoparasitic stages within its life cycle. Moreover, its sedentary stage appears to have characteristics of both the root-knot and the cyst nematodes. We present the first large-scale genetic resource of any false-root knot nematode species. We use RNAseq to describe relative abundance changes in all expressed genes across the life cycle to provide interesting insights into the biology of this nematode as it transitions between modes of parasitism. A multigene phylogenetic analysis of N. aberrans with respect to plant-parasitic nematodes of all groups confirms its proximity to both cyst and root-knot nematodes. We present a transcriptome-wide analysis of both lateral gene transfer events and the effector complement. Comparing parasitism genes of typical root-knot and cyst nematodes to those of N. aberrans has revealed interesting similarities. Importantly, genes that were believed to be either cyst nematode, or root-knot nematode, “specific” have both been identified in N. aberrans. Our results provide insights into the characteristics of a common ancestor and the evolution of sedentary endoparasitism of plants by nematodes. PMID:25123114

Expansion of cytochrome P450 and cathepsin genes in the generalist herbivore brown marmorated stink bug.

PubMed

Bansal, Raman; Michel, Andy

2018-01-18

The brown marmorated stink bug (Halyomorpha halys) is an invasive pest in North America which causes severe economic losses on tree fruits, ornamentals, vegetables, and field crops. H. halys is an extreme generalist and this feeding behaviour may have been a major contributor to its establishment and successful adaptation in invasive habitats of North America. To develop an understanding of the mechanism of H. halys' generalist herbivory, here we specifically focused on genes putatively facilitating its adaptation on diverse host plants. We generated over 142 million reads via sequencing eight RNA-Seq libraries, each representing an individual H. halys adult. The de novo assembly contained 79,855 high quality transcripts, totalling 39,600,178 bases. Following a comprehensive transcriptome analysis, H. halys had an expanded suite of cytochrome P450 and cathepsin-L genes compared to other insects. Detailed characterization of P450 genes from the CYP6 family, known for herbivore adaptation on host plants, strongly hinted towards H. halys-specific expansions involving gene duplications. In subsequent RT-PCR experiments, both P450 and cathepsin genes exhibited tissue-specific or distinct expression patterns which supported their principal roles of detoxification and/or digestion in a particular tissue. Our analysis of P450 and cathepsin genes in H. halys offers new insights into potential mechanisms for understanding generalist herbivory and adaptation success in invasive habitats. Additionally, the large-scale transcriptomic resource developed here provides highly useful data for gene discovery; functional, population and comparative genomics as well as efforts to assemble and annotate the H. halys genome.
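As a quick sanity check on the assembly figures quoted in this abstract, the sketch below derives the mean transcript length and a lower bound on reads per library. Only the four input numbers come from the abstract; the variable names are illustrative, and 142 million is treated as a minimum because the abstract says "over 142 million reads". An even split of reads across libraries is an assumption, not something the abstract reports.

```python
# Figures quoted in the abstract (everything else here is illustrative).
total_bases = 39_600_178        # total length of the de novo assembly
n_transcripts = 79_855          # high quality transcripts in the assembly
min_total_reads = 142_000_000   # "over 142 million reads" -> lower bound
n_libraries = 8                 # one RNA-Seq library per adult

# Mean transcript length in bases.
mean_transcript_length = total_bases / n_transcripts

# Lower bound on reads per library, ASSUMING reads are spread evenly.
min_reads_per_library = min_total_reads / n_libraries

print(f"mean transcript length: {mean_transcript_length:.1f} bases")
print(f"reads per library (lower bound): {min_reads_per_library:,.0f}")
```

The mean comes out just under 500 bases per transcript, which is a useful figure to keep in mind when comparing this assembly with others in this listing.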
A high resolution atlas of gene expression in the domestic sheep (Ovis aries)

PubMed Central

Farquhar, Iseabail L.; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G.; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C. Bruce; Freeman, Tom C.; Archibald, Alan L.; Hume, David A.

2017-01-01

Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of ‘guilt by association’ was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages. PMID:28915238
A high resolution atlas of gene expression in the domestic sheep (Ovis aries).

PubMed

Clark, Emily L; Bush, Stephen J; McCulloch, Mary E B; Farquhar, Iseabail L; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G; Wu, Chunlei; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C Bruce; Freeman, Tom C; Summers, Kim M; Archibald, Alan L; Hume, David A

2017-09-01

Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of 'guilt by association' was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages.
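The "guilt by association" principle mentioned in this abstract can be illustrated with a small co-expression sketch: an uncharacterised gene borrows the annotation of known genes whose expression profiles correlate strongly with its own. This is not the authors' network-based clustering pipeline; the gene names, expression values, annotations, and the 0.9 correlation threshold are all hypothetical, and plain Pearson correlation stands in for whatever similarity measure the study used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (zero variance would divide by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def infer_function(target, expression, annotations, threshold=0.9):
    """Return annotations of known genes co-expressed with `target`."""
    hits = {}
    for gene, label in annotations.items():
        if pearson(expression[target], expression[gene]) >= threshold:
            hits[gene] = label
    return hits


# Hypothetical expression values across four tissues.
expression = {
    "GENE_X": [1.0, 8.0, 2.0, 9.0],          # uncharacterised gene
    "KNOWN_IMMUNE": [2.0, 16.0, 4.0, 18.0],  # same shape as GENE_X
    "KNOWN_MUSCLE": [9.0, 1.0, 8.0, 2.0],    # opposite pattern
}
annotations = {
    "KNOWN_IMMUNE": "innate immunity",
    "KNOWN_MUSCLE": "muscle contraction",
}

result = infer_function("GENE_X", expression, annotations)
print(result)
```

In this toy data only the immune gene passes the threshold, so GENE_X would be tentatively associated with innate immunity. A real atlas would use many more tissues and a clustering step rather than a single pairwise cutoff.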
EST-derived SNP discovery and selective pressure analysis in Pacific white shrimp (Litopenaeus vannamei)

NASA Astrophysics Data System (ADS)

Liu, Chengzhang; Wang, Xia; Xiang, Jianhai; Li, Fuhua

2012-09-01

Pacific white shrimp has become a major aquaculture and fishery species worldwide. Although a large scale EST resource has been publicly available since 2008, the data have not yet been widely used for SNP discovery or transcriptome-wide assessment of selective pressure. In this study, a set of 155 411 expressed sequence tags (ESTs) from the NCBI database were computationally analyzed and 17 225 single nucleotide polymorphisms (SNPs) were predicted, including 9 546 transitions, 5 124 transversions and 2 481 indels. Among the 7 298 SNP substitutions located in functionally annotated contigs, 58.4% (4 262) are non-synonymous SNPs capable of introducing amino acid mutations. Two hundred and fifty non-synonymous SNPs in genes associated with economic traits have been identified as candidates for markers in selective breeding. Diversity estimates among the synonymous nucleotides were on average 3.49 times greater than those among the non-synonymous nucleotides, suggesting negative selection. The ratio of non-synonymous to synonymous substitutions (Ka/Ks) ranges from 0 to 4.01 (average 0.42, median 0.26), suggesting that the majority of the affected genes are under purifying selection. Enrichment analysis identified multiple gene ontology categories under positive or negative selection. Categories involved in innate immune response and male gamete generation are rich in positively selected genes, which is similar to reports in Drosophila and primates. This work is the first transcriptome-wide assessment of selective pressure in a Penaeid shrimp species. The functionally annotated SNPs provide a valuable resource of potential molecular markers for selective breeding.
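Two of the quantities in this abstract lend themselves to a short illustration: the split of substitutions into transitions and transversions, and the reading of a Ka/Ks ratio. The sketch below is not the authors' pipeline. The function names are hypothetical, and the Ka/Ks cutoff of 1 is the conventional textbook interpretation (below 1 suggests purifying selection, above 1 suggests positive selection) rather than a threshold stated in the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Label a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Transition: purine<->purine or pyrimidine<->pyrimidine.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def selection_regime(ka_ks):
    """Conventional reading of a Ka/Ks ratio (an assumption, see lead-in)."""
    if ka_ks < 1:
        return "purifying"
    if ka_ks > 1:
        return "positive"
    return "neutral"


# Share of non-synonymous SNPs, using the two counts from the abstract.
nonsynonymous_pct = round(100 * 4262 / 7298, 1)

print(classify_substitution("A", "G"))   # purine to purine
print(classify_substitution("A", "C"))   # purine to pyrimidine
print(selection_regime(0.42))            # average Ka/Ks in the abstract
print(nonsynonymous_pct)
```

Applied to the reported average (0.42) and median (0.26), this reading agrees with the abstract's conclusion that most affected genes are under purifying selection, while the maximum of 4.01 falls on the positive-selection side.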
Whole Transcriptome Sequencing Enables Discovery and Analysis of Viruses in Archived Primary Central Nervous System Lymphomas

PubMed Central

DeBoever, Christopher; Reid, Erin G.; Smith, Erin N.; Wang, Xiaoyun; Dumaop, Wilmar; Harismendy, Olivier; Carson, Dennis; Richman, Douglas; Masliah, Eliezer; Frazer, Kelly A.

2013-01-01

Primary central nervous system lymphomas (PCNSL) have a dramatically increased prevalence among persons living with AIDS and are known to be associated with human Epstein Barr virus (EBV) infection. Previous work suggests that in some cases, co-infection with other viruses may be important for PCNSL pathogenesis. Viral transcription in tumor samples can be measured using next generation transcriptome sequencing. We demonstrate the ability of transcriptome sequencing to identify viruses, characterize viral expression, and identify viral variants by sequencing four archived AIDS-related PCNSL tissue samples and analyzing raw sequencing reads. EBV was detected in all four PCNSL samples and cytomegalovirus (CMV), JC polyomavirus (JCV), and HIV were also discovered, consistent with clinical diagnoses. CMV was found to express three long non-coding RNAs recently reported as expressed during active infection. Single nucleotide variants were observed in each of the viruses observed and three indels were found in CMV. No viruses were found in several control tumor types including 32 diffuse large B-cell lymphoma samples. This study demonstrates the ability of next generation transcriptome sequencing to accurately identify viruses, including DNA viruses, in solid human cancer tissue samples. PMID:24023918
Transcriptome Analysis of Three Sheep Intestinal Regions reveals Key Pathways and Hub Regulatory Genes of Large Intestinal Lipid Metabolism.

PubMed

Chao, Tianle; Wang, Guizhi; Ji, Zhibin; Liu, Zhaohua; Hou, Lei; Wang, Jin; Wang, Jianmin

2017-07-13

The large intestine, also known as the hindgut, is an important part of the animal digestive system. Recent studies on digestive system development in ruminants have focused on the rumen and the small intestine, but the molecular mechanisms underlying sheep large intestine metabolism remain poorly understood. To identify genes related to intestinal metabolism and to reveal molecular regulation mechanisms, we sequenced and compared the transcriptomes of mucosal epithelial tissues among the cecum, proximal colon and duodenum. A total of 4,221 transcripts from 3,254 genes were identified as differentially expressed transcripts. Between the large intestine and duodenum, differentially expressed transcripts were found to be significantly enriched in 6 metabolism-related pathways, among which PPAR signaling was identified as a key pathway. Three genes, CPT1A, LPL and PCK1, were identified as higher expression hub genes in the large intestine. Between the cecum and colon, differentially expressed transcripts were significantly enriched in 5 lipid metabolism related pathways, and CEPT1 and MBOAT1 were identified as hub genes. This study provides important information regarding the molecular mechanisms of intestinal metabolism in sheep and may provide a basis for further study.
Flow cytometric purification of Colletotrichum higginsianum biotrophic hyphae from Arabidopsis leaves for stage-specific transcriptome analysis.

PubMed

Takahara, Hiroyuki; Dolf, Andreas; Endl, Elmar; O'Connell, Richard

2009-08-01

Generation of stage-specific cDNA libraries is a powerful approach to identify pathogen genes that are differentially expressed during plant infection. Biotrophic pathogens develop specialized infection structures inside living plant cells, but sampling the transcriptome of these structures is problematic due to the low ratio of fungal to plant RNA, and the lack of efficient methods to isolate them from infected plants. Here we established a method, based on fluorescence-activated cell sorting (FACS), to purify the intracellular biotrophic hyphae of Colletotrichum higginsianum from homogenates of infected Arabidopsis leaves. Specific selection of viable hyphae using a fluorescent vital marker provided intact RNA for cDNA library construction. Pilot-scale sequencing showed that the library was enriched with plant-induced and pathogenicity-related fungal genes, including some encoding small, soluble secreted proteins that represent candidate fungal effectors. The high purity of the hyphae (94%) prevented contamination of the library by sequences derived from host cells or other fungal cell types. RT-PCR confirmed that genes identified in the FACS-purified hyphae were also expressed in planta. The method has wide applicability for isolating the infection structures of other plant pathogens, and will facilitate cell-specific transcriptome analysis via deep sequencing and microarray hybridization, as well as proteomic analyses.
Regulatory RNA binding proteins contribute to the transcriptome-wide splicing alterations in human cellular senescence.

PubMed

Dong, Qiongye; Wei, Lei; Zhang, Michael Q; Wang, Xiaowo

2018-06-24

Dysregulation of mRNA splicing has been observed in certain cellular senescence processes. However, the common splicing alterations across the whole transcriptome shared by various types of senescence are poorly understood. In order to systematically identify senescence-associated transcriptomic changes on a genome-wide scale, we collected RNA sequencing datasets of different human cell types with a variety of senescence-inducing methods from public databases and performed a meta-analysis. First, we discovered that a group of RNA binding proteins were consistently down-regulated in diverse senescent samples and identified 406 senescence-associated common differential splicing events. Then, eight differentially expressed RNA binding proteins were predicted to regulate these senescence-associated splicing alterations through an enrichment analysis of their RNA binding information, including motif scanning and enhanced cross-linking immunoprecipitation data. In addition, we constructed the splicing regulatory modules that might contribute to senescence-associated biological processes. Finally, it was confirmed that knockdown of the predicted senescence-associated potential splicing regulators through shRNAs in the HepG2 cell line could result in senescence-like splicing changes. Taken together, our work demonstrated a broad range of common changes in mRNA splicing switches and detected their central regulatory RNA binding proteins during senescence. These findings would help to better understand the coordinated splicing alterations in cellular senescence.
Revealing impaired pathways in the an11 mutant by high-throughput characterization of Petunia axillaris and Petunia inflata transcriptomes.

PubMed

Zenoni, Sara; D'Agostino, Nunzio; Tornielli, Giovanni B; Quattrocchio, Francesca; Chiusano, Maria L; Koes, Ronald; Zethof, Jan; Guzzo, Flavia; Delledonne, Massimo; Frusciante, Luigi; Gerats, Tom; Pezzotti, Mario

2011-10-01

Petunia is an excellent model system, especially for genetic, physiological and molecular studies. Thus far, however, genome-wide expression analysis has been applied rarely because of the lack of sequence information. We applied next-generation sequencing to generate, through de novo read assembly, a large catalogue of transcripts for Petunia axillaris and Petunia inflata. On the basis of both transcriptomes, comprehensive microarray chips for gene expression analysis were established and used for the analysis of global- and organ-specific gene expression in Petunia axillaris and Petunia inflata and to explore the molecular basis of the seed coat defects in a Petunia hybrida mutant, anthocyanin 11 (an11), lacking a WD40-repeat (WDR) transcription regulator. Among the transcripts differentially expressed in an11 seeds compared with wild type, many expected targets of AN11 were found but also several interesting new candidates that might play a role in morphogenesis of the seed coat. Our results validate the combination of next-generation sequencing with microarray analyses strategies to identify the transcriptome of two petunia species without previous knowledge of their genome, and to develop comprehensive chips as useful tools for the analysis of gene expression in P. axillaris, P. inflata and P. hybrida. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.
Transcript abundance on its own cannot be used to infer fluxes in central metabolism

DOE PAGES

Schwender, Jorg; Konig, Christina; Klapperstuck, Matthias; ...

2014-11-28

An attempt has been made to define the extent to which metabolic flux in central plant metabolism is reflected by changes in the transcriptome and metabolome, based on an analysis of in vitro cultured immature embryos of two oilseed rape (Brassica napus) accessions which contrast for seed lipid accumulation. Metabolic flux analysis (MFA) was used to constrain a flux balance metabolic model which included 671 biochemical and transport reactions within the central metabolism. This highly confident flux information was eventually used for comparative analysis of flux vs. transcript (metabolite). Metabolite profiling succeeded in identifying 79 intermediates within the central metabolism, some of which differed quantitatively between the two accessions and displayed a significant shift corresponding to flux. An RNA-Seq based transcriptome analysis revealed a large number of genes which were differentially transcribed in the two accessions, including some enzymes/proteins active in major metabolic pathways. With a few exceptions, differential activity in the major pathways (glycolysis, TCA cycle, amino acid, and fatty acid synthesis) was not reflected in contrasting abundances of the relevant transcripts. The conclusion was that transcript abundance on its own cannot be used to infer metabolic activity/fluxes in central plant metabolism. Lastly, this limitation needs to be borne in mind in evaluating transcriptome data and designing metabolic engineering experiments.
The low-abundance transcriptome reveals novel biomarkers, specific intracellular pathways and targetable genes associated with advanced gastric cancer.

PubMed

Bizama, Carolina; Benavente, Felipe; Salvatierra, Edgardo; Gutiérrez-Moraga, Ana; Espinoza, Jaime A; Fernández, Elmer A; Roa, Iván; Mazzolini, Guillermo; Sagredo, Eduardo A; Gidekel, Manuel; Podhajcer, Osvaldo L

2014-02-15

Studies on the low-abundance transcriptome are of paramount importance for identifying the intimate mechanisms of tumor progression that can lead to novel therapies. The aim of the present study was to identify novel markers and targetable genes and pathways in advanced human gastric cancer through analyses of the low-abundance transcriptome. The procedure involved an initial subtractive hybridization step, followed by global gene expression analysis using microarrays. We observed profound differences, both at the single gene and gene ontology levels, between the low-abundance transcriptome and the whole transcriptome. Analysis of the low-abundance transcriptome led to the identification and validation by tissue microarrays of novel biomarkers, such as LAMA3 and TTN; moreover, we identified cancer type-specific intracellular pathways and targetable genes, such as IRS2, IL17, IFNγ, VEGF-C, WISP1, FZD5 and CTBP1 that were not detectable by whole transcriptome analyses. We also demonstrated that knocking down the expression of CTBP1 sensitized gastric cancer cells to mainstay chemotherapeutic drugs. We conclude that the analysis of the low-abundance transcriptome provides useful insights into the molecular basis and treatment of cancer. © 2013 UICC.
Transcriptional analysis of the Escherichia coli ColV-Ia plasmid pS88 during growth in human serum and urine.

PubMed

Lemaître, Chloé; Bidet, Philippe; Bingen, Edouard; Bonacorsi, Stéphane

2012-06-21

The sequenced O45:K1:H7 Escherichia coli meningitis strain S88 harbors a large virulence plasmid. To identify possible genetic determinants of pS88 virulence, we examined the transcriptomes of 88 plasmidic ORFs corresponding to known and putative virulence genes, and 35 ORFs of unknown function. Quantification of plasmidic transcripts was obtained by quantitative real-time reverse transcription of extracted RNA, normalized on three housekeeping genes. The transcriptomes of E. coli strain S88 grown in human serum and urine ex vivo were compared to that obtained during growth in Luria Bertani broth, with and without iron depletion. We also analyzed the transcriptome of a pS88-like plasmid recovered from a neonate with urinary tract infection. The transcriptomes obtained after ex vivo growth in serum and urine were very similar to those obtained in iron-depleted LB broth. Genes encoding iron acquisition systems were strongly upregulated. ShiF and ORF 123, two ORFs encoding proteins with hypothetical function and physically linked to aerobactin and salmochelin loci, respectively, were also highly expressed in iron-depleted conditions and may correspond to ancillary iron acquisition genes. Four ORFs were induced ex vivo, independently of the iron concentration. Other putative virulence genes such as iss, etsC, ompTp and hlyF were not upregulated in any of the conditions studied. Transcriptome analysis of the pS88-like plasmid recovered in vivo showed a similar pattern of induction but at much higher levels. We identify new pS88 genes potentially involved in the growth of E. coli meningitis strain S88 in human serum and urine.
Temporal transcriptome profiling reveals expression partitioning of homeologous genes contributing to heat and drought acclimation in wheat (Triticum aestivum L.).

PubMed

Liu, Zhenshan; Xin, Mingming; Qin, Jinxia; Peng, Huiru; Ni, Zhongfu; Yao, Yingyin; Sun, Qixin

2015-06-20

Hexaploid wheat (Triticum aestivum) is a globally important crop. Heat, drought and their combination dramatically reduce wheat yield and quality, but the molecular mechanisms underlying wheat tolerance to extreme environments, especially stress combination, are largely unknown. As an allohexaploid, wheat consists of three closely related subgenomes (A, B, and D), and was reported to show improved tolerance to stress conditions compared to tetraploid wheat. So far, however, very little is known about how wheat coordinates the expression of homeologous genes to cope with various environmental constraints at the whole-genome level. To explore the transcriptional response of wheat to the individual and combined stresses, we performed high-throughput transcriptome sequencing of seedlings under normal conditions and subjected to drought stress (DS), heat stress (HS) and their combination (HD) for 1 h and 6 h, and characterized the global reprogramming of gene expression in response to these three stresses. Gene Ontology (GO) enrichment analysis of DS, HS and HD responsive genes revealed overlapping and complex functional pathways among the three stress responses. Moreover, 4,375 wheat transcription factors were identified on a whole-genome scale based on the scaffold information released by IWGSC, and 1,328 were responsive to stress treatments. Then, the regulatory network analysis of HSFs and DREBs indicated that they were both involved in the regulation of DS, HS and HD response and pointed to cross-talk between heat and drought stress. Finally, approximately 68.4% of homeologous genes were found to exhibit expression partitioning in response to DS, HS or HD, which was further confirmed by using quantitative RT-PCR and Nullisomic-Tetrasomic lines. A large proportion of wheat homeologs exhibited expression partitioning under normal and abiotic stresses, which possibly contributes to the wide adaptability and distribution of hexaploid wheat in response to various environmental constraints.
Transcriptome assembly and digital gene expression atlas of the rainbow trout

USDA-ARS's Scientific Manuscript database

Background: Transcriptome analysis is a preferred method for gene discovery, marker development and gene expression profiling in non-model organisms. Previously, we sequenced a transcriptome reference using Sanger-based and 454-pyrosequencing, however, a transcriptome assembly is still incomplete an...
Fungal and host transcriptome analysis of pH-regulated genes during colonization of apple fruits by Penicillium expansum.

PubMed

Barad, Shiri; Sela, Noa; Kumar, Dilip; Kumar-Dubey, Amit; Glam-Matana, Nofar; Sherman, Amir; Prusky, Dov

2016-05-04

Penicillium expansum is a destructive phytopathogen that causes decay in deciduous fruits during postharvest handling and storage. During colonization the fungus secretes D-gluconic acid (GLA), which modulates environmental pH and regulates mycotoxin accumulation in colonized tissue. Until now, no transcriptomic analysis has addressed the specific contribution of the pathogen's pH regulation to the P. expansum colonization process. For this purpose, total RNA from the leading edge of P. expansum-colonized apple tissue of cv. 'Golden Delicious' and from fungal cultures grown at pH 4 or 7 was sequenced and the gene expression patterns were compared. We present a large-scale analysis of the transcriptome data of P. expansum and apple response to fungal colonization. The fungal analysis revealed nine different clusters of gene expression patterns that were divided among three major groups in which the colonized tissue showed, respectively: (i) differing transcript expression patterns between mycelial growth at pH 4 and pH 7; (ii) similar transcript expression patterns of mycelial growth at pH 4; and (iii) similar transcript expression patterns of mycelial growth at pH 7. Each group was functionally characterized in order to decipher genes that are important for pH regulation and also for colonization of apple fruits by Penicillium. Furthermore, comparison of gene expression of healthy apple tissue with that of colonized tissue showed that differentially expressed genes revealed up-regulation of the jasmonic acid and mevalonate pathways, and also down-regulation of the glycogen and starch biosynthesis pathways. Overall, we identified important genes and functionalities of P. expansum that were controlled by the environmental pH. Differential expression patterns of genes belonging to the same gene family suggest that genes were selectively activated according to their optimal environmental conditions (pH, in vitro or in vivo) to enable the fungus to cope with varying conditions and to make optimal use of available enzymes. Comparison between the activation of the colonized host's gene responses by alkalizing Colletotrichum gloeosporioides and acidifying P. expansum pathogens indicated similar gene response patterns, but stronger responses to P. expansum, suggesting the importance of acidification by P. expansum as a factor in its increased aggressiveness.
Preliminary profiling of blood transcriptome in a rat model of hemorrhagic shock.

PubMed

Braga, D; Barcella, M; D'Avila, F; Lupoli, S; Tagliaferri, F; Santamaria, M H; DeLano, F A; Baselli, G; Schmid-Schönbein, G W; Kistler, E B; Aletti, F; Barlassina, C

2017-08-01

Hemorrhagic shock is a leading cause of morbidity and mortality worldwide. Significant blood loss may lead to decreased blood pressure and inadequate tissue perfusion with resultant organ failure and death, even after replacement of lost blood volume. One reason for this high acuity is that the fundamental mechanisms of shock are poorly understood. Proteomic and metabolomic approaches have been used to investigate the molecular events occurring in hemorrhagic shock but, to our knowledge, a systematic analysis of the transcriptomic profile is missing. Therefore, a pilot analysis using paired-end RNA sequencing was used to identify changes that occur in the blood transcriptome of rats subjected to hemorrhagic shock after blood reinfusion. Hemorrhagic shock was induced using a Wiggers shock model. The transcriptome of whole blood from shocked animals shows modulation of genes related to inflammation and immune response (Tlr13, Il1b, Ccl6, Lgals3), antioxidant functions (Mt2A, Mt1), tissue injury and repair pathways (Gpnmb, Trim72) and lipid mediators (Alox5ap, Ltb4r, Ptger2) compared with control animals. These findings are congruent with results obtained in hemorrhagic shock analysis by other authors using metabolomics and proteomics. The analysis of blood transcriptome may be a valuable tool to understand the biological changes occurring in hemorrhagic shock and a promising approach for the identification of novel biomarkers and therapeutic targets. Impact statement: This study provides the first pilot analysis of the changes occurring in transcriptome expression of whole blood in hemorrhagic shock (HS) rats. We showed that the analysis of blood transcriptome is a useful approach to investigate pathways and functional alterations in this disease condition. This pilot study encourages the possible application of transcriptome analysis in the clinical setting, for the molecular profiling of whole blood in HS patients.
Transcriptomic Immune Response of Tenebrio molitor Pupae to Parasitization by Scleroderma guani

PubMed Central

Zhu, Jia-Ying; Yang, Pu; Zhang, Zhong; Wu, Guo-Xing; Yang, Bin

2013-01-01

Background: Host-parasitoid interaction is one of the most fascinating relationships among insects and is receiving increasing interest. Understanding the mechanisms evolved by parasitoids to evade or suppress the host immune system is important for dissecting this interaction, yet these mechanisms remain poorly known. To gain insight into the immune response of Tenebrio molitor to parasitization by Scleroderma guani, the transcriptome of T. molitor pupae was sequenced with a focus on immune-related genes, and non-parasitized and parasitized T. molitor pupae were analyzed by digital gene expression (DGE) analysis, with special emphasis on parasitoid-induced immune-related genes, using Illumina sequencing. Methodology/Principal Findings: In a single run, 264,698 raw reads were obtained. De novo assembly generated 71,514 unigenes with a mean length of 424 bp. Of those unigenes, 37,373 (52.26%) showed similarity to known proteins in the NCBI nr database. In-depth analysis of the transcriptome data identified 430 unigenes related to immunity. DGE analysis revealed that parasitization by S. guani had considerable impacts on the transcriptome profile of T. molitor pupae, as indicated by the significant up- or down-regulation of 3,431 parasitism-responsive transcripts. The expression of a total of 74 unigenes involved in the immune response of T. molitor was significantly altered after parasitization. Conclusions/Significance: The obtained T. molitor transcriptome, in addition to establishing a fundamental resource for further research on functional genomics, has allowed the discovery of a large group of immune genes that might provide a meaningful framework to better understand the immune response in this species and other beetles. The DGE profiling data provide comprehensive T. molitor immune gene expression information at the transcriptional level following parasitization, and shed valuable light on the molecular understanding of the host-parasitoid interaction. PMID:23342153
Transcriptome discovery in non-model wild fish species for the development of quantitative transcript abundance assays.

PubMed

Hahn, Cassidy M; Iwanowicz, Luke R; Cornman, Robert S; Mazik, Patricia M; Blazer, Vicki S

2016-12-01

Environmental studies increasingly identify the presence of both contaminants of emerging concern (CECs) and legacy contaminants in aquatic environments; however, the biological effects of these compounds on resident fishes remain largely unknown. High-throughput methodologies were employed to establish partial transcriptomes for three wild-caught, non-model fish species: smallmouth bass (Micropterus dolomieu), white sucker (Catostomus commersonii) and brown bullhead (Ameiurus nebulosus). Sequences from these transcriptome databases were utilized in the development of a custom nCounter CodeSet that allowed for direct multiplexed measurement of 50 transcript abundance endpoints in liver tissue. Sequence information was also utilized in the development of quantitative real-time PCR (qPCR) primers. Cross-species hybridization allowed the smallmouth bass nCounter CodeSet to be used for quantitative transcript abundance analysis of an additional non-model species, largemouth bass (Micropterus salmoides). We validated the nCounter analysis data system with qPCR for a subset of genes and confirmed concordant results. Changes in transcript abundance biomarkers between sexes and seasons were evaluated to provide baseline data on transcript modulation for each species of interest. Published by Elsevier Inc.
Transcriptomics Analysis on Excellent Meat Quality Traits of Skeletal Muscles of the Chinese Indigenous Min Pig Compared with the Large White Breed

PubMed Central

Liu, Yingzi; Yang, Xiuqin; Jing, Xiaoyan; He, Xinmiao; Wang, Liang; Liu, Yang; Liu, Di

2017-01-01

The Min pig (Sus scrofa) is a well-known indigenous breed in China. One of its main advantages over European breeds is its high meat quality. Different cuts of pork also differ in meat quality traits. To explore the mechanisms underlying the differences in meat quality between breeds and between cuts, the longissimus dorsi muscle (LM) and the biceps femoris muscle (BF) from Min and Large White pigs were investigated using transcriptome analysis. The gene expression profiling identified 1371 differentially expressed genes (DEGs) between LM muscles from Min and Large White pigs, and 114 DEGs between LM and BF muscles from the same Min pigs. Gene Ontology (GO) enrichment of biological functions and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed that the gene products were mainly involved in the IRS1/Akt/FoxO1 signaling pathway, adenosine 5′-monophosphate-activated protein kinase (AMPK) cascade effects, lipid metabolism and amino acid metabolism pathways. Such pathways contributed to fatty acid metabolism, intramuscular fat deposition, and skeletal muscle growth in Min pig. These results give insight into the mechanisms underlying the formation of skeletal muscle and provide candidate genes for improving the meat quality of pigs through molecular breeding. PMID:29271915
FusionHub: A unified web platform for annotation and visualization of gene fusion events in human cancer.

PubMed

Panigrahi, Priyabrata; Jere, Abhay; Anamika, Krishanpal

2018-01-01

Gene fusion is a chromosomal rearrangement event which plays a significant role in cancer due to the oncogenic potential of the chimeric protein generated through fusions. At present, many databases are available in the public domain which provide detailed information about known gene fusion events and their functional role. Existing gene fusion detection tools, based on analysis of transcriptomics data, usually report a large number of fusion genes as potential candidates, which could be known, novel, or false positives. Manual annotation of these putative genes is time-consuming. We have developed a web platform, FusionHub, which acts as an integrated search engine interfacing various fusion gene databases and simplifies large-scale annotation of fusion genes. In addition, FusionHub provides three ways of visualizing fusion events: circular view, domain architecture view and network view. Design of potential siRNA molecules through an ensemble method is another utility integrated in FusionHub that could aid in siRNA-based targeted therapy. FusionHub is freely available at https://fusionhub.persistent.co.in.

Transcriptome analysis of Gossypium hirsutum flower buds infested by cotton boll weevil (Anthonomus grandis) larvae.

PubMed

Artico, Sinara; Ribeiro-Alves, Marcelo; Oliveira-Neto, Osmundo Brilhante; de Macedo, Leonardo Lima Pepino; Silveira, Sylvia; Grossi-de-Sa, Maria Fátima; Martinelli, Adriana Pinheiro; Alves-Ferreira, Marcio

2014-10-04

Cotton is a major fibre crop grown worldwide that suffers extensive damage from chewing insects, including the cotton boll weevil larvae (Anthonomus grandis). Transcriptome analysis was performed to understand the molecular interactions between Gossypium hirsutum L. and cotton boll weevil larvae. The Illumina HiSeq 2000 platform was used to sequence the transcriptome of cotton flower buds infested with boll weevil larvae. The analysis generated a total of 327,489,418 sequence reads that were aligned to the G. hirsutum reference transcriptome. The total number of expressed genes was over 21,697 per sample with an average length of 1,063 bp. The DEGseq analysis identified 443 differentially expressed genes (DEG) in cotton flower buds infected with boll weevil larvae. Among them, 402 (90.7%) were up-regulated, 41 (9.3%) were down-regulated and 432 (97.5%) were identified as orthologues of A. thaliana genes using Blastx. Mapman analysis of DEG indicated that many genes were involved in the biotic stress response spanning a range of functions, from a gene encoding a receptor-like kinase to genes involved in triggering defensive responses such as MAPK, transcription factors (WRKY and ERF) and signalling by ethylene (ET) and jasmonic acid (JA) hormones. Furthermore, the spatial expression pattern of 32 of the genes responsive to boll weevil larvae feeding was determined by "in situ" qPCR analysis from RNA isolated from two flower structures, the stamen and the carpel, by laser microdissection (LMD). A large number of cotton transcripts were significantly altered upon infestation by larvae. Among the changes in gene expression, we highlighted the transcription of receptors/sensors that recognise chitin or insect oral secretions; the altered regulation of transcripts encoding enzymes related to kinase cascades, transcription factors, Ca2+ influxes, and reactive oxygen species; and the modulation of transcripts encoding enzymes from phytohormone signalling pathways. These data will aid in the selection of target genes to genetically engineer cotton to control the cotton boll weevil.
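The percentages quoted in the abstract above can be re-derived from the reported gene counts. The short Python sketch below does only that arithmetic; the helper name `percent` and the dictionary labels are illustrative choices, not part of the original study.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the abstract: 443 differentially expressed genes in total.
total_deg = 443
counts = {
    "up-regulated": 402,
    "down-regulated": 41,
    "A. thaliana orthologues (Blastx)": 432,
}

for label, n in counts.items():
    print(f"{label}: {n}/{total_deg} = {percent(n, total_deg)}%")
```

The up- and down-regulated counts sum to the 443 reported genes, and the rounded shares agree with the 90.7%, 9.3% and 97.5% given in the text.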
Adaptation and evolution of deep-sea scale worms (Annelida: Polynoidae): insights from transcriptome comparison with a shallow-water species

NASA Astrophysics Data System (ADS)

Zhang, Yanjie; Sun, Jin; Chen, Chong; Watanabe, Hiromi K.; Feng, Dong; Zhang, Yu; Chiu, Jill M. Y.; Qian, Pei-Yuan; Qiu, Jian-Wen

2017-04-01

Polynoid scale worms (Polynoidae, Annelida) invaded deep-sea chemosynthesis-based ecosystems approximately 60 million years ago, but little is known about their genetic adaptation to the extreme deep-sea environment. In this study, we reported the first two transcriptomes of deep-sea polynoids (Branchipolynoe pettiboneae, Lepidonotopodium sp.) and compared them with the transcriptome of a shallow-water polynoid (Harmothoe imbricata). We determined codon and amino acid usage, positively selected genes, highly expressed genes and putative duplicated genes. Transcriptome assembly produced 98,806 to 225,709 contigs in the three species. There were more positively charged amino acids (i.e., histidine and arginine) and fewer negatively charged amino acids (i.e., aspartic acid and glutamic acid) in the deep-sea species. There were 120 genes showing clear evidence of positive selection. Among the 10% most highly expressed genes, there were more hemoglobin genes with high expression levels in both deep-sea species. The duplicated genes related to DNA recombination and metabolism, and gene expression were only enriched in deep-sea species. Deep-sea scale worms adopted two strategies of adaptation to hypoxia in the chemosynthesis-based habitats (i.e., rapid evolution of tetra-domain hemoglobin in Branchipolynoe or high expression of single-domain hemoglobin in Lepidonotopodium sp.).
Adaptation and evolution of deep-sea scale worms (Annelida: Polynoidae): insights from transcriptome comparison with a shallow-water species

PubMed Central

Zhang, Yanjie; Sun, Jin; Chen, Chong; Watanabe, Hiromi K.; Feng, Dong; Zhang, Yu; Chiu, Jill M.Y.; Qian, Pei-Yuan; Qiu, Jian-Wen

2017-01-01

Polynoid scale worms (Polynoidae, Annelida) invaded deep-sea chemosynthesis-based ecosystems approximately 60 million years ago, but little is known about their genetic adaptation to the extreme deep-sea environment. In this study, we reported the first two transcriptomes of deep-sea polynoids (Branchipolynoe pettiboneae, Lepidonotopodium sp.) and compared them with the transcriptome of a shallow-water polynoid (Harmothoe imbricata). We determined codon and amino acid usage, positively selected genes, highly expressed genes and putative duplicated genes. Transcriptome assembly produced 98,806 to 225,709 contigs in the three species. There were more positively charged amino acids (i.e., histidine and arginine) and fewer negatively charged amino acids (i.e., aspartic acid and glutamic acid) in the deep-sea species. There were 120 genes showing clear evidence of positive selection. Among the 10% most highly expressed genes, there were more hemoglobin genes with high expression levels in both deep-sea species. The duplicated genes related to DNA recombination and metabolism, and gene expression were only enriched in deep-sea species. Deep-sea scale worms adopted two strategies of adaptation to hypoxia in the chemosynthesis-based habitats (i.e., rapid evolution of tetra-domain hemoglobin in Branchipolynoe or high expression of single-domain hemoglobin in Lepidonotopodium sp.). PMID:28397791
Characterization and analysis of a transcriptome from the boreal spider crab Hyas araneus.

PubMed

Harms, Lars; Frickenhaus, Stephan; Schiffer, Melanie; Mark, Felix C; Storch, Daniela; Pörtner, Hans-Otto; Held, Christoph; Lucassen, Magnus

2013-12-01

Research investigating the genetic basis of physiological responses has significantly broadened our understanding of the mechanisms underlying organismic response to environmental change. However, genomic data are currently available for only a few taxa, thus excluding physiological model species from this approach. In this study we report the transcriptome of the model organism Hyas araneus from Spitsbergen (Arctic). We generated 20,479 transcripts, using the 454 GS FLX sequencing technology in combination with an Illumina HiSeq sequencing approach. Annotation by Blastx revealed 7159 blast hits in the NCBI non-redundant protein database. The comparison between the spider crab H. araneus transcriptome and EST libraries of the American lobster Homarus americanus and the porcelain crab Petrolisthes cinctipes yielded 3229/2581 sequences with a significant hit, respectively. The clustering by the Markov Clustering Algorithm (MCL) revealed a common core of 1710 clusters present in all three species and 5903 unique clusters for H. araneus. The combined sequencing approaches generated transcripts that will greatly expand the limited genomic data available for crustaceans. We introduce the MCL clustering for transcriptome comparisons as a simple approach to estimate similarities between transcriptomic libraries of different size and quality and to analyze homologies within the selected group of species. In particular, we identified a large variety of reverse transcriptase (RT) sequences not only in the H. araneus transcriptome and other decapod crustaceans, but also sea urchin, supporting the hypothesis of a heritable, anti-viral immunity and the proposed viral fragment integration by host-derived RTs in marine invertebrates. © 2013.
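The Markov Clustering Algorithm mentioned in the abstract above alternates two matrix operations on a similarity graph: expansion (squaring a column-stochastic matrix) and inflation (raising entries to a power and renormalizing). The pure-Python sketch below is a minimal illustration of that loop on a toy graph. The function name `mcl`, the parameter defaults and the toy adjacency matrix are assumptions for illustration only; the abstract does not describe the software or settings the authors used.

```python
def mcl(adj, inflation=2.0, max_iter=50, tol=1e-9):
    """Minimal Markov clustering on a symmetric, non-negative adjacency matrix.

    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(adj)

    def normalize(mat):
        # Make every column sum to 1 (column-stochastic matrix).
        for j in range(n):
            col_sum = sum(mat[i][j] for i in range(n))
            for i in range(n):
                mat[i][j] /= col_sum
        return mat

    # Add self-loops so that no column is empty, then normalize.
    m = normalize([[float(adj[i][j]) + (1.0 if i == j else 0.0)
                    for j in range(n)] for i in range(n)])

    for _ in range(max_iter):
        # Expansion: one step of random-walk spreading (matrix squaring).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: strengthen strong flows, weaken weak ones, renormalize.
        inflated = normalize([[v ** inflation for v in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break

    # Each row that still carries flow defines a cluster of its non-zero columns.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy graph: two disconnected triangles of mutually similar sequences.
toy = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(toy))
```

In a transcriptome comparison the matrix entries would typically come from pairwise sequence similarity scores rather than a 0/1 toy graph, and a dedicated sparse implementation would be needed at that scale.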
Advances in Exercise, Fitness, and Performance Genomics in 2011

PubMed Central

Roth, Stephen M.; Rankinen, Tuomo; Hagberg, James M.; Loos, Ruth J. F.; Pérusse, Louis; Sarzynski, Mark A.; Wolfarth, Bernd; Bouchard, Claude

2014-01-01

This review of the exercise genomics literature emphasizes the highest quality papers published in 2011. Given this emphasis on the best publications, only a small number of published papers are reviewed. One study found that physical activity levels were significantly lower in patients with mitochondrial DNA mutations compared to controls. A two-stage fine mapping follow-up of a previous linkage peak found strong associations between sequence variation in the activin A receptor, type-1B (ACVR1B) gene and knee extensor strength, with rs2854464 emerging as the most promising candidate polymorphism. The association of higher muscular strength with the rs2854464 A-allele was confirmed in two separate cohorts. A study using a combination of transcriptomic and genomic data identified a comprehensive map of the transcriptomic features important for aerobic exercise training-induced improvements in maximal oxygen consumption, but no genetic variants derived from candidate transcripts were associated with trainability. A large-scale de novo meta-analysis confirmed that the effect of sequence variation in the fat mass and obesity-associated (FTO) gene on the risk of obesity differs between sedentary and physically active adults. Evidence for gene-physical activity interactions on type 2 diabetes risk was found in two separate studies. A large study of women found that physical activity modified the effect of polymorphisms in the lipoprotein lipase (LPL), hepatic lipase (LIPC), and cholesteryl ester transfer protein (CETP) genes, identified in previous genome-wide association study (GWAS) reports, on HDL-C. We conclude that a strong exercise genomics corpus of evidence would not only translate into powerful genomic predictors but would also have a major impact on exercise biology and exercise behavior research. PMID:22330029
Transcriptome and proteome analysis of Salmonella enterica serovar Typhimurium systemic infection of wild type and immune-deficient mice

PubMed Central

Oshota, Olusegun; Fookes, Maria; Schreiber, Fernanda; Chaudhuri, Roy R.; Yu, Lu; Clare, Simon; Choudhary, Jyoti; Thomson, Nicholas R.; Lio, Pietro

2017-01-01

Salmonella enterica is a threat to public health. Current vaccines are not fully effective. The ability to grow in infected tissues within phagocytes is required for S. enterica virulence in systemic disease. As the infection progresses the bacteria are exposed to a complex host immune response. Consequently, in order to continue growing in the tissues, S. enterica requires the coordinated regulation of fitness genes. Bacterial gene regulation has so far been investigated largely using exposure to artificial environmental conditions or to in vitro cultured cells, and little information is available on how S. enterica adapts in vivo to sustain cell division and survival. We have studied the transcriptome, proteome and metabolic flux of Salmonella, and the transcriptome of the host during infection of wild type C57BL/6 and immune-deficient gp91-/-phox mice. Our analyses advance the understanding of how S. enterica and the host behave during infection to a more sophisticated level than has previously been reported. PMID:28796780
Sequencing and de novo analysis of the hemocytes transcriptome in Litopenaeus vannamei response to white spot syndrome virus infection.

PubMed

Xue, Shuxia; Liu, Yichen; Zhang, Yichen; Sun, Yan; Geng, Xuyun; Sun, Jinsheng

2013-01-01

White spot syndrome virus (WSSV) is a causative pathogen found in most shrimp farming areas of the world and causes large economic losses to shrimp aquaculture. The mechanism underlying the molecular pathogenesis of the highly virulent WSSV remains unknown. To better understand the virus-host interactions at the molecular level, the transcriptome profiles in hemocytes of unchallenged and WSSV-challenged shrimp (Litopenaeus vannamei) were compared using a short-read deep sequencing method (Illumina). RNA-seq analysis generated more than 25.81 million clean paired-end (PE) reads, which were assembled into 52,073 unigenes (mean size = 520 bp). Based on sequence similarity searches, 23,568 (45.3%) genes were identified, among which 6,562 and 7,822 unigenes were assigned to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. Searches in the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG) mapped 14,941 (63.4%) unigenes to 240 KEGG pathways. Among all the annotated unigenes, 1,179 were associated with immune-related genes. Digital gene expression (DGE) analysis revealed that the host transcriptome profile was slightly changed in the early infection (5 hours post injection, hpi) of the virus, while large transcriptional differences were identified in the late infection (48 hpi) of WSSV. The differentially expressed genes were mainly involved in pattern recognition and some immune response factors. The results indicated that antiviral immune mechanisms were probably involved in the recognition of pathogen-associated molecular patterns. This study provided a global survey of host gene activities against virus infection in a non-model organism, the Pacific white shrimp. The results can contribute to the in-depth study of candidate genes in white shrimp, and help to improve the current understanding of host-pathogen interactions.
Seasonal and latitudinal acclimatization of cardiac transcriptome responses to thermal stress in porcelain crabs, Petrolisthes cinctipes.

PubMed

Stillman, Jonathon H; Tagmount, Abderrahmane

2009-10-01

Central predictions of climate warming models include increased climate variability and increased severity of heat waves. Physiological acclimatization in populations across large-scale ecological gradients in habitat temperature fluctuation is an important factor to consider in detecting responses to climate change related increases in thermal fluctuation. We measured in vivo cardiac thermal maxima and used microarrays to profile transcriptome heat and cold stress responses in cardiac tissue of intertidal zone porcelain crabs across biogeographic and seasonal gradients in habitat temperature fluctuation. We observed acclimatization dependent induction of heat shock proteins, as well as unknown genes with heat shock protein-like expression profiles. Thermal acclimatization had the largest effect on heat stress responses of extensin-like, beta tubulin, and unknown genes. For these genes, crabs acclimatized to thermally variable sites had higher constitutive expression than specimens from low variability sites, but heat stress dramatically induced expression in specimens from low variability sites and repressed expression in specimens from highly variable sites. Our application of ecological transcriptomics has yielded new biomarkers that may represent sensitive indicators of acclimatization to habitat temperature fluctuation. Our study also has identified novel genes whose further description may yield novel understanding of cellular responses to thermal acclimatization or thermal stress.
Probing the transcriptome of Aconitum carmichaelii reveals the candidate genes associated with the biosynthesis of the toxic aconitine-type C19-diterpenoid alkaloids.

PubMed

Zhao, Dake; Shen, Yong; Shi, Yana; Shi, Xingqiao; Qiao, Qin; Zi, Shuhui; Zhao, Erqiang; Yu, Diqiu; Kennelly, Edward J

2018-05-11

Aconitum carmichaelii has long been used as a traditional Chinese medicine, and its processed lateral roots are known commonly as fuzi. Aconitine-type C19-diterpenoid alkaloids accumulating in the lateral roots are some of the main toxicants of this species, yet their biosynthesis remains largely unresolved. As a first step towards understanding the biosynthesis of aconitine-type C19-diterpenoid alkaloids, we performed de novo transcriptome assembly and analysis of rootstocks and leaf tissues of Aconitum carmichaelii by next-generation sequencing. A total of 525 unigene candidates were identified as involved in the formation of C19-diterpenoid alkaloids, including those encoding enzymes in the early steps of diterpenoid alkaloids scaffold biosynthetic pathway, such as ent-copalyl diphosphate synthases, ent-kaurene synthases, kaurene oxidases, cyclases, and key aminotransferases. Furthermore, candidates responsible for decorating of diterpenoid alkaloid skeletons were discovered from transcriptome sequencing of fuzi, such as monooxygenases, methyltransferase, and BAHD acyltransferases. In addition, 645 differentially expressed genes encoding transcription factors potentially related to diterpenoid alkaloids accumulation underground were documented. Subsequent modular domain structure phylogenetics and differential expression analysis led to the identification of BAHD acyltransferases possibly involved in the formation of acetyl and benzoyl esters of diterpenoid alkaloids, associated with the acute toxicity of fuzi. The transcriptome data provide the foundation for future research into the molecular basis for aconitine-type C19-diterpenoid alkaloids biosynthesis in A. carmichaelii. Copyright © 2018. Published by Elsevier Ltd.
GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis.

PubMed

Zheng, Qi; Wang, Xiu-Jie

2008-07-01

Gene Ontology (GO) analysis has become a commonly used approach for functional studies of large-scale genomic or transcriptomic data. Although many software tools already offer GO-related analysis functions, new tools are still needed to meet the requirements of data generated by newly developed technologies or of advanced analysis purposes. Here, we present a Gene Ontology Enrichment Analysis Software Toolkit (GOEAST), an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets. Compared with available GO analysis tools, GOEAST has the following improved features: (i) GOEAST displays enriched GO terms in graphical format according to their relationships in the hierarchical tree of each GO category (biological process, molecular function and cellular component), and therefore provides better understanding of the correlations among enriched GO terms; (ii) GOEAST supports analysis for data from various sources (probe or probe set IDs of Affymetrix, Illumina, Agilent or customized microarrays, as well as different gene identifiers) and multiple species (about 60 prokaryote and eukaryote species); (iii) one unique feature of GOEAST is to allow cross comparison of the GO enrichment status of multiple experiments to identify functional correlations among them. GOEAST also provides rigorous statistical tests to enhance the reliability of analysis results. GOEAST is freely accessible at http://omicslab.genetics.ac.cn/GOEAST/
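Identifying "statistically overrepresented GO terms" in a gene set is commonly framed as a one-sided hypergeometric test. The Python sketch below illustrates that general idea only. The abstract above does not state which tests GOEAST applies, so the choice of test, the function name `go_enrichment_p` and the example counts are assumptions made for illustration.

```python
from math import comb


def go_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     size of the gene list under study
    hits:       genes in the list that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of observing `hits` or more annotated genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(max(hits, 0), upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 background genes, 3 annotated with the term,
# and a 3-gene list in which all 3 carry the term.
p = go_enrichment_p(population=10, annotated=3, sample=3, hits=3)
print(f"p = {p:.5f}")
```

In practice a p-value like this is computed for every GO term and then corrected for multiple testing, since thousands of terms are evaluated against the same gene list.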
Transcriptomic analysis of the red seaweed Laurencia dendroidea (Florideophyceae, Rhodophyta) and its microbiome.

PubMed

de Oliveira, Louisi Souza; Gregoracci, Gustavo Bueno; Silva, Genivaldo Gueiros Zacarias; Salgado, Leonardo Tavares; Filho, Gilberto Amado; Alves-Ferreira, Marcio; Pereira, Renato Crespo; Thompson, Fabiano L

2012-09-17

Seaweeds of the Laurencia genus have a broad geographic distribution and are widely recognized as important sources of secondary metabolites, mainly halogenated compounds exhibiting diverse potential pharmacological activities and a relevant ecological role in anti-epibiosis. Host-microbe interaction is a driving force for co-evolution in the marine environment, but molecular studies of seaweed-associated microbial communities are still rare. Despite the large amount of research describing the chemical compositions of Laurencia species, the genetic knowledge regarding this genus is currently restricted to taxonomic markers and general genome features. In this work we analyze the transcriptomic profile of L. dendroidea J. Agardh, unveil the genes involved in the biosynthesis of terpenoid compounds in this seaweed and explore the interactions between this host and its associated microbiome. A total of 6 transcriptomes were obtained from specimens of L. dendroidea sampled in three different coastal locations of the Rio de Janeiro state. Functional annotations revealed predominantly basic cellular metabolic pathways. Bacteria were the dominant active group in the microbiome of L. dendroidea, with nitrogen-fixing Cyanobacteria and aerobic heterotrophic Proteobacteria standing out. The analysis of the relative contribution of each domain highlighted bacterial features related to glycolysis, lipid and polysaccharide breakdown, and also recognition of the seaweed surface and establishment of biofilm. Eukaryotic transcripts, on the other hand, were associated with photosynthesis, synthesis of carbohydrate reserves, and defense mechanisms, including the biosynthesis of terpenoids through the mevalonate-independent pathway. This work describes the first transcriptomic profile of the red seaweed L. dendroidea, increasing the knowledge about ESTs from the Florideophyceae algal class. Our data suggest an important role for L. dendroidea in the primary production of the holobiont and a role for Bacteria as consumers of organic matter and possibly also as a nitrogen source. Furthermore, this seaweed expressed sequences related to terpene biosynthesis, including the complete mevalonate-independent pathway, which offers new possibilities for biotechnological applications using secondary metabolites from L. dendroidea.
Integrative omics analysis. A study based on Plasmodium falciparum mRNA and protein data.

PubMed

Tomescu, Oana A; Mattanovich, Diethard; Thallinger, Gerhard G

2014-01-01

Technological improvements have shifted the focus from data generation to data analysis. The availability of large amounts of data from transcriptomics, proteomics and metabolomics experiments raises new questions concerning suitable integrative analysis methods. We compare three integrative analysis techniques (co-inertia analysis, generalized singular value decomposition and integrative biclustering) by applying them to gene and protein abundance data from the six life cycle stages of Plasmodium falciparum. Co-inertia analysis (CIA) is an analysis method used to visualize and explore gene and protein data. The generalized singular value decomposition (GSVD) has shown its potential in the analysis of two transcriptome data sets. Integrative biclustering (IBC) applies biclustering to gene and protein data. Using CIA, we visualize the six life cycle stages of Plasmodium falciparum, as well as GO terms in a 2D plane and interpret the spatial configuration. With GSVD, we decompose the transcriptomic and proteomic data sets into matrices with biologically meaningful interpretations and explore the processes captured by the data sets. IBC identifies groups of genes, proteins, GO terms and life cycle stages of Plasmodium falciparum. We show method-specific results as well as a network view of the life cycle stages based on the results common to all three methods. Additionally, by combining the results of the three methods, we create a three-fold validated network of life cycle stage specific GO terms: sporozoites are associated with transcription and transport; merozoites with entry into host cell as well as biosynthetic and metabolic processes; rings with oxidation-reduction processes; trophozoites with glycolysis and energy production; schizonts with antigenic variation and immune response; gametocytes with DNA packaging and mitochondrial transport. Furthermore, the network connectivity underlines the separation of the intraerythrocytic cycle from the gametocyte and sporozoite stages. Using integrative analysis techniques, we can integrate knowledge from different levels and obtain a wider view of the system under study. The overlap between method-specific and common results is considerable, even if the basic mathematical assumptions are very different. The three-fold validated network of life cycle stage characteristics of Plasmodium falciparum could identify a large amount of the known associations from literature in only one study.
Integrative omics analysis. A study based on Plasmodium falciparum mRNA and protein data

PubMed Central

2014-01-01

Background: Technological improvements have shifted the focus from data generation to data analysis. The availability of large amounts of data from transcriptomics, proteomics and metabolomics experiments raises new questions concerning suitable integrative analysis methods. We compare three integrative analysis techniques (co-inertia analysis, generalized singular value decomposition and integrative biclustering) by applying them to gene and protein abundance data from the six life cycle stages of Plasmodium falciparum. Co-inertia analysis (CIA) is an analysis method used to visualize and explore gene and protein data. The generalized singular value decomposition (GSVD) has shown its potential in the analysis of two transcriptome data sets. Integrative biclustering (IBC) applies biclustering to gene and protein data. Results: Using CIA, we visualize the six life cycle stages of Plasmodium falciparum, as well as GO terms in a 2D plane and interpret the spatial configuration. With GSVD, we decompose the transcriptomic and proteomic data sets into matrices with biologically meaningful interpretations and explore the processes captured by the data sets. IBC identifies groups of genes, proteins, GO terms and life cycle stages of Plasmodium falciparum. We show method-specific results as well as a network view of the life cycle stages based on the results common to all three methods. Additionally, by combining the results of the three methods, we create a three-fold validated network of life cycle stage specific GO terms: sporozoites are associated with transcription and transport; merozoites with entry into host cell as well as biosynthetic and metabolic processes; rings with oxidation-reduction processes; trophozoites with glycolysis and energy production; schizonts with antigenic variation and immune response; gametocytes with DNA packaging and mitochondrial transport. Furthermore, the network connectivity underlines the separation of the intraerythrocytic cycle from the gametocyte and sporozoite stages. Conclusion: Using integrative analysis techniques, we can integrate knowledge from different levels and obtain a wider view of the system under study. The overlap between method-specific and common results is considerable, even if the basic mathematical assumptions are very different. The three-fold validated network of life cycle stage characteristics of Plasmodium falciparum could identify a large amount of the known associations from literature in only one study. PMID:25033389
Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

PubMed Central

Seaver, Samuel M. D.; Bradbury, Louis M. T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

2015-01-01

There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable to the analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes. PMID:25806041
Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

DOE PAGES

Seaver, Samuel M.D.; Bradbury, Louis M.T.; Frelin, Océane; ...

2015-03-10

There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable to the analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.
Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

PubMed

Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.
Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

PubMed Central

Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
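The SSR identification step summarized in this record can be illustrated with a short sketch. The abstract does not name the software or the repeat thresholds the authors used, so the function name `find_ssrs` and the minimum repeat counts below are assumptions chosen only for illustration; the sketch finds perfect di- to hexanucleotide repeats with a regular expression.

```python
import re

# Hypothetical minimum repeat counts per motif length (di- to hexanucleotide).
# These thresholds are illustrative assumptions, not the study's settings.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in a sequence."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AA" or "ATAT"), which belong to a shorter class.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Toy sequence (not real EST data): six copies of AAG flanked by other bases.
print(find_ssrs("GGC" + "AAG" * 6 + "TTCA"))
```

Counting hits by motif length over a set of ESTs would give a motif-class breakdown of the kind reported in the abstract, although the exact percentages depend on the thresholds chosen.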
Comparative transcriptomic analysis reveals phenol tolerance mechanism of evolved Chlorella strain.

PubMed

Zhou, Lin; Cheng, Dujia; Wang, Liang; Gao, Juan; Zhao, Quanyu; Wei, Wei; Sun, Yuhan

2017-03-01

The growth of microalgae is inhibited by high concentrations of phenol due to reactive oxygen species. An evolved strain tolerant to 500 mg/L phenol, Chlorella sp. L5, was obtained in a previous study. In this study, comparative transcriptomic analysis was performed for Chlorella sp. L5 and its original strain (Chlorella sp. L3). The tolerance mechanism of Chlorella sp. L5 for high-concentration phenol was explored on a genome scale. Up-regulation of genes related to antioxidant enzymes (SOD, APX, CAT and GR) and carotenoid (astaxanthin, lutein and lycopene) biosynthesis was found to play a critical role in tolerating high-concentration phenol. In addition, most genes of PS I, PS II, the photosynthetic electron transport chain and starch biosynthesis were also up-regulated. This was consistent with the experimental results for total carbohydrate contents of Chlorella sp. L3 and Chlorella sp. L5 under 0 mg/L and 500 mg/L phenol. Copyright © 2016 Elsevier Ltd. All rights reserved.
Comprehensive phylogeny of ray-finned fishes (Actinopterygii) based on transcriptomic and genomic data.

PubMed

Hughes, Lily C; Ortí, Guillermo; Huang, Yu; Sun, Ying; Baldwin, Carole C; Thompson, Andrew W; Arcila, Dahiana; Betancur-R, Ricardo; Li, Chenhong; Becker, Leandro; Bellora, Nicolás; Zhao, Xiaomeng; Li, Xiaofeng; Wang, Min; Fang, Chao; Xie, Bing; Zhou, Zhuocheng; Huang, Hai; Chen, Songlin; Venkatesh, Byrappa; Shi, Qiong

2018-05-14

Our understanding of phylogenetic relationships among bony fishes has been transformed by analysis of a small number of genes, but uncertainty remains around critical nodes. Genome-scale inferences so far have sampled a limited number of taxa and genes. Here we leveraged 144 genomes and 159 transcriptomes to investigate fish evolution with an unparalleled scale of data: >0.5 Mb from 1,105 orthologous exon sequences from 303 species, representing 66 out of 72 ray-finned fish orders. We apply phylogenetic tests designed to trace the effect of whole-genome duplication events on gene trees and find paralogy-free loci using a bioinformatics approach. Genome-wide data support the structure of the fish phylogeny, and hypothesis-testing procedures appropriate for phylogenomic datasets using explicit gene genealogy interrogation settle some long-standing uncertainties, such as the branching order at the base of the teleosts and among early euteleosts, and the sister lineage to the acanthomorph and percomorph radiations. Comprehensive fossil calibrations date the origin of all major fish lineages before the end of the Cretaceous.
Transcriptome sequencing analysis of novel sRNAs of Kineococcus radiotolerans in response to ionizing radiation.

PubMed

Chen, Zhouwei; Li, Lufeng; Shan, Zhan; Huang, Hannian; Chen, Huan; Ding, Xianfeng; Guo, Jiangfeng; Liu, Lili

2016-11-01

Kineococcus radiotolerans is a Gram-positive, radio-resistant bacterium isolated from a radioactive environment. The small noncoding RNAs (sRNAs) in bacteria are reported to play roles in the immediate response to stress and/or the recovery from stress. The analysis of K. radiotolerans transcriptome sequencing results can identify these sRNAs in a genome-wide detection, using RNA sequencing (RNA-seq) by the deep sequencing technique. In this study, the raw data of radiation-exposed samples (RS) and control samples (CS) were acquired separately from the sequencing platform. There were 217 common sRNA candidates in the two samples screened in the genome-wide scale by bioinformatics analysis. There were 43 differentially expressed sRNA candidates, including 28 up-regulated and 15 down-regulated ones. The down-regulated sRNAs were selected for the sRNA target prediction, of which 12 sRNAs that may modulate the genes related to the transcription regulation and DNA repair were considered as the candidates involved in the radio-resistance regulation system. Copyright © 2016 Elsevier GmbH. All rights reserved.

Ultra-low input transcriptomics reveal the spore functional content and phylogenetic affiliations of poorly studied arbuscular mycorrhizal fungi

PubMed Central

Beaudet, Denis; Chen, Eric C H; Mathieu, Stephanie; Yildirir, Gokalp; Ndikumana, Steve; Dalpé, Yolande; Séguin, Sylvie; Farinelli, Laurent; Stajich, Jason E; Corradi, Nicolas

2018-01-01

Abstract Arbuscular mycorrhizal fungi (AMF) are a group of soil microorganisms that establish symbioses with the vast majority of land plants. To date, generation of AMF coding information has been limited to model genera that grow well axenically: Rhizoglomus and Gigaspora. Meanwhile, data on the functional gene repertoire of most AMF families is non-existent. Here, we provide primary large-scale transcriptome data from eight poorly studied AMF species (Acaulospora morrowiae, Diversispora versiforme, Scutellospora calospora, Racocetra castanea, Paraglomus brasilianum, Ambispora leptoticha, Claroideoglomus claroideum and Funneliformis mosseae) using ultra-low input ribonucleic acid (RNA)-seq approaches. Our analyses reveal that quiescent spores of many AMF species harbour a diverse functional diversity and solidify known evolutionary relationships within the group. Our findings demonstrate that RNA-seq data obtained from low-input RNA are reliable in comparison to conventional RNA-seq experiments. Thus, our methodology can potentially be used to deepen our understanding of fungal microbial function and phylogeny using minute amounts of RNA material. PMID:29211832
Construction of a robust microarray from a non-model species (largemouth bass) using pyrosequencing technology

PubMed Central

Garcia-Reyero, Natàlia; Griffitt, Robert J.; Liu, Li; Kroll, Kevin J.; Farmerie, William G.; Barber, David S.; Denslow, Nancy D.

2009-01-01

A novel custom microarray for largemouth bass (Micropterus salmoides) was designed with sequences obtained from a normalized cDNA library using the 454 Life Sciences GS-20 pyrosequencer. This approach yielded in excess of 58 million bases of high-quality sequence. The sequence information was combined with 2,616 reads obtained by traditional suppressive subtractive hybridizations to derive a total of 31,391 unique sequences. Annotation and coding sequences were predicted for these transcripts where possible. 16,350 annotated transcripts were selected as target sequences for the design of the custom largemouth bass oligonucleotide microarray. The microarray was validated by examining the transcriptomic response in male largemouth bass exposed to 17β-œstradiol. Transcriptomic responses were assessed in liver and gonad, and indicated gene expression profiles typical of exposure to œstradiol. The results demonstrate the potential to rapidly create the tools necessary to assess large scale transcriptional responses in non-model species, paving the way for expanded impact of toxicogenomics in ecotoxicology. PMID:19936325
Transcriptome Analysis of Chlorantraniliprole Resistance Development in the Diamondback Moth Plutella xylostella

PubMed Central

Hu, Zhendi; Chen, Huanyu; Yin, Fei; Li, Zhenyu; Dong, Xiaolin; Zhang, Deyong; Ren, Shunxiang; Feng, Xia

2013-01-01

Background The diamondback moth Plutella xylostella has developed a high level of resistance to the latest insecticide chlorantraniliprole. A better understanding of P. xylostella’s resistance mechanism to chlorantraniliprole is needed to develop effective approaches for insecticide resistance management. Principal Findings To provide a comprehensive insight into the resistance mechanisms of P. xylostella to chlorantraniliprole, transcriptome assembly and tag-based digital gene expression (DGE) system were performed using Illumina HiSeq™ 2000. The transcriptome analysis of the susceptible strain (SS) provided 45,231 unigenes (with the size ranging from 200 bp to 13,799 bp), which would be efficient for analyzing the differences in different chlorantraniliprole-resistant P. xylostella strains. DGE analysis indicated that a total of 1215 genes (189 up-regulated and 1026 down-regulated) were gradient differentially expressed among the susceptible strain (SS) and different chlorantraniliprole-resistant P. xylostella strains, including low-level resistance (GXA), moderate resistance (LZA) and high resistance strains (HZA). A detailed analysis of gradient differentially expressed genes elucidated the existence of a phase-dependent divergence of biological investment at the molecular level. The genes related to insecticide resistance, such as P450, GST, the ryanodine receptor, and connectin, had different expression profiles in the different chlorantraniliprole-resistant DGE libraries, suggesting that the genes related to insecticide resistance are involved in P. xylostella resistance development against chlorantraniliprole. To confirm the results from the DGE, the expressional profiles of 4 genes related to insecticide resistance were further validated by qRT-PCR analysis. Conclusions The obtained transcriptome information provides large gene resources available for further studying the resistance development of P. xylostella to pesticides. The DGE data provide comprehensive insights into the gene expression profiles of the different chlorantraniliprole-resistant strains. These genes are specifically related to insecticide resistance, with different expressional profiles facilitating the study of the role of each gene in chlorantraniliprole resistance development. PMID:23977278
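The abstract does not define "gradient differentially expressed". One plausible reading, taken here purely as an assumption, is that a gene's expression changes monotonically across the strains ordered by resistance level (SS, GXA, LZA, HZA). The sketch below classifies genes under that assumed definition; the function names and the toy expression values are hypothetical and are not taken from the study.

```python
def gradient_direction(levels, min_step=0.0):
    """Return "up", "down", or None for expression levels ordered by resistance.

    Assumed definition: "up" if every step along the ordered strains
    increases by more than min_step, "down" if every step decreases by
    more than min_step, otherwise None.
    """
    steps = [later - earlier for earlier, later in zip(levels, levels[1:])]
    if steps and all(step > min_step for step in steps):
        return "up"
    if steps and all(step < -min_step for step in steps):
        return "down"
    return None


def count_gradient_genes(expression_by_gene):
    """Count genes with a monotonic increase or decrease across strains."""
    counts = {"up": 0, "down": 0}
    for levels in expression_by_gene.values():
        direction = gradient_direction(levels)
        if direction is not None:
            counts[direction] += 1
    return counts


# Invented toy values in the order SS, GXA, LZA, HZA (not data from the paper).
toy_expression = {
    "gene_a": [1.0, 2.0, 3.5, 6.0],   # rises with resistance level
    "gene_b": [8.0, 5.0, 2.0, 1.0],   # falls with resistance level
    "gene_c": [3.0, 4.0, 2.0, 5.0],   # no consistent trend
}
print(count_gradient_genes(toy_expression))
```

A real analysis would also require a statistical test on the tag counts before calling a gene differentially expressed; the sketch covers only the direction of the trend.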
A New Omics Data Resource of Pleurocybella porrigens for Gene Discovery

PubMed Central

Dohra, Hideo; Someya, Takumi; Takano, Tomoyuki; Harada, Kiyonori; Omae, Saori; Hirai, Hirofumi; Yano, Kentaro; Kawagishi, Hirokazu

2013-01-01

Background Pleurocybella porrigens is a mushroom-forming fungus, which has been consumed as a traditional food in Japan. In 2004, 55 people were poisoned by eating the mushroom and 17 people among them died of acute encephalopathy. Since then, the Japanese government has been alerting Japanese people to take precautions against eating the P. porrigens mushroom. Unfortunately, despite efforts, the molecular mechanism of the encephalopathy remains elusive. The genome and transcriptome sequence data of P. porrigens and the related species, however, are not stored in the public database. To gain the omics data in P. porrigens, we sequenced genome and transcriptome of its fruiting bodies and mycelia by next generation sequencing. Methodology/Principal Findings Short read sequences of genomic DNAs and mRNAs in P. porrigens were generated by Illumina Genome Analyzer. Genome short reads were de novo assembled into scaffolds using Velvet. Comparisons of genome signatures among Agaricales showed that P. porrigens has a unique genome signature. Transcriptome sequences were assembled into contigs (unigenes). Biological functions of unigenes were predicted by Gene Ontology and KEGG pathway analyses. The majority of unigenes would be novel genes without significant counterparts in the public omics databases. Conclusions Functional analyses of unigenes present the existence of numerous novel genes in the basidiomycetes division. The results mean that the omics information such as genome, transcriptome and metabolome in basidiomycetes is short in the current databases. The large-scale omics information on P. porrigens, provided from this research, will give a new data resource for gene discovery in basidiomycetes. PMID:23936076
Integration of a constraint-based metabolic model of Brassica napus developing seeds with 13C-metabolic flux analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hay, Jordan O.; Shi, Hai; Heinzel, Nicolas

The use of large-scale or genome-scale metabolic reconstructions for modeling and simulation of plant metabolism and integration of those models with large-scale omics and experimental flux data is becoming increasingly important in plant metabolic research. Here we report an updated version of bna572, a bottom-up reconstruction of oilseed rape (Brassica napus L.; Brassicaceae) developing seeds with emphasis on representation of biomass-component biosynthesis. New features include additional seed-relevant pathways for isoprenoid, sterol, phenylpropanoid, flavonoid, and choline biosynthesis. Being now based on standardized data formats and procedures for model reconstruction, bna572+ is available as a COBRA-compliant Systems Biology Markup Language (SBML) model and conforms to the Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM) standards for annotation of external data resources. Bna572+ contains 966 genes, 671 reactions, and 666 metabolites distributed among 11 subcellular compartments. It is referenced to the Arabidopsis thaliana genome, with gene-protein-reaction (GPR) associations resolving subcellular localization. Detailed mass and charge balancing and confidence scoring were applied to all reactions. Using B. napus seed specific transcriptome data, expression was verified for 78% of bna572+ genes and 97% of reactions. Alongside bna572+ we also present a revised carbon centric model for 13C-Metabolic Flux Analysis (13C-MFA) with all its reactions being referenced to bna572+ based on linear projections. By integration of flux ratio constraints obtained from 13C-MFA and by elimination of infinite flux bounds around thermodynamically infeasible loops based on COBRA loopless methods, we demonstrate improvements in predictive power of Flux Variability Analysis (FVA). In conclusion, using this combined approach we characterize the difference in metabolic flux of developing seeds of two B. napus genotypes contrasting in starch and oil content.
Integration of a constraint-based metabolic model of Brassica napus developing seeds with 13C-metabolic flux analysis

DOE PAGES

Hay, Jordan O.; Shi, Hai; Heinzel, Nicolas; ...

2014-12-19

The use of large-scale or genome-scale metabolic reconstructions for modeling and simulation of plant metabolism and integration of those models with large-scale omics and experimental flux data is becoming increasingly important in plant metabolic research. Here we report an updated version of bna572, a bottom-up reconstruction of oilseed rape (Brassica napus L.; Brassicaceae) developing seeds with emphasis on representation of biomass-component biosynthesis. New features include additional seed-relevant pathways for isoprenoid, sterol, phenylpropanoid, flavonoid, and choline biosynthesis. Being now based on standardized data formats and procedures for model reconstruction, bna572+ is available as a COBRA-compliant Systems Biology Markup Language (SBML) model and conforms to the Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM) standards for annotation of external data resources. Bna572+ contains 966 genes, 671 reactions, and 666 metabolites distributed among 11 subcellular compartments. It is referenced to the Arabidopsis thaliana genome, with gene-protein-reaction (GPR) associations resolving subcellular localization. Detailed mass and charge balancing and confidence scoring were applied to all reactions. Using B. napus seed specific transcriptome data, expression was verified for 78% of bna572+ genes and 97% of reactions. Alongside bna572+ we also present a revised carbon centric model for 13C-Metabolic Flux Analysis (13C-MFA) with all its reactions being referenced to bna572+ based on linear projections. By integration of flux ratio constraints obtained from 13C-MFA and by elimination of infinite flux bounds around thermodynamically infeasible loops based on COBRA loopless methods, we demonstrate improvements in predictive power of Flux Variability Analysis (FVA). In conclusion, using this combined approach we characterize the difference in metabolic flux of developing seeds of two B. napus genotypes contrasting in starch and oil content.
The next generation of melanocyte data: Genetic, epigenetic, and transcriptional resource datasets and analysis tools.

PubMed

Loftus, Stacie K

2018-05-01

The number of melanocyte- and melanoma-derived next generation sequence genome-scale datasets has rapidly expanded over the past several years. This resource guide provides a summary of publicly available sources of melanocyte cell derived whole genome, exome, mRNA and miRNA transcriptome, chromatin accessibility and epigenetic datasets. Also highlighted are bioinformatic resources and tools for visualization and data queries which allow researchers a genome-scale view of the melanocyte. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
Octopus-toolkit: a workflow to automate mining of public epigenomic and transcriptomic next-generation sequencing data

PubMed Central

Kim, Taemook; Seo, Hogyu David; Hennighausen, Lothar; Lee, Daeyoup

2018-01-01

Abstract Octopus-toolkit is a stand-alone application for retrieving and processing large sets of next-generation sequencing (NGS) data with a single step. Octopus-toolkit is an automated set-up-and-analysis pipeline utilizing the Aspera, SRA Toolkit, FastQC, Trimmomatic, HISAT2, STAR, Samtools, and HOMER applications. All the applications are installed on the user's computer when the program starts. Upon the installation, it can automatically retrieve original files of various epigenomic and transcriptomic data sets, including ChIP-seq, ATAC-seq, DNase-seq, MeDIP-seq, MNase-seq and RNA-seq, from the gene expression omnibus data repository. The downloaded files can then be sequentially processed to generate BAM and BigWig files, which are used for advanced analyses and visualization. Currently, it can process NGS data from popular model genomes such as, human (Homo sapiens), mouse (Mus musculus), dog (Canis lupus familiaris), plant (Arabidopsis thaliana), zebrafish (Danio rerio), fruit fly (Drosophila melanogaster), worm (Caenorhabditis elegans), and budding yeast (Saccharomyces cerevisiae) genomes. With the processed files from Octopus-toolkit, the meta-analysis of various data sets, motif searches for DNA-binding proteins, and the identification of differentially expressed genes and/or protein-binding sites can be easily conducted with few commands by users. Overall, Octopus-toolkit facilitates the systematic and integrative analysis of available epigenomic and transcriptomic NGS big data. PMID:29420797
De novo comparative transcriptome analysis of genes involved in fruit morphology of pumpkin cultivars with extreme size difference and development of EST-SSR markers.

PubMed

Xanthopoulou, Aliki; Ganopoulos, Ioannis; Psomopoulos, Fotis; Manioudaki, Maria; Moysiadis, Theodoros; Kapazoglou, Aliki; Osathanunkul, Maslin; Michailidou, Sofia; Kalivas, Apostolos; Tsaftaris, Athanasios; Nianiou-Obeidat, Irini; Madesis, Panagiotis

2017-07-30

The genetic basis of fruit size and shape was investigated for the first time in Cucurbita species and genetic loci associated with fruit morphology have been identified. Although extensive genomic resources are available at present for tomato (Solanum lycopersicum), cucumber (Cucumis sativus), melon (Cucumis melo) and watermelon (Citrullus lanatus), genomic databases for Cucurbita species are limited. Recently, our group reported the generation of pumpkin (Cucurbita pepo) transcriptome databases from two contrasting cultivars with extreme fruit sizes. In the current study we used these databases to perform comparative transcriptome analysis in order to identify genes with potential roles in fruit morphology and fruit size. Differential Gene Expression (DGE) analysis between cv. 'Munchkin' (small-fruit) and cv. 'Big Moose' (large-fruit) revealed a variety of candidate genes associated with fruit morphology with significant differences in gene expression between the two cultivars. In addition, we have set the framework for generating EST-SSR markers, which discriminate different C. pepo cultivars and show transferability to related Cucurbitaceae species. The results of the present study will contribute to both further understanding the molecular mechanisms regulating fruit morphology and furthermore identifying the factors that determine fruit size. Moreover, they may lead to the development of molecular marker tools for selecting genotypes with desired morphological traits. Copyright © 2017. Published by Elsevier B.V.
Preliminary profiling of blood transcriptome in a rat model of hemorrhagic shock

PubMed Central

Braga, D; Barcella, M; D’Avila, F; Lupoli, S; Tagliaferri, F; Santamaria, MH; DeLano, FA; Baselli, G; Schmid-Schönbein, GW; Kistler, EB; Aletti, F

2017-01-01

Hemorrhagic shock is a leading cause of morbidity and mortality worldwide. Significant blood loss may lead to decreased blood pressure and inadequate tissue perfusion with resultant organ failure and death, even after replacement of lost blood volume. One reason for this high acuity is that the fundamental mechanisms of shock are poorly understood. Proteomic and metabolomic approaches have been used to investigate the molecular events occurring in hemorrhagic shock but, to our knowledge, a systematic analysis of the transcriptomic profile is missing. Therefore, a pilot analysis using paired-end RNA sequencing was used to identify changes that occur in the blood transcriptome of rats subjected to hemorrhagic shock after blood reinfusion. Hemorrhagic shock was induced using a Wigger’s shock model. The transcriptome of whole blood from shocked animals shows modulation of genes related to inflammation and immune response (Tlr13, Il1b, Ccl6, Lgals3), antioxidant functions (Mt2A, Mt1), tissue injury and repair pathways (Gpnmb, Trim72) and lipid mediators (Alox5ap, Ltb4r, Ptger2) compared with control animals. These findings are congruent with results obtained in hemorrhagic shock analysis by other authors using metabolomics and proteomics. The analysis of blood transcriptome may be a valuable tool to understand the biological changes occurring in hemorrhagic shock and a promising approach for the identification of novel biomarkers and therapeutic targets. Impact statement This study provides the first pilot analysis of the changes occurring in transcriptome expression of whole blood in hemorrhagic shock (HS) rats. We showed that the analysis of blood transcriptome is a useful approach to investigate pathways and functional alterations in this disease condition. This pilot study encourages the possible application of transcriptome analysis in the clinical setting, for the molecular profiling of whole blood in HS patients. PMID:28661205
Integrated analysis, transcriptome-lipidome, reveals the effects of INO-level (INO2 and INO4) on lipid metabolism in yeast.

PubMed

Chumnanpuen, Pramote; Nookaew, Intawat; Nielsen, Jens

2013-10-16

In the yeast Saccharomyces cerevisiae, genes containing UASINO sequences are regulated by the Ino2/Ino4 and Opi1 transcription factors, and this regulation controls lipid biosynthesis. The expression level of INO2 and INO4 genes (INO-level) at different nutrient limited conditions might lead to various responses in yeast lipid metabolism. In this study, we undertook a global study on how INO-levels (transcription level of INO2 and INO4) affect lipid metabolism in yeast and we also studied the effects of single and double deletions of the two INO-genes (deficient effect). Using 2 types of nutrient limitations (carbon and nitrogen) in chemostat cultures operated at a fixed specific growth rate of 0.1 h⁻¹ and strains having different INO-level, we were able to see the effect on expression level of the genes involved in lipid biosynthesis and the fluxes towards the different lipid components. Through combined measurements of the transcriptome, metabolome, and lipidome it was possible to obtain a large dataset that could be used to identify how the INO-level controls lipid metabolism and also establish correlations between the different components. 
Our analysis showed the strength of using a combination of transcriptome and lipidome analysis to illustrate the effect of INO-levels on phospholipid metabolism and based on our analysis we established a global regulatory map.
Sequencing of the needle transcriptome from Norway spruce (Picea abies Karst L.) reveals lower substitution rates, but similar selective constraints in gymnosperms and angiosperms

PubMed Central

2012-01-01

Background A detailed knowledge about spatial and temporal gene expression is important for understanding both the function of genes and their evolution. For the vast majority of species, transcriptomes are still largely uncharacterized and even in those where substantial information is available it is often in the form of partially sequenced transcriptomes. With the development of next generation sequencing, a single experiment can now simultaneously identify the transcribed part of a species genome and estimate levels of gene expression. Results mRNA from actively growing needles of Norway spruce (Picea abies) was sequenced using next generation sequencing technology. In total, close to 70 million fragments with a length of 76 bp were sequenced resulting in 5 Gbp of raw data. A de novo assembly of these reads, together with publicly available expressed sequence tag (EST) data from Norway spruce, was used to create a reference transcriptome. Of the 38,419 PUTs (putative unique transcripts) longer than 150 bp in this reference assembly, 83.5% show similarity to ESTs from other spruce species and of the remaining PUTs, 3,704 show similarity to protein sequences from other plant species, leaving 4,167 PUTs with limited similarity to currently available plant proteins. By predicting coding frames and comparing not only the Norway spruce PUTs, but also PUTs from the close relatives Picea glauca and Picea sitchensis to both Pinus taeda and Taxus mairei, we obtained estimates of synonymous and non-synonymous divergence among conifer species. In addition, we detected close to 15,000 SNPs of high quality and estimated gene expression differences between samples collected under dark and light conditions. Conclusions Our study yielded a large number of single nucleotide polymorphisms as well as estimates of gene expression on transcriptome scale. 
In agreement with a recent study we find that the synonymous substitution rate per year (0.6 × 10⁻⁹ and 1.1 × 10⁻⁹) is an order of magnitude smaller than values reported for angiosperm herbs. However, if one takes generation time into account, most of this difference disappears. The estimates of the dN/dS ratio (non-synonymous over synonymous divergence) reported here are in general much lower than 1 and only a few genes showed a ratio larger than 1. PMID:23122049
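The generation-time argument in the record above can be made concrete with a small calculation. The sketch below multiplies a per-year substitution rate by a generation time to get a per-generation rate. Only the two spruce rates come from the abstract; the generation times and the herb rate are illustrative assumptions chosen to show the arithmetic, not values reported by the study.

```python
def per_generation_rate(rate_per_year, generation_time_years):
    """Convert a per-year substitution rate into a per-generation rate."""
    return rate_per_year * generation_time_years


# Per-year synonymous substitution rates quoted in the abstract.
spruce_rates_per_year = [0.6e-9, 1.1e-9]

# ASSUMPTION: illustrative generation times, not taken from the study.
SPRUCE_GENERATION_YEARS = 20.0
HERB_GENERATION_YEARS = 1.0

# ASSUMPTION: a herb rate about ten times the spruce rate, standing in for
# "an order of magnitude" in the abstract.
herb_rate_per_year = 1.0e-8

spruce_per_generation = [
    per_generation_rate(r, SPRUCE_GENERATION_YEARS) for r in spruce_rates_per_year
]
herb_per_generation = per_generation_rate(herb_rate_per_year, HERB_GENERATION_YEARS)

print("spruce, per generation:", spruce_per_generation)
print("herb, per generation:  ", herb_per_generation)
```

Under these assumed inputs the per-generation figures land within the same order of magnitude, which is the sense in which the per-year gap shrinks once generation time is considered.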
Solar Radiation Stress in Natural Acidophilic Biofilms of Euglena mutabilis Revealed by Metatranscriptomics and PAM Fluorometry.

PubMed

Puente-Sánchez, Fernando; Olsson, Sanna; Gómez-Rodriguez, Manuel; Souza-Egipsy, Virginia; Altamirano-Jeschke, Maria; Amils, Ricardo; Parro, Victor; Aguilera, Angeles

2016-02-01

The daily photosynthetic performance of a natural biofilm of the extreme acidophilic Euglena mutabilis from Río Tinto (SW, Spain) under full solar radiation was analyzed by means of pulse amplitude-modulated (PAM) fluorescence measurements and metatranscriptomic analysis. Natural E. mutabilis biofilms undergo large-scale transcriptomic reprogramming during midday due to dynamic photoinhibition and solar radiation stress. Photoinhibition is due to UV radiation and not to light intensity, as revealed by PAM fluorometry analysis. In order to minimize the negative effects of solar radiation, our data support the presence of a circadian rhythm in this euglenophyte that increases its opportunity to survive. Differential gene expression throughout the day (at 12:00, 20:00 and night) was monitored by massive Illumina parallel sequencing of metatranscriptomic libraries. The transcription pattern was altered in genes involved in Photosystem II stability and repair, UV damaged DNA repair, non-photochemical quenching and oxidative stress, supporting the photoinhibition detected by PAM fluorometry at midday. Copyright © 2016 Elsevier GmbH. All rights reserved.
Metformin-Induced Changes of the Coding Transcriptome and Non-Coding RNAs in the Livers of Non-Alcoholic Fatty Liver Disease Mice.

PubMed

Guo, Jun; Zhou, Yuan; Cheng, Yafen; Fang, Weiwei; Hu, Gang; Wei, Jie; Lin, Yajun; Man, Yong; Guo, Lixin; Sun, Mingxiao; Cui, Qinghua; Li, Jian

2018-01-01

Recent studies have suggested that changes in non-coding RNA play a key role in the progression of non-alcoholic fatty liver disease (NAFLD). Metformin is now recommended and effective for the treatment of NAFLD. We hope the current analyses of the mRNA and non-coding RNA transcriptome will provide a better presentation of the potential roles of mRNAs and long non-coding RNAs (lncRNAs) that underlie NAFLD and metformin intervention. The present study mainly analysed changes in the coding transcriptome and non-coding RNAs after the application of a five-week metformin intervention. Liver samples from three groups of mice were harvested for transcriptome profiling, which covered mRNA, lncRNA, microRNA (miRNA) and circular RNA (circRNA), using a microarray technique. A systematic alleviation of high-fat diet (HFD)-induced transcriptome alterations by metformin was observed. The metformin treatment largely reversed the correlations with diabetes-related pathways. Our analysis also suggested interaction networks between differentially expressed lncRNAs and known hepatic disease genes and interactions between circRNA and their disease-related miRNA partners. Eight HFD-responsive lncRNAs and three metformin-responsive lncRNAs were noted due to their widespread associations with disease genes. Moreover, seven miRNAs that interacted with multiple differentially expressed circRNAs were highlighted because they were likely to be associated with metabolic or liver diseases. The present study identified novel changes in the coding transcriptome and non-coding RNAs in the livers of NAFLD mice after metformin treatment that might shed light on the underlying mechanism by which metformin impedes the progression of NAFLD. © 2018 The Author(s). Published by S. Karger AG, Basel.
Digital transcriptome profiling of normal and glioblastoma-derived neural stem cells identifies genes associated with patient survival

PubMed Central

2012-01-01

Background Glioblastoma multiforme, the most common type of primary brain tumor in adults, is driven by cells with neural stem (NS) cell characteristics. Using derivation methods developed for NS cells, it is possible to expand tumorigenic stem cells continuously in vitro. Although these glioblastoma-derived neural stem (GNS) cells are highly similar to normal NS cells, they harbor mutations typical of gliomas and initiate authentic tumors following orthotopic xenotransplantation. Here, we analyzed GNS and NS cell transcriptomes to identify gene expression alterations underlying the disease phenotype. Methods Sensitive measurements of gene expression were obtained by high-throughput sequencing of transcript tags (Tag-seq) on adherent GNS cell lines from three glioblastoma cases and two normal NS cell lines. Validation by quantitative real-time PCR was performed on 82 differentially expressed genes across a panel of 16 GNS and 6 NS cell lines. The molecular basis and prognostic relevance of expression differences were investigated by genetic characterization of GNS cells and comparison with public data for 867 glioma biopsies. Results Transcriptome analysis revealed major differences correlated with glioma histological grade, and identified misregulated genes of known significance in glioblastoma as well as novel candidates, including genes associated with other malignancies or glioma-related pathways. This analysis further detected several long non-coding RNAs with expression profiles similar to neighboring genes implicated in cancer. Quantitative PCR validation showed excellent agreement with Tag-seq data (median Pearson r = 0.91) and discerned a gene set robustly distinguishing GNS from NS cells across the 22 lines. 
These expression alterations include oncogene and tumor suppressor changes not detected by microarray profiling of tumor tissue samples, and facilitated the identification of a GNS expression signature strongly associated with patient survival (P = 1e-6, Cox model). Conclusions These results support the utility of GNS cell cultures as a model system for studying the molecular processes driving glioblastoma and the use of NS cells as reference controls. The association between a GNS expression signature and survival is consistent with the hypothesis that a cancer stem cell component drives tumor growth. We anticipate that analysis of normal and malignant stem cells will be an important complement to large-scale profiling of primary tumors. PMID:23046790
Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion

Treesearch

Diego Martinez; Jean Challacombe; Ingo Morgenstern; David Hibbett; Monika Schmoll; Christian P. Kubicek; Patricia Ferreira; Francisco J. Ruiz-Duenas; Angel T. Martinez; Philip J. Kersten; Kenneth E. Hammel; Jill A. Gaskell; Daniel Cullen

2009-01-01

Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome,...
Integrated Analysis of Transcriptomic and Proteomic Data

PubMed Central

Haider, Saad; Pal, Ranadip

2013-01-01

Until recently, understanding the regulatory behavior of cells has been pursued through independent analysis of the transcriptome or the proteome. Based on the central dogma, it was generally assumed that there exists a direct correspondence between mRNA transcripts and the resulting protein expression. However, recent studies have shown that the correlation between mRNA and protein expression can be low due to various factors such as different half-lives and post-transcriptional machinery. Thus, a joint analysis of the transcriptomic and proteomic data can provide useful insights that may not be deciphered from individual analysis of mRNA or protein expression. This article reviews the existing major approaches for joint analysis of transcriptomic and proteomic data. We categorize the different approaches into eight main categories based on the initial algorithm and final analysis goal. We further present analogies with other domains and discuss the existing research problems in this area. PMID:24082820
Exploring the Transcriptome of Ciliated Cells Using In Silico Dissection of Human Tissues

PubMed Central

Ivliev, Alexander E.; 't Hoen, Peter A. C.; van Roon-Mom, Willeke M. C.; Peters, Dorien J. M.; Sergeeva, Marina G.

2012-01-01

Cilia are cell organelles that play important roles in cell motility, sensory and developmental functions and are involved in a range of human diseases, known as ciliopathies. Here, we search for novel human genes related to cilia using a strategy that exploits the previously reported tendency of cell type-specific genes to be coexpressed in the transcriptome of complex tissues. Gene coexpression networks were constructed using the noise-resistant WGCNA algorithm in 12 publicly available microarray datasets from human tissues rich in motile cilia: airways, fallopian tubes and brain. A cilia-related coexpression module was detected in 10 out of the 12 datasets. A consensus analysis of this module's gene composition recapitulated 297 known and predicted 74 novel cilia-related genes. 82% of the novel candidates were supported by tissue-specificity expression data from GEO and/or proteomic data from the Human Protein Atlas. The novel findings included a set of genes (DCDC2, DYX1C1, KIAA0319) related to a neurological disease dyslexia suggesting their potential involvement in ciliary functions. Furthermore, we searched for differences in gene composition of the ciliary module between the tissues. A multidrug-and-toxin extrusion transporter MATE2 (SLC47A2) was found as a brain-specific central gene in the ciliary module. We confirm the localization of MATE2 in cilia by immunofluorescence staining using MDCK cells as a model. While MATE2 has previously gained attention as a pharmacologically relevant transporter, its potential relation to cilia is suggested for the first time. Taken together, our large-scale analysis of gene coexpression networks identifies novel genes related to human cell cilia. PMID:22558177
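The coexpression step in the record above can be illustrated with the soft-thresholding idea behind unsigned WGCNA networks: the adjacency between two genes is the absolute Pearson correlation of their expression profiles raised to a power β. The sketch below is a minimal, standard-library-only illustration of that one step. It is not the authors' pipeline and does not reproduce the WGCNA R package, which goes on to compute topological overlap and cluster genes into modules. The gene names, the expression values and the choice β = 6 are assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: |cor(gene_i, gene_j)| ** beta.

    expr maps a gene name to its expression values across samples.
    Returns a dict keyed by (gene_i, gene_j) for each unordered gene pair.
    """
    genes = sorted(expr)
    adjacency = {}
    for i, gene_i in enumerate(genes):
        for gene_j in genes[i + 1:]:
            r = pearson(expr[gene_i], expr[gene_j])
            adjacency[(gene_i, gene_j)] = abs(r) ** beta
    return adjacency


# Hypothetical toy data: geneA and geneB rise together, geneC does not.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [3.0, 1.0, 4.0, 1.0],
}

for pair, weight in soft_threshold_adjacency(toy_expression).items():
    print(pair, round(weight, 4))
```

Raising the correlation to a power pushes weak correlations toward zero while keeping strong ones close to one, which is why the approach is described as resistant to noise.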
The transcriptome of Nacobbus aberrans reveals insights into the evolution of sedentary endoparasitism in plant-parasitic nematodes.

PubMed

Eves-van den Akker, Sebastian; Lilley, Catherine J; Danchin, Etienne G J; Rancurel, Corinne; Cock, Peter J A; Urwin, Peter E; Jones, John T

2014-08-13

Within the phylum Nematoda, plant-parasitism is hypothesized to have arisen independently on at least four occasions. The most economically damaging plant-parasitic nematode species, and consequently the most widely studied, are those that feed as they migrate destructively through host roots causing necrotic lesions (migratory endoparasites) and those that modify host root tissue to create a nutrient sink from which they feed (sedentary endoparasites). The false root-knot nematode Nacobbus aberrans is the only known species to have both migratory endoparasitic and sedentary endoparasitic stages within its life cycle. Moreover, its sedentary stage appears to have characteristics of both the root-knot and the cyst nematodes. We present the first large-scale genetic resource of any false root-knot nematode species. We use RNAseq to describe relative abundance changes in all expressed genes across the life cycle to provide interesting insights into the biology of this nematode as it transitions between modes of parasitism. A multigene phylogenetic analysis of N. aberrans with respect to plant-parasitic nematodes of all groups confirms its proximity to both cyst and root-knot nematodes. We present a transcriptome-wide analysis of both lateral gene transfer events and the effector complement. Comparing parasitism genes of typical root-knot and cyst nematodes to those of N. aberrans has revealed interesting similarities. Importantly, genes that were believed to be either cyst nematode, or root-knot nematode, "specific" have both been identified in N. aberrans. Our results provide insights into the characteristics of a common ancestor and the evolution of sedentary endoparasitism of plants by nematodes. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Analysis of the Nicotiana tabacum Stigma/Style Transcriptome Reveals Gene Expression Differences between Wet and Dry Stigma Species

PubMed Central

Quiapim, Andréa C.; Brito, Michael S.; Bernardes, Luciano A.S.; daSilva, Idalete; Malavazi, Iran; DePaoli, Henrique C.; Molfetta-Machado, Jeanne B.; Giuliatti, Silvana; Goldman, Gustavo H.; Goldman, Maria Helena S.

2009-01-01

The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled in 6,177 clusters, from which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis unraveled tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process. PMID:19052150

Transcriptomic and metabolomic analysis of copper stress acclimation in Ectocarpus siliculosus highlights signaling and tolerance mechanisms in brown algae

PubMed Central

2014-01-01

Background Brown algae are sessile macro-organisms of great ecological relevance in coastal ecosystems. They evolved independently from land plants and other multicellular lineages, and therefore hold several original ontogenic and metabolic features. Most brown algae grow along the coastal zone where they face frequent environmental changes, including exposure to toxic levels of heavy metals such as copper (Cu). Results We carried out large-scale transcriptomic and metabolomic analyses to decipher the short-term acclimation of the brown algal model E. siliculosus to Cu stress, and compared these data to results known for other abiotic stressors. This comparison demonstrates that Cu induces oxidative stress in E. siliculosus as illustrated by the transcriptomic overlap between Cu and H2O2 treatments. The common response to Cu and H2O2 consisted in the activation of the oxylipin and the repression of inositol signaling pathways, together with the regulation of genes coding for several transcription-associated proteins. Concomitantly, Cu stress specifically activated a set of genes coding for orthologs of ABC transporters, a P1B-type ATPase, ROS detoxification systems such as a vanadium-dependent bromoperoxidase, and induced an increase of free fatty acid contents. Finally we observed, as a common abiotic stress mechanism, the activation of autophagic processes on one hand and the repression of genes involved in nitrogen assimilation on the other hand. Conclusions Comparisons with data from green plants indicate that some processes involved in Cu and oxidative stress response are conserved across these two distant lineages. At the same time the high number of yet uncharacterized brown alga-specific genes induced in response to copper stress underlines the potential to discover new components and molecular interactions unique to these organisms. 
Of particular interest for future research is the potential cross-talk between reactive oxygen species (ROS)-, myo-inositol-, and oxylipin signaling. PMID:24885189
Transcriptome Analysis and Comparison of Marmota monax and Marmota himalayana.

PubMed

Liu, Yanan; Wang, Baoju; Wang, Lu; Vikash, Vikash; Wang, Qin; Roggendorf, Michael; Lu, Mengji; Yang, Dongliang; Liu, Jia

2016-01-01

The Eastern woodchuck (Marmota monax) is a classical animal model for studying hepatitis B virus (HBV) infection and hepatocellular carcinoma (HCC) in humans. Recently, we found that Marmota himalayana, an Asian animal species closely related to Marmota monax, is susceptible to woodchuck hepatitis virus (WHV) infection and can be used as a new mammalian model for HBV infection. However, the lack of genomic sequence information for both Marmota models has strongly limited the breadth and depth of their application. To address this major obstacle of the Marmota models, we utilized Illumina RNA-Seq technology to sequence the cDNA libraries of liver and spleen samples of two Marmota monax and four Marmota himalayana. In total, over 13 billion nucleotide bases were sequenced and approximately 1.5 billion clean reads were obtained. Following assembly, 106,496 consensus sequences of Marmota monax and 78,483 consensus sequences of Marmota himalayana were detected. For functional annotation, in total 73,603 Unigenes of Marmota monax and 78,483 Unigenes of Marmota himalayana were identified using different databases (NR, NT, Swiss-Prot, KEGG, COG, GO). The Unigenes were aligned by blastx to protein databases to determine the coding DNA sequences (CDS) and in total 41,247 CDS of Marmota monax and 34,033 CDS of Marmota himalayana were predicted. The single nucleotide polymorphisms (SNPs) and the simple sequence repeats (SSRs) were also analyzed for all Unigenes obtained. Moreover, a large-scale transcriptome comparison was performed and revealed a high similarity in transcriptome sequences between the two Marmota species. Our study provides an extensive amount of novel sequence information for Marmota monax and Marmota himalayana. 
This information may serve as a valuable genomics resource for further molecular, developmental and comparative evolutionary studies, as well as for the identification and characterization of functional genes that are involved in WHV infection and HCC development in the woodchuck model.
Transcriptome Analysis and Comparison of Marmota monax and Marmota himalayana

PubMed Central

Wang, Lu; Vikash, Vikash; Wang, Qin; Roggendorf, Michael; Lu, Mengji; Yang, Dongliang; Liu, Jia

2016-01-01

The Eastern woodchuck (Marmota monax) is a classical animal model for studying hepatitis B virus (HBV) infection and hepatocellular carcinoma (HCC) in humans. Recently, we found that Marmota himalayana, an Asian animal species closely related to Marmota monax, is susceptible to woodchuck hepatitis virus (WHV) infection and can be used as a new mammalian model for HBV infection. However, the lack of genomic sequence information for both Marmota models has strongly limited the breadth and depth of their application. To address this major obstacle of the Marmota models, we utilized Illumina RNA-Seq technology to sequence the cDNA libraries of liver and spleen samples of two Marmota monax and four Marmota himalayana. In total, over 13 billion nucleotide bases were sequenced and approximately 1.5 billion clean reads were obtained. Following assembly, 106,496 consensus sequences of Marmota monax and 78,483 consensus sequences of Marmota himalayana were detected. For functional annotation, in total 73,603 Unigenes of Marmota monax and 78,483 Unigenes of Marmota himalayana were identified using different databases (NR, NT, Swiss-Prot, KEGG, COG, GO). The Unigenes were aligned by blastx to protein databases to determine the coding DNA sequences (CDS) and in total 41,247 CDS of Marmota monax and 34,033 CDS of Marmota himalayana were predicted. The single nucleotide polymorphisms (SNPs) and the simple sequence repeats (SSRs) were also analyzed for all Unigenes obtained. Moreover, a large-scale transcriptome comparison was performed and revealed a high similarity in transcriptome sequences between the two Marmota species. Our study provides an extensive amount of novel sequence information for Marmota monax and Marmota himalayana. 
This information may serve as a valuable genomics resource for further molecular, developmental and comparative evolutionary studies, as well as for the identification and characterization of functional genes that are involved in WHV infection and HCC development in the woodchuck model. PMID:27806133
Global Transcriptome Analysis of the Tentacle of the Jellyfish Cyanea capillata Using Deep Sequencing and Expressed Sequence Tags: Insight into the Toxin- and Degenerative Disease-Related Transcripts

PubMed Central

Liu, Dan; Wang, Qianqian; Ruan, Zengliang; He, Qian; Zhang, Liming

2015-01-01

Background Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods. Results We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, but the remaining unigenes had no significant matches due to the limited sequence information available and species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, we screened many transcripts encoding proteins similar to several well-known toxin families including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts also resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington’s, Alzheimer’s and Parkinson’s diseases. This is the first description of degenerative disease-associated genes in jellyfish. Conclusion We obtained a well-categorized and annotated transcriptome of C. 
capillata tentacle that will be an important and valuable resource for further understanding of jellyfish at the molecular level and information on the underlying molecular mechanisms of jellyfish stinging. The findings of this study may also be used in comparative studies of gene expression profiling among different jellyfish species. PMID:26551022
Global Transcriptome Analysis of the Tentacle of the Jellyfish Cyanea capillata Using Deep Sequencing and Expressed Sequence Tags: Insight into the Toxin- and Degenerative Disease-Related Transcripts.

PubMed

Liu, Guoyan; Zhou, Yonghong; Liu, Dan; Wang, Qianqian; Ruan, Zengliang; He, Qian; Zhang, Liming

2015-01-01

Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods. We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, but the remaining unigenes had no significant matches due to the limited sequence information available and species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, we screened many transcripts encoding proteins similar to several well-known toxin families including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts also resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington's, Alzheimer's and Parkinson's diseases. This is the first description of degenerative disease-associated genes in jellyfish. We obtained a well-categorized and annotated transcriptome of C. 
capillata tentacle that will be an important and valuable resource for further understanding of jellyfish at the molecular level and information on the underlying molecular mechanisms of jellyfish stinging. The findings of this study may also be used in comparative studies of gene expression profiling among different jellyfish species.
Identification of olfactory receptor genes in the Japanese grenadier anchovy Coilia nasus.

PubMed

Zhu, Guoli; Wang, Liangjiang; Tang, Wenqiao; Wang, Xiaomei; Wang, Cong

2017-01-01

Olfaction is essential for fish to detect odorant elements in the environment and plays a critical role in navigating, locating food and detecting predators. Olfactory function is produced by the olfactory transduction pathway and is activated by olfactory receptors (ORs) through the binding of odorant elements. Recently, four types of olfactory receptors have been identified in vertebrate olfactory epithelium, including main odorant receptors (MORs), vomeronasal type receptors (VRs), trace-amine associated receptors (TAARs) and formyl peptide receptors (FPRs). It has been hypothesized that migratory fish, which have the ability to perform spawning migration, use olfactory cues to return to natal rivers. Therefore, obtaining OR genes from migratory fish will provide a resource for the study of molecular mechanisms that underlie fish spawning migration behaviors. Previous studies of OR genes have mainly focused on genomic data; however, little information has been gained at the transcript level. In this study, we identified the OR genes of an economically important commercial fish, Coilia nasus, by searching olfactory epithelium transcriptomes. A total of 142 candidate MOR, 52 V2R/OlfC, 32 TAAR and two FPR putative genes were identified. In addition, through genomic analysis we identified several MOR genes containing introns, which is unusual for vertebrate MOR genes. The transcriptome-scale mining strategy proved to be fruitful in identifying large sets of OR genes from species whose genome information is unavailable. Our findings lay the foundation for further research into the possible molecular mechanisms underlying the spawning migration behavior in C. nasus.
CLIP-seq analysis of multi-mapped reads discovers novel functional RNA regulatory sites in the human transcriptome.

PubMed

Zhang, Zijun; Xing, Yi

2017-09-19

Crosslinking or RNA immunoprecipitation followed by sequencing (CLIP-seq or RIP-seq) allows transcriptome-wide discovery of RNA regulatory sites. As CLIP-seq/RIP-seq reads are short, existing computational tools focus on uniquely mapped reads, while reads mapped to multiple loci are discarded. We present CLAM (CLIP-seq Analysis of Multi-mapped reads). CLAM uses an expectation-maximization algorithm to assign multi-mapped reads and calls peaks combining uniquely and multi-mapped reads. To demonstrate the utility of CLAM, we applied it to a wide range of public CLIP-seq/RIP-seq datasets involving numerous splicing factors, microRNAs and m6A RNA methylation. CLAM recovered a large number of novel RNA regulatory sites inaccessible by uniquely mapped reads. The functional significance of these sites was demonstrated by consensus motif patterns and association with alternative splicing (splicing factors), transcript abundance (AGO2) and mRNA half-life (m6A). CLAM provides a useful tool to discover novel protein-RNA interactions and RNA modification sites from CLIP-seq and RIP-seq data, and reveals the significant contribution of repetitive elements to the RNA regulatory landscape of the human transcriptome. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
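The expectation-maximization idea named in the abstract above can be illustrated with a toy sketch. This is not CLAM's implementation: the function name `em_assign`, the input layout (each read given as a list of candidate locus names) and the fixed iteration count are assumptions made for illustration only. Reads that map to several loci are split fractionally in proportion to the current locus abundances, and the abundances are then re-estimated from those fractions.

```python
from collections import defaultdict

def em_assign(reads, n_iter=50):
    """Toy EM for multi-mapped reads (hypothetical layout, not CLAM's API).

    reads: list of lists; each inner list holds the loci one read maps to.
    Returns (abundance per locus, fractional assignment per read).
    """
    loci = sorted({locus for candidates in reads for locus in candidates})
    # Start from a uniform abundance estimate over all loci.
    abundance = {locus: 1.0 / len(loci) for locus in loci}
    weights = []
    for _ in range(n_iter):
        # E-step: split each read across its candidate loci in
        # proportion to the current abundance estimates.
        weights = []
        for candidates in reads:
            total = sum(abundance[locus] for locus in candidates)
            weights.append({locus: abundance[locus] / total for locus in candidates})
        # M-step: re-estimate abundance from the fractional assignments.
        counts = defaultdict(float)
        for read_weights in weights:
            for locus, fraction in read_weights.items():
                counts[locus] += fraction
        abundance = {locus: counts[locus] / len(reads) for locus in loci}
    return abundance, weights

# Three reads unique to locus A, one unique to B, one shared by both.
abundance, weights = em_assign([["A"], ["A"], ["A"], ["B"], ["A", "B"]])
```

In this toy input the shared read ends up assigned mostly to locus A, because the uniquely mapped reads already favour A. A real tool would also model read position and call peaks afterwards, which this sketch leaves out.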
The head-regeneration transcriptome of the planarian Schmidtea mediterranea.

PubMed

Sandmann, Thomas; Vogg, Matthias C; Owlarn, Suthira; Boutros, Michael; Bartscherer, Kerstin

2011-08-16

Planarian flatworms can regenerate their head, including a functional brain, within less than a week. Despite the enormous potential of these animals for medical research and regenerative medicine, the mechanisms of regeneration and the molecules involved remain largely unknown. To identify genes that are differentially expressed during early stages of planarian head regeneration, we generated a de novo transcriptome assembly from more than 300 million paired-end reads from planarian fragments regenerating the head at 16 different time points. The assembly yielded 26,018 putative transcripts, including very long transcripts spanning multiple genomic supercontigs, and thousands of isoforms. Using short-read data from two platforms, we analyzed dynamic gene regulation during the first three days of head regeneration. We identified at least five different temporal synexpression classes, including genes specifically induced within a few hours after injury. Furthermore, we characterized the role of a conserved Runx transcription factor, smed-runt-like1. RNA interference (RNAi) knockdown and immunofluorescence analysis of the regenerating visual system indicated that smed-runt-like1 encodes a transcriptional regulator of eye morphology and photoreceptor patterning. Transcriptome sequencing of short reads allowed for the simultaneous de novo assembly and differential expression analysis of transcripts, demonstrating highly dynamic regulation during head regeneration in planarians.
Redefining metamorphosis in spiny lobsters: molecular analysis of the phyllosoma to puerulus transition in Sagmariasus verreauxi

PubMed Central

Ventura, Tomer; Fitzgibbon, Quinn P.; Battaglene, Stephen C.; Elizur, Abigail

2015-01-01

The molecular understanding of crustacean metamorphosis is hindered by small sized individuals and inability to accurately define molt stages. We used the spiny lobster Sagmariasus verreauxi where the large, transparent larvae enable accurate tracing of the transition from a leaf-shaped phyllosoma to an intermediate larval-juvenile phase (puerulus). Transcriptomic analysis of larvae at well-defined stages prior to, during, and following this transition show that the phyllosoma-puerulus metamorphic transition is accompanied by vast transcriptomic changes exceeding 25% of the transcriptome. Notably, genes previously identified as regulating metamorphosis in other crustaceans do not fluctuate during this transition but in the later, morphologically-subtle puerulus-juvenile transition, indicating that the dramatic phyllosoma-puerulus morphological shift relies on a different, yet to be identified metamorphic mechanism. We examined the change in expression of domains and gene families, with focus on several key genes. Our research implies that the separation in molecular triggering systems between the phyllosoma-puerulus and puerulus-juvenile transitions might have enabled the extension of the oceanic phase in spiny lobsters. Study of similar transitions, where metamorphosis is uncoupled from the transition into the benthic juvenile form, in other commercially important crustacean groups might show common features to point on the evolutionary advantage of this two staged regulation. PMID:26311524
The head-regeneration transcriptome of the planarian Schmidtea mediterranea

PubMed Central

2011-01-01

Background Planarian flatworms can regenerate their head, including a functional brain, within less than a week. Despite the enormous potential of these animals for medical research and regenerative medicine, the mechanisms of regeneration and the molecules involved remain largely unknown. Results To identify genes that are differentially expressed during early stages of planarian head regeneration, we generated a de novo transcriptome assembly from more than 300 million paired-end reads from planarian fragments regenerating the head at 16 different time points. The assembly yielded 26,018 putative transcripts, including very long transcripts spanning multiple genomic supercontigs, and thousands of isoforms. Using short-read data from two platforms, we analyzed dynamic gene regulation during the first three days of head regeneration. We identified at least five different temporal synexpression classes, including genes specifically induced within a few hours after injury. Furthermore, we characterized the role of a conserved Runx transcription factor, smed-runt-like1. RNA interference (RNAi) knockdown and immunofluorescence analysis of the regenerating visual system indicated that smed-runt-like1 encodes a transcriptional regulator of eye morphology and photoreceptor patterning. Conclusions Transcriptome sequencing of short reads allowed for the simultaneous de novo assembly and differential expression analysis of transcripts, demonstrating highly dynamic regulation during head regeneration in planarians. PMID:21846378
Redefining metamorphosis in spiny lobsters: molecular analysis of the phyllosoma to puerulus transition in Sagmariasus verreauxi.

PubMed

Ventura, Tomer; Fitzgibbon, Quinn P; Battaglene, Stephen C; Elizur, Abigail

2015-08-27

The molecular understanding of crustacean metamorphosis is hindered by small sized individuals and inability to accurately define molt stages. We used the spiny lobster Sagmariasus verreauxi where the large, transparent larvae enable accurate tracing of the transition from a leaf-shaped phyllosoma to an intermediate larval-juvenile phase (puerulus). Transcriptomic analysis of larvae at well-defined stages prior to, during, and following this transition show that the phyllosoma-puerulus metamorphic transition is accompanied by vast transcriptomic changes exceeding 25% of the transcriptome. Notably, genes previously identified as regulating metamorphosis in other crustaceans do not fluctuate during this transition but in the later, morphologically-subtle puerulus-juvenile transition, indicating that the dramatic phyllosoma-puerulus morphological shift relies on a different, yet to be identified metamorphic mechanism. We examined the change in expression of domains and gene families, with focus on several key genes. Our research implies that the separation in molecular triggering systems between the phyllosoma-puerulus and puerulus-juvenile transitions might have enabled the extension of the oceanic phase in spiny lobsters. Study of similar transitions, where metamorphosis is uncoupled from the transition into the benthic juvenile form, in other commercially important crustacean groups might show common features to point on the evolutionary advantage of this two staged regulation.
Transcriptome analysis of Haloquadratum walsbyi: vanity is but the surface.

PubMed

Bolhuis, Henk; Martín-Cuadrado, Ana Belén; Rosselli, Riccardo; Pašić, Lejla; Rodriguez-Valera, Francisco

2017-07-03

Haloquadratum walsbyi dominates saturated thalassic lakes worldwide where they can constitute up to 80-90% of the total prokaryotic community. Despite the abundance of the enigmatic square-flattened cells, only 7 isolates are currently known with 2 genomes fully sequenced and annotated due to difficulties to grow them under laboratory conditions. We have performed a transcriptomic analysis of one of these isolates, the Spanish strain HBSQ001 in order to investigate gene transcription under light and dark conditions. Despite a potential advantage for light as additional source of energy, no significant differences were found between light and dark expressed genes. Constitutive high gene expression was observed in genes encoding surface glycoproteins, light mediated proton pumping by bacteriorhodopsin, several nutrient uptake systems, buoyancy and storage of excess carbon. Two low expressed regions of the genome were characterized by a lower codon adaptation index, low GC content and high incidence of hypothetical genes. Under the extant cultivation conditions, the square hyperhalophile devoted most of its transcriptome towards processes maintaining cell integrity and exploiting solar energy. Surface glycoproteins are essential for maintaining the large surface to volume ratio that facilitates light and organic nutrient harvesting whereas constitutive expression of bacteriorhodopsin warrants an immediate source of energy when light becomes available.
Insulin immuno-neutralization in fed chickens: effects on liver and muscle transcriptome.

PubMed

Simon, Jean; Milenkovic, Dragan; Godet, Estelle; Cabau, Cedric; Collin, Anne; Métayer-Coustard, Sonia; Rideau, Nicole; Tesseraud, Sophie; Derouet, Michel; Crochet, Sabine; Cailleau-Audouin, Estelle; Hennequet-Antier, Christelle; Gespach, Christian; Porter, Tom E; Duclos, Michel J; Dupont, Joëlle; Cogburn, Larry A

2012-03-01

Chickens mimic an insulin-resistance state by exhibiting several peculiarities with regard to plasma glucose level and its control by insulin. To gain insight into the role of insulin in the control of chicken transcriptome, liver and leg muscle transcriptomes were compared in fed controls and "diabetic" chickens, at 5 h after insulin immuno-neutralization, using 20.7K-chicken oligo-microarrays. At a level of false discovery rate <0.01, 1,573 and 1,225 signals were significantly modified by insulin privation in liver and muscle, respectively. Microarray data agreed reasonably well with qRT-PCR and some protein level measurements. Differentially expressed mRNAs with human ID were classified using Biorag analysis and Ingenuity Pathway Analysis. Multiple metabolic pathways, structural proteins, transporters and proteins of intracellular trafficking, major signaling pathways, and elements of the transcriptional control machinery were largely represented in both tissues. At least 42 mRNAs have already been associated with diabetes, insulin resistance, obesity, energy expenditure, or identified as sensors of metabolism in mice or humans. The contribution of the pathways presently identified to chicken physiology (particularly those not yet related to insulin) needs to be evaluated in future studies. Other challenges include the characterization of "unknown" mRNAs and the identification of the steps or networks, which disturbed tissue transcriptome so extensively, quickly after the turning off of the insulin signal. In conclusion, pleiotropic effects of insulin in chickens are further evidenced; major pathways controlled by insulin in mammals have been conserved despite the presence of unique features of insulin signaling in chicken muscle.
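The abstract above reports a false discovery rate threshold of <0.01 without naming the procedure used. As a hedged illustration, the sketch below implements the widely used Benjamini-Hochberg adjustment; assuming that this is the procedure behind the reported numbers would be a guess. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values for five probes; count those passing FDR < 0.01.
qvalues = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6])
n_significant = sum(q < 0.01 for q in qvalues)
```

With these made-up inputs only the first probe passes the 0.01 threshold, since its adjusted value is 0.005 while the second rises to 0.02.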
Optimizing Hybrid de Novo Transcriptome Assembly and Extending Genomic Resources for Giant Freshwater Prawns (Macrobrachium rosenbergii): The Identification of Genes and Markers Associated with Reproduction.

PubMed

Jung, Hyungtaek; Yoon, Byung-Ha; Kim, Woo-Jin; Kim, Dong-Wook; Hurwood, David A; Lyons, Russell E; Salin, Krishna R; Kim, Heui-Soo; Baek, Ilseon; Chand, Vincent; Mather, Peter B

2016-05-07

The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean is currently the world's most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium.
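Summary figures such as the contig count and mean contig length quoted above are simple to compute from a list of contig lengths. The sketch below is illustrative only: the function name `contig_stats` is hypothetical, and N50 is included as a common companion metric even though the abstract does not report it.

```python
def contig_stats(lengths):
    """Return (count, mean length, N50) for contig lengths given in bp."""
    if not lengths:
        raise ValueError("need at least one contig")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = ordered[-1]
    # N50: length of the contig at which the cumulative length,
    # counted from the longest contig down, reaches half the total.
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return len(ordered), total / len(ordered), n50

# Four made-up contigs totalling 1000 bp.
count, mean_length, n50 = contig_stats([100, 200, 300, 400])
```

Comparing these values across assemblies built with different k-mer settings is one simple way to judge how the k-mer choice affects contiguity, although it says nothing about correctness.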
Optimizing Hybrid de Novo Transcriptome Assembly and Extending Genomic Resources for Giant Freshwater Prawns (Macrobrachium rosenbergii): The Identification of Genes and Markers Associated with Reproduction

PubMed Central

Jung, Hyungtaek; Yoon, Byung-Ha; Kim, Woo-Jin; Kim, Dong-Wook; Hurwood, David A.; Lyons, Russell E.; Salin, Krishna R.; Kim, Heui-Soo; Baek, Ilseon; Chand, Vincent; Mather, Peter B.

2016-01-01

The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean is currently the world’s most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium. PMID:27164098
In silico lineage tracing through single cell transcriptomics identifies a neural stem cell population in planarians.

PubMed

Molinaro, Alyssa M; Pearson, Bret J

2016-04-27

The planarian Schmidtea mediterranea is a master regenerator with a large adult stem cell compartment. The lack of transgenic labeling techniques in this animal has hindered the study of lineage progression and has made understanding the mechanisms of tissue regeneration a challenge. However, recent advances in single-cell transcriptomics and analysis methods allow for the discovery of novel cell lineages as differentiation progresses from stem cell to terminally differentiated cell. Here we apply pseudotime analysis and single-cell transcriptomics to identify adult stem cells belonging to specific cellular lineages and identify novel candidate genes for future in vivo lineage studies. We purify 168 single stem and progeny cells from the planarian head, which were subjected to single-cell RNA sequencing (scRNAseq). Pseudotime analysis with Waterfall and gene set enrichment analysis predicts a molecularly distinct neoblast sub-population with neural character (νNeoblasts) as well as a novel alternative lineage. Using the predicted νNeoblast markers, we demonstrate that a novel proliferative stem cell population exists adjacent to the brain. scRNAseq coupled with in silico lineage analysis offers a new approach for studying lineage progression in planarians. The lineages identified here are extracted from a highly heterogeneous dataset with minimal prior knowledge of planarian lineages, demonstrating that lineage purification by transgenic labeling is not a prerequisite for this approach. The identification of the νNeoblast lineage demonstrates the usefulness of the planarian system for computationally predicting cellular lineages in an adult context coupled with in vivo verification.
Genome and Transcriptome Analyses Provide Insight into the Euryhaline Adaptation Mechanism of Crassostrea gigas

PubMed Central

Zhang, Linlin; Li, Chunyan; Li, Li; She, Zhicai; Huang, Baoyu; Zhang, Guofan

2013-01-01

Background The Pacific oyster, Crassostrea gigas, has developed special mechanisms to regulate its osmotic balance to adapt to fluctuations of salinities in coastal zones. To understand the oyster’s euryhaline adaptation, we analyzed salt stress effectors metabolism pathways under different salinities (salt 5, 10, 15, 20, 25, 30 and 40 for 7 days) using transcriptome data, physiology experiment and quantitative real-time PCR. Results Transcriptome data uncovered 189, 480, 207 and 80 marker genes for monitoring physiology status of oysters and the environment conditions. Three known salt stress effectors (involving ion channels, aquaporins and free amino acids) were examined. The analysis of ion channels and aquaporins indicated that 7 days long-term salt stress inhibited voltage-gated Na+/K+ channel and aquaporin but increased calcium-activated K+ channel and Ca2+ channel. As the most important category of osmotic stress effector, we analyzed the oyster FAAs metabolism pathways (including taurine, glycine, alanine, beta-alanine, proline and arginine) and explained FAAs functional mechanism for oyster low salinity adaptation. FAAs metabolism key enzyme genes displayed expression differentiation in low salinity adapted individuals comparing with control which further indicated that FAAs played important roles for oyster salinity adaptation. A global metabolic pathway analysis (iPath) of oyster expanded genes displayed a co-expansion of FAAs metabolism in C. gigas compared with seven other species, suggesting oyster’s powerful ability regarding FAAs metabolism, allowing it to adapt to fluctuating salinities, which may be one important mechanism underlying euryhaline adaption in oyster. Additionally, using transcriptome data analysis, we uncovered salt stress transduction networks in C. gigas. 
Conclusions Our results represented oyster salt stress effectors functional mechanisms under salt stress conditions and explained the expansion of FAAs metabolism pathways as the most important effectors for oyster euryhaline adaptation. This study was the first to explain oyster euryhaline adaptation at a genome-wide scale in C. gigas. PMID:23554902
Astronomical algorithms for automated analysis of tissue protein expression in breast cancer

PubMed Central

Ali, H R; Irwin, M; Morris, L; Dawson, S-J; Blows, F M; Provenzano, E; Mahler-Araujo, B; Pharoah, P D; Walton, N A; Brenton, J D; Caldas, C

2013-01-01

Background: High-throughput evaluation of tissue biomarkers in oncology has been greatly accelerated by the widespread use of tissue microarrays (TMAs) and immunohistochemistry. Although TMAs have the potential to facilitate protein expression profiling on a scale to rival experiments of tumour transcriptomes, the bottleneck and imprecision of manually scoring TMAs has impeded progress. Methods: We report image analysis algorithms adapted from astronomy for the precise automated analysis of IHC in all subcellular compartments. The power of this technique is demonstrated using over 2000 breast tumours and comparing quantitative automated scores against manual assessment by pathologists. Results: All continuous automated scores showed good correlation with their corresponding ordinal manual scores. For oestrogen receptor (ER), the correlation was 0.82, P<0.0001, for BCL2 0.72, P<0.0001 and for HER2 0.62, P<0.0001. Automated scores showed excellent concordance with manual scores for the unsupervised assignment of cases to ‘positive' or ‘negative' categories with agreement rates of up to 96%. Conclusion: The adaptation of astronomical algorithms coupled with their application to large annotated study cohorts, constitutes a powerful tool for the realisation of the enormous potential of digital pathology. PMID:23329232
Characterization of receptor of activated C kinase 1 (RACK1) and functional analysis during larval metamorphosis of the oyster Crassostrea angulata.

PubMed

Yang, Bingye; Pu, Fei; Qin, Ji; You, Weiwei; Ke, Caihuan

2014-03-10

During a large-scale screen of the larval transcriptome library of the Portuguese oyster, Crassostrea angulata, the oyster gene RACK, which encodes a receptor of activated protein kinase C protein was isolated and characterized. The cDNA is 1,148 bp long and has a predicted open reading frame encoding 317 aa. The predicted protein shows high sequence identity to many RACK proteins of different organisms including molluscs, fish, amphibians and mammals, suggesting that it is conserved during evolution. The structural analysis of the Ca-RACK1 genomic sequence implies that the Ca-RACK1 gene has seven exons and six introns, extending approximately 6.5 kb in length. It is expressed ubiquitously in many oyster tissues as detected by RT-PCR analysis. The Ca-RACK1 mRNA expression pattern was markedly increased at larval metamorphosis; and was further increased along with Ca-RACK1 protein synthesis during epinephrine-induced metamorphosis. These results indicate that the Ca-RACK1 plays an important role in tissue differentiation and/or in cell growth during larval metamorphosis in the oyster, C. angulata. Copyright © 2013 Elsevier B.V. All rights reserved.
Falco: a quick and flexible single-cell RNA-seq processing framework on the cloud.

PubMed

Yang, Andrian; Troup, Michael; Lin, Peijie; Ho, Joshua W K

2017-03-01

Single-cell RNA-seq (scRNA-seq) is increasingly used in a range of biomedical studies. Nonetheless, current RNA-seq analysis tools are not specifically designed to efficiently process scRNA-seq data due to their limited scalability. Here we introduce Falco, a cloud-based framework to enable parallelization of existing RNA-seq processing pipelines using big data technologies of Apache Hadoop and Apache Spark for performing massively parallel analysis of large scale transcriptomic data. Using two public scRNA-seq datasets and two popular RNA-seq alignment/feature quantification pipelines, we show that the same processing pipeline runs 2.6-145.4 times faster using Falco than running on a highly optimized standalone computer. Falco also allows users to utilize low-cost spot instances of Amazon Web Services, providing a ∼65% reduction in cost of analysis. Falco is available via a GNU General Public License at https://github.com/VCCRI/Falco/. Contact: j.ho@victorchang.edu.au. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

Culturing Synechocystis sp. Strain PCC 6803 with N2 and CO2 in a Diel Regime Reveals Multiphase Glycogen Dynamics with Low Maintenance Costs

PubMed Central

Angermayr, S. Andreas; van Alphen, Pascal; Hasdemir, Dicle; Kramer, Gertjan; Iqbal, Muzamal; van Grondelle, Wilmar; Hoefsloot, Huub C.; Choi, Young Hae

2016-01-01

ABSTRACT Investigating the physiology of cyanobacteria cultured under a diel light regime is relevant for a better understanding of the resulting growth characteristics and for specific biotechnological applications that are foreseen for these photosynthetic organisms. Here, we present the results of a multiomics study of the model cyanobacterium Synechocystis sp. strain PCC 6803, cultured in a lab-scale photobioreactor in physiological conditions relevant for large-scale culturing. The culture was sparged with N2 and CO2, leading to an anoxic environment during the dark period. Growth followed the availability of light. Metabolite analysis performed with 1H nuclear magnetic resonance analysis showed that amino acids involved in nitrogen and sulfur assimilation showed elevated levels in the light. Most protein levels, analyzed through mass spectrometry, remained rather stable. However, several high-light-response proteins and stress-response proteins showed distinct changes at the onset of the light period. Microarray-based transcript analysis found common patterns of ∼56% of the transcriptome following the diel regime. These oscillating transcripts could be grouped coarsely into genes that were upregulated and downregulated in the dark period. The accumulated glycogen was degraded in the anaerobic environment in the dark. A small part was degraded gradually, reflecting basic maintenance requirements of the cells in darkness. Surprisingly, the largest part was degraded rapidly in a short time span at the end of the dark period. This degradation could allow rapid formation of metabolic intermediates at the end of the dark period, preparing the cells for the resumption of growth at the start of the light period. IMPORTANCE Industrial-scale biotechnological applications are anticipated for cyanobacteria. We simulated large-scale high-cell-density culturing of Synechocystis sp. PCC 6803 under a diel light regime in a lab-scale photobioreactor. 
In BG-11 medium, Synechocystis grew only in the light. Metabolite analysis grouped the collected samples according to the light and dark conditions. Proteome analysis suggested that the majority of enzyme-activity regulation was not hierarchical but rather occurred through enzyme activity regulation. An abrupt light-on condition induced high-light-stress proteins. Transcript analysis showed distinct patterns for the light and dark periods. Glycogen gradually accumulated in the light and was rapidly consumed in the last quarter of the dark period. This suggests that the circadian clock primed the cellular machinery for immediate resumption of growth in the light. PMID:27208121
Transcriptomic response and perturbation of toxicity pathways in zebrafish larvae after exposure to graphene quantum dots (GQDs).

PubMed

Deng, Shun; Jia, Pan-Pan; Zhang, Jing-Hui; Junaid, Muhammad; Niu, Aping; Ma, Yan-Bo; Fu, Ailing; Pei, De-Sheng

2018-05-29

Graphene quantum dots (GQDs) are widely used for biomedical applications. Previously, the low-level toxicity of GQDs in vivo and in vitro has been elucidated, but the underlying molecular mechanisms remained largely unknown. Here, we employed the Illumina high-throughput RNA-sequencing to explore the whole-transcriptome profiling of zebrafish larvae after exposure to GQDs. Comparative transcriptome analysis identified 2116 differentially expressed genes between GQDs exposed groups and control. Functional classification demonstrated that a large proportion of genes involved in acute inflammatory responses and detoxifying process were significantly up-regulated by GQDs. The inferred gene regulatory network suggested that activator protein 1 (AP-1) was the early-response transcription factor in the linkage of a cascade of downstream (pro-) inflammatory signals with the apoptosis signals. Moreover, hierarchical signaling threshold determined the high sensitivity of complement system in zebrafish when exposed to the sublethal dose of GQDs. Further, 35 candidate genes from various signaling pathways were further validated by qPCR after exposure to 25, 50, and 100 μg/mL of GQDs. Taken together, our study provided a valuable insight into the molecular mechanisms of potential bleeding risks and detoxifying processes in response to GQDs exposure, thereby establishing a mechanistic basis for the biosafety evaluation of GQDs. Copyright © 2018 Elsevier B.V. All rights reserved.
Characterization of Heterobasidion occidentale transcriptomes reveals candidate genes and DNA polymorphisms for virulence variations.

PubMed

Liu, Jun-Jun; Shamoun, Simon Francis; Leal, Isabel; Kowbel, Robert; Sumampong, Grace; Zamany, Arezoo

2018-05-01

Characterization of genes involved in differentiation of pathogen species and isolates with variations of virulence traits provides valuable information to control tree diseases for meeting the challenges of sustainable forest health and phytosanitary trade issues. Lack of genetic knowledge and genomic resources hinders novel gene discovery, molecular mechanism studies and development of diagnostic tools in the management of forest pathogens. Here, we report on transcriptome profiling of Heterobasidion occidentale isolates with contrasting virulence levels. Comparative transcriptomic analysis identified orthologous groups exclusive to H. occidentale and its isolates, revealing biological processes involved in the differentiation of isolates. Further bioinformatics analyses identified an H. occidentale secretome, CYPome and other candidate effectors, from which genes with species- and isolate-specific expression were characterized. A large proportion of differentially expressed genes were revealed to have putative activities as cell wall modification enzymes and transcription factors, suggesting their potential roles in virulence and fungal pathogenesis. Next, large numbers of simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were detected, including more than 14 000 interisolate non-synonymous SNPs. These polymorphic loci and species/isolate-specific genes may contribute to virulence variations and provide ideal DNA markers for development of diagnostic tools and investigation of genetic diversity. © 2018 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
Spliced synthetic genes as internal controls in RNA sequencing experiments.

PubMed

Hardwick, Simon A; Chen, Wendy Y; Wong, Ted; Deveson, Ira W; Blackburn, James; Andersen, Stacey B; Nielsen, Lars K; Mattick, John S; Mercer, Tim R

2016-09-01

RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome.
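The idea that spike-ins of known concentration "provide scaling factors for normalization between samples" can be illustrated with a minimal sketch. This is not the sequins authors' procedure: the median-ratio estimator, the function names and the toy concentrations below are all assumptions made for illustration.

```python
from statistics import median


def spike_in_scaling_factor(expected, observed):
    """Median ratio of expected to observed abundance over shared spike-ins.

    `expected` and `observed` map spike-in names to abundances.
    The median-ratio estimator is an illustrative assumption.
    """
    ratios = [
        expected[name] / observed[name]
        for name in expected
        if observed.get(name, 0) > 0  # skip spike-ins that were not detected
    ]
    if not ratios:
        raise ValueError("no spike-in was detected in this sample")
    return median(ratios)


def normalize(sample, factor):
    """Rescale every abundance in a sample by its spike-in scaling factor."""
    return {gene: value * factor for gene, value in sample.items()}


# Hypothetical known input concentrations and measured abundances.
expected = {"s1": 10.0, "s2": 100.0, "s3": 1000.0}
observed = {"s1": 5.0, "s2": 50.0, "s3": 400.0}

factor = spike_in_scaling_factor(expected, observed)
normalized = normalize({"geneX": 12.0}, factor)
```

A median is used here only because it tolerates a single outlying spike-in; a regression across the concentration ladder would be an equally plausible choice.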
A Unique Model Platform for C4 Plant Systems and Synthetic Biology

DTIC Science & Technology

2015-12-10

International Conference in Bioinformatics, Sydney, Australia, July 31 - August 2, 2014. Nielsen LK (2015) Genome scale metabolic and regulatory...the comparison of transcriptome proteome and central metabolome in mature and immature tissue. Preliminary data were obtained suggesting successful...guide the comparison of transcriptome, proteome and central metabolome in mature and immature tissue. Preliminary data were obtained suggesting
Draft genome and reference transcriptomic resources for the urticating pine defoliator Thaumetopoea pityocampa (Lepidoptera: Notodontidae).

PubMed

Gschloessl, B; Dorkeld, F; Berges, H; Beydon, G; Bouchez, O; Branco, M; Bretaudeau, A; Burban, C; Dubois, E; Gauthier, P; Lhuillier, E; Nichols, J; Nidelet, S; Rocha, S; Sauné, L; Streiff, R; Gautier, M; Kerdelhué, C

2018-05-01

The pine processionary moth Thaumetopoea pityocampa (Lepidoptera: Notodontidae) is the main pine defoliator in the Mediterranean region. Its urticating larvae cause severe human and animal health concerns in the invaded areas. This species shows a high phenotypic variability for various traits, such as phenology, fecundity and tolerance to extreme temperatures. This study presents the construction and analysis of extensive genomic and transcriptomic resources, which are an obligate prerequisite to understand their underlying genetic architecture. Using a well-studied population from Portugal with peculiar phenological characteristics, the karyotype was first determined and a first draft genome of 537 Mb total length was assembled into 68,292 scaffolds (N50 = 164 kb). From this genome assembly, 29,415 coding genes were predicted. To circumvent some limitations for fine-scale physical mapping of genomic regions of interest, a 3X coverage BAC library was also developed. In particular, 11 BACs from this library were individually sequenced to assess the assembly quality. Additionally, de novo transcriptomic resources were generated from various developmental stages sequenced with HiSeq and MiSeq Illumina technologies. The reads were de novo assembled into 62,376 and 63,175 transcripts, respectively. Then, a robust subset of the genome-predicted coding genes, the de novo transcriptome assemblies and previously published 454/Sanger data were clustered to obtain a high-quality and comprehensive reference transcriptome consisting of 29,701 bona fide unigenes. These sequences covered 99% of the cegma and 88% of the busco highly conserved eukaryotic genes and 84% of the busco arthropod gene set. Moreover, 90% of these transcripts could be localized on the draft genome. The described information is available via a genome annotation portal (http://bipaa.genouest.org/sp/thaumetopoea_pityocampa/). © 2018 John Wiley & Sons Ltd.
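The N50 statistic quoted for the draft assembly (N50 = 164 kb) is the scaffold length at which the cumulative length of scaffolds, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the function name and the toy lengths are illustrative and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not the T. pityocampa scaffolds.
toy_scaffolds = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_scaffolds)
```

With the toy lengths, the total is 300, so the running sum passes 150 at the second scaffold and the N50 is 80.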
Integration analysis of quantitative proteomics and transcriptomics data identifies potential targets of frizzled-8 protein-related antiproliferative factor in vivo.

PubMed

Yang, Wei; Kim, Yongsoo; Kim, Taek-Kyun; Keay, Susan K; Kim, Kwang Pyo; Steen, Hanno; Freeman, Michael R; Hwang, Daehee; Kim, Jayoung

2012-12-01

What's known on the subject? and What does the study add? Interstitial cystitis (IC) is a prevalent and debilitating pelvic disorder generally accompanied by chronic pain combined with chronic urinating problems. Over one million Americans are affected, especially middle-aged women. However, its aetiology or mechanism remains unclear. No efficient drug has been provided to patients. Several urinary biomarker candidates have been identified for IC; among the most promising is antiproliferative factor (APF), whose biological activity is detectable in urine specimens from >94% of patients with both ulcerative and non-ulcerative IC. The present study identified several important mediators of the effect of APF on bladder cell physiology, suggesting several candidate drug targets against IC. In an attempt to identify potential proteins and genes regulated by APF in vivo, and to possibly expand the APF-regulated network identified by stable isotope labelling by amino acids in cell culture (SILAC), we performed an integration analysis of our own SILAC data and the microarray data of Gamper et al. (2009) BMC Genomics 10: 199. Notably, two of the proteins (i.e. MAPKSP1 and GSPT1) that are down-regulated by APF are involved in the activation of mTORC1, suggesting that the mammalian target of rapamycin (mTOR) pathway is potentially a critical pathway regulated by APF in vivo. Several components of the mTOR pathway are currently being studied as potential therapeutic targets in other diseases. Our analysis suggests that this pathway might also be relevant in the design of diagnostic tools and medications targeting IC.
• To enhance our understanding of the interstitial cystitis urine biomarker antiproliferative factor (APF), as well as interstitial cystitis biology more generally at the systems level, we reanalyzed recently published large-scale quantitative proteomics and in vivo transcriptomics data sets using an integration analysis tool that we have developed.
• To identify more differentially expressed genes with a lower false discovery rate from a previously published microarray data set, an integrative hypothesis-testing statistical approach was applied.
• For validation experiments, expression and phosphorylation levels of select proteins were evaluated by western blotting.
• Integration analysis of this transcriptomics data set with our own quantitative proteomics data set identified 10 genes that are potentially regulated by APF in vivo from 4140 differentially expressed genes identified with a false discovery rate of 1%.
• Of these, five (i.e. JUP, MAPKSP1, GSPT1, PTGS2/COX-2 and XPOT) were found to be prominent after network modelling of the common genes identified in the proteomics and microarray studies.
• This molecular signature reflects the biological processes of cell adhesion, cell proliferation and inflammation, which is consistent with the known physiological effects of APF.
• Lastly, we found the mammalian target of rapamycin pathway was down-regulated in response to APF.
• This unbiased integration analysis of in vitro quantitative proteomics data with in vivo quantitative transcriptomics data led to the identification of potential downstream mediators of the APF signal transduction pathway.
© 2012 THE AUTHORS. BJU INTERNATIONAL © 2012 BJU INTERNATIONAL.
Reptilian-transcriptome v1.0, a glimpse in the brain transcriptome of five divergent Sauropsida lineages and the phylogenetic position of turtles.

PubMed

Tzika, Athanasia C; Helaers, Raphaël; Schramm, Gerrit; Milinkovitch, Michel C

2011-09-26

Reptiles are largely under-represented in comparative genomics despite the fact that they are substantially more diverse in many respects than mammals. Given the high divergence of reptiles from classical model species, next-generation sequencing of their transcriptomes is an approach of choice for gene identification and annotation. Here, we use 454 technology to sequence the brain transcriptome of four divergent reptilian and one reference avian species: the Nile crocodile, the corn snake, the bearded dragon, the red-eared turtle, and the chicken. Using an in-house pipeline for recursive similarity searches of >3,000,000 reads against multiple databases from 7 reference vertebrates, we compile a reptilian comparative transcriptomics dataset, with homology assignment for 20,000 to 31,000 transcripts per species and a cumulated non-redundant sequence length of 248.6 Mbases. Our approach identifies the majority (87%) of chicken brain transcripts and about 50% of de novo assembled reptilian transcripts. In addition to 57,502 microsatellite loci, we identify thousands of SNP and indel polymorphisms for population genetic and linkage analyses. We also build very large multiple alignments for Sauropsida and mammals (two million residues per species) and perform extensive phylogenetic analyses suggesting that turtles are not basal living reptiles but are rather associated with Archosaurians, hence, potentially answering a long-standing question in the phylogeny of Amniotes. The reptilian transcriptome (freely available at http://www.reptilian-transcriptomes.org) should prove a useful new resource as reptiles are becoming important new models for comparative genomics, ecology, and evolutionary developmental genetics.
Transcriptome map of plant mitochondria reveals islands of unexpected transcribed regions.

PubMed

Fujii, Sota; Toda, Takushi; Kikuchi, Shunsuke; Suzuki, Ryutaro; Yokoyama, Koji; Tsuchida, Hiroko; Yano, Kentaro; Toriyama, Kinya

2011-06-01

Plant mitochondria contain a relatively large amount of genetic information, suggesting that their functional regulation may not be as straightforward as that of metazoans. We used a genomic tiling array to draw a transcriptomic atlas of Oryza sativa japonica (rice) mitochondria, which was predicted to be approximately 490-kb long. Whereas statistical analysis verified the transcription of all previously known functional genes such as the ones related to oxidative phosphorylation, a similar extent of RNA expression was frequently observed in the inter-genic regions where none of the previously annotated genes are located. The newly identified open reading frames (ORFs) predicted in these transcribed inter-genic regions were generally not conserved among flowering plant species, suggesting that these ORFs did not play a role in mitochondrial principal functions. We also identified two partial fragments of retrotransposon sequences as being transcribed in rice mitochondria. The present study indicated the previously unexpected complexity of plant mitochondrial RNA metabolism. Our transcriptomic data (Oryza sativa Mitochondrial rna Expression Server: OsMES) is publicly accessible at [http://bioinf.mind.meiji.ac.jp/cgi-bin/gbrowse/OsMes/#search].
Upper airway gene expression in smokers: the mouth as a "window to the soul" of lung carcinogenesis?

PubMed

Spira, Avrum

2010-03-01

This perspective on Boyle et al. (beginning on page 266 in this issue of the journal) explores transcriptomic profiling of upper airway epithelium as a biomarker of host response to tobacco smoke exposure. Boyle et al. have shown a striking relationship between smoking-related gene expression changes in the mouth and bronchus. This relationship suggests that buccal gene expression may serve as a relatively noninvasive surrogate marker of the physiologic response of the lung to tobacco smoke that could be used in large-scale screening and chemoprevention studies for lung cancer.
Cancer Transcriptome Dataset Analysis: Comparing Methods of Pathway and Gene Regulatory Network-Based Cluster Identification.

PubMed

Nam, Seungyoon

2017-04-01

Cancer transcriptome analysis is one of the leading areas of Big Data science, biomarker, and pharmaceutical discovery, not to forget personalized medicine. Yet, cancer transcriptomics and postgenomic medicine require innovation in bioinformatics as well as comparison of the performance of available algorithms. In this data analytics context, the value of network generation and algorithms has been widely underscored for addressing the salient questions in cancer pathogenesis. Analysis of the cancer transcriptome often results in complicated networks where identification of network modularity remains critical, for example, in delineating the "druggable" molecular targets. Network clustering is useful, but depends on the network topology in and of itself. Notably, the performance of different network-generating tools for network cluster (NC) identification has been little investigated to date. Hence, using gastric cancer (GC) transcriptomic datasets, we compared two algorithms for generating pathway versus gene regulatory network-based NCs, showing that the pathway-based approach better agrees with a reference set of cancer-functional contexts. Finally, by applying pathway-based NC identification to GC transcriptome datasets, we describe cancer NCs that associate with candidate therapeutic targets and biomarkers in GC. These observations collectively inform future research on cancer transcriptomics, drug discovery, and rational development of new analysis tools for optimal harnessing of omics data.
Proteomics reveals novel components of the Anopheles gambiae eggshell

PubMed Central

Amenya, Dolphine A.; Chou, Wayne; Li, Jianyong; Yan, Guiyun; Gershon, Paul D.; James, Anthony A.; Marinotti, Osvaldo

2010-01-01

While genome and transcriptome sequencing has revealed a large number and diversity of Anopheles gambiae predicted proteins, identifying their functions and biosynthetic pathways remains challenging. Applied mass spectrometry based proteomics in conjunction with mosquito genome and transcriptome databases were used to identify 44 proteins as putative components of the eggshell. Among the identified molecules are two vitelline membrane proteins and a group of seven putative chorion proteins. Enzymes with peroxidase, laccase and phenoloxidase activities, likely involved in cross-linking reactions that stabilize the eggshell structure, also were identified. Seven odorant binding proteins were found in association with the mosquito eggshell, although their role has yet to be demonstrated. This analysis fills a considerable gap of knowledge about proteins that build the eggshell of anopheline mosquitoes. PMID:20433845
Transcriptome Analysis in Venom Gland of the Predatory Giant Ant Dinoponera quadriceps: Insights into the Polypeptide Toxin Arsenal of Hymenopterans

PubMed Central

Chong, Cheong-Meng; Leung, Siu Wai; Prieto-da-Silva, Álvaro R. B.; Havt, Alexandre; Quinet, Yves P.; Martins, Alice M. C.; Lee, Simon M. Y.; Rádis-Baptista, Gandhi

2014-01-01

Background Dinoponera quadriceps is a predatory giant ant that inhabits the Neotropical region and subdues its prey (insects) with stings that deliver a toxic cocktail of molecules. Human accidents occasionally occur and cause local pain and systemic symptoms. A comprehensive study of the D. quadriceps venom gland transcriptome is required to advance our knowledge about the toxin repertoire of the giant ant venom and to understand the physiopathological basis of Hymenoptera envenomation. Results We conducted a transcriptome analysis of a cDNA library from the D. quadriceps venom gland with Sanger sequencing in combination with whole-transcriptome shotgun deep sequencing. From the cDNA library, a total of 420 independent clones were analyzed. Although the proportion of dinoponeratoxin isoform precursors was high, the first giant ant venom inhibitor cysteine-knot (ICK) toxin was found. The deep next generation sequencing yielded a total of 2,514,767 raw reads that were assembled into 18,546 contigs. A BLAST search of the assembled contigs against non-redundant and Swiss-Prot databases showed that 6,463 contigs corresponded to BLASTx hits and indicated an interesting diversity of transcripts related to venom gene expression. The majority of these venom-related sequences code for a major polypeptide core, which comprises venom allergens, lethal-like proteins and esterases, and a minor peptide framework composed of inter-specific structurally conserved cysteine-rich toxins. Both the cDNA library and deep sequencing yielded large proportions of contigs that showed no similarities with known sequences. Conclusions To our knowledge, this is the first report of the venom gland transcriptome of the New World giant ant D. quadriceps. 
The glandular venom system was dissected, and the toxin arsenal was revealed; this process brought to light novel sequences that included ICK-folded toxins, allergen proteins, esterases (phospholipases and carboxylesterases), and lethal-like toxins. These findings contribute to the understanding of the ecology, behavior and venomics of hymenopterans. PMID:24498135
Poplar trees reconfigure the transcriptome and metabolome in response to drought in a genotype- and time-of-day-dependent manner.

PubMed

Hamanishi, Erin T; Barchet, Genoa L H; Dauwe, Rebecca; Mansfield, Shawn D; Campbell, Malcolm M

2015-04-21

Drought has a major impact on tree growth and survival. Understanding tree responses to this stress can have important application in both conservation of forest health, and in production forestry. Trees of the genus Populus provide an excellent opportunity to explore the mechanistic underpinnings of forest tree drought responses, given the growing molecular resources that are available for this taxon. Here, foliar tissue of six water-deficit stressed P. balsamifera genotypes was analysed for variation in the metabolome in response to drought and time of day by using an untargeted metabolite profiling technique, gas chromatography/mass-spectrometry (GC/MS). Significant variation in the metabolome was observed in response to the imposition of water-deficit stress. Notably, organic acid intermediates such as succinic and malic acid had lower concentrations in leaves exposed to drought, whereas galactinol and raffinose were found in increased concentrations. A number of metabolites with significant difference in accumulation under water-deficit conditions exhibited intraspecific variation in metabolite accumulation. Large magnitude fold-change accumulation was observed in three of the six genotypes. In order to understand the interaction between the transcriptome and metabolome, an integrated analysis of the drought-responsive transcriptome and the metabolome was performed. One P. balsamifera genotype, AP-1006, demonstrated a lack of congruence between the magnitude of the drought transcriptome response and the magnitude of the metabolome response. More specifically, metabolite profiles in AP-1006 demonstrated the smallest changes in response to water-deficit conditions. Pathway analysis of the transcriptome and metabolome revealed specific genotypic responses with respect to primary sugar accumulation, citric acid metabolism, and raffinose family oligosaccharide biosynthesis. 
The intraspecific variation in the molecular strategies that underpin the responses to drought among genotypes may have an important role in the maintenance of forest health and productivity.
De novo assembly and characterization of bark transcriptome using Illumina sequencing and development of EST-SSR markers in rubber tree (Hevea brasiliensis Muell. Arg.)

PubMed Central

2012-01-01

Background In rubber tree, bark is one of the important agricultural and biological organs. However, the molecular mechanism involved in bark formation and development in rubber tree remains largely unknown, which is at least partially due to the lack of bark transcriptomic and genomic information. Therefore, it is necessary to carry out high-throughput transcriptome sequencing of rubber tree bark to generate enormous transcript sequences for functional characterization and molecular marker development. Results In this study, more than 30 million sequencing reads were generated using Illumina paired-end sequencing technology. In total, 22,756 unigenes with an average length of 485 bp were obtained with de novo assembly. The similarity search indicated that 16,520 and 12,558 unigenes showed significant similarities to known proteins from NCBI non-redundant and Swissprot protein databases, respectively. Among these annotated unigenes, 6,867 and 5,559 unigenes were assigned to Gene Ontology (GO) and Clusters of Orthologous Group (COG), respectively. When the 22,756 unigenes were searched against the Kyoto Encyclopedia of Genes and Genomes Pathway (KEGG) database, 12,097 unigenes were assigned to 5 main categories including 123 KEGG pathways. Among the main KEGG categories, metabolism was the biggest category (9,043, 74.75%), suggesting active metabolic processes in rubber tree bark. In addition, a total of 39,257 EST-SSRs were identified from 22,756 unigenes, and the characteristics of the EST-SSRs were further analyzed in rubber tree. 110 potential marker sites were randomly selected to validate the assembly quality and develop EST-SSR markers. Among 13 Hevea germplasms, the PCR success rate and polymorphism rate of the 110 markers were 96.36% and 55.45%, respectively, in this study. Conclusion By assembling and analyzing de novo transcriptome sequencing data, we reported the comprehensive functional characterization of rubber tree bark. 
This research generated a substantial fraction of rubber tree transcriptome sequences, which are very useful resources for gene annotation and discovery, molecular marker development, genome assembly and annotation, and microarray development in rubber tree. The EST-SSR markers identified and developed in this study will facilitate marker-assisted selection breeding in rubber tree. Moreover, this study also supported that transcriptome analysis based on Illumina paired-end sequencing is a powerful tool for transcriptome characterization and molecular marker development in non-model species, especially those with large and complex genomes. PMID:22607098
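EST-SSR discovery of the kind described above amounts to scanning assembled sequences for short motifs repeated in tandem. The sketch below shows one simple way to do this with a regular expression. The abstract does not state the search criteria, so the motif length range (2 to 6 bp), the minimum of four tandem copies and the exclusion of homopolymer runs are assumptions, and `find_ssrs` is a hypothetical helper rather than the tool used in the study.

```python
import re

# A 2-6 bp motif followed by at least three more copies of itself
# (four or more tandem copies in total). Thresholds are assumed.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(sequence):
    """Return (start, motif, copies) for each tandem repeat in `sequence`."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Hypothetical unigene fragment containing an (AC)5 repeat.
toy_unigene = "TTG" + "AC" * 5 + "GGT"
toy_hits = find_ssrs(toy_unigene)
```

The lazy quantifier makes the pattern prefer the shortest motif, so a run such as ACACACACAC is reported as (AC)5 rather than (ACAC)2.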
Benchmarking Water Quality from Wastewater to Drinking Waters Using Reduced Transcriptome of Human Cells.

PubMed

Xia, Pu; Zhang, Xiaowei; Zhang, Hanxin; Wang, Pingping; Tian, Mingming; Yu, Hongxia

2017-08-15

One of the major challenges in environmental science is monitoring and assessing the risk of complex environmental mixtures. In vitro bioassays with limited key toxicological end points have been shown to be suitable to evaluate mixtures of organic pollutants in wastewater and recycled water. Omics approaches such as transcriptomics can monitor biological effects at the genome scale. However, few studies have applied an omics approach in the assessment of mixtures of organic micropollutants. Here, an omics approach was developed for profiling bioactivity of 10 water samples ranging from wastewater to drinking water in human cells by a reduced human transcriptome (RHT) approach and dose-response modeling. Transcriptional expression of 1200 selected genes was measured by an Ampliseq technology in two cell lines, HepG2 and MCF7, that were exposed to eight serial dilutions of each sample. Concentration-effect models were used to identify differentially expressed genes (DEGs) and to calculate effect concentrations (ECs) of DEGs, which could be ranked to investigate low dose response. Furthermore, molecular pathways disrupted by different samples were evaluated by Gene Ontology (GO) enrichment analysis. The ability of RHT for representing bioactivity utilizing both HepG2 and MCF7 was shown to be comparable to the results of previous in vitro bioassays. Finally, the relative potencies of the mixtures indicated by RHT analysis were consistent with the chemical profiles of the samples. RHT analysis with human cells provides an efficient and cost-effective approach to benchmarking mixtures of micropollutants and may offer novel insight into the assessment of mixture toxicity in water.
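Ranking genes by effect concentration, as described above, can be sketched once a concentration-effect curve has been fitted per gene. The abstract does not name the model family, so the Hill model below is an assumption, as are the two-fold dilution series, the gene names and the fitted parameter values; curve fitting itself is left out.

```python
def hill_effect(conc, top, ec50, hill):
    """Effect predicted by a Hill model at concentration `conc`."""
    return top * conc**hill / (ec50**hill + conc**hill)


def effect_concentration(fraction, ec50, hill):
    """Concentration that produces `fraction` of the maximal effect."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return ec50 * (fraction / (1 - fraction)) ** (1 / hill)


# Eight serial dilutions; the two-fold step is an assumption.
dilutions = [1 / 2**i for i in range(8)]

# Hypothetical fitted parameters per gene: (top, ec50, hill).
fits = {"geneA": (2.0, 0.05, 1.0), "geneB": (1.5, 0.4, 2.0)}

# Rank genes by EC10, most sensitive (lowest concentration) first.
ranked = sorted(
    fits, key=lambda g: effect_concentration(0.1, fits[g][1], fits[g][2])
)
```

Inverting the fitted curve at a low effect level such as 10% is one way to compare low-dose responses across genes; other effect levels or model families would follow the same pattern.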
High-resolution picture of a venom gland transcriptome: case study with the marine snail Conus consors.

PubMed

Terrat, Yves; Biass, Daniel; Dutertre, Sébastien; Favreau, Philippe; Remm, Maido; Stöcklin, Reto; Piquemal, David; Ducancel, Frédéric

2012-01-01

Although cone snail venoms have been intensively investigated in the past few decades, little is known about the whole conopeptide and protein content in venom ducts, especially at the transcriptomic level. While most previous studies, which focused on a limited number of sequences, have contributed to a better understanding of conopeptide superfamilies, they did not give access to a complete panorama of a whole venom duct. Additionally, rare transcripts were usually not identified due to sampling effect. This work presents the data and analysis of a large number of sequences obtained from high throughput 454 sequencing technology using venom ducts of Conus consors, an Indo-Pacific living piscivorous cone snail. A total of 213,561 Expressed Sequence Tags (ESTs) with an average read length of 218 base pairs (bp) have been obtained. These reads were assembled into 65,536 contiguous DNA sequences (contigs) then into 5039 clusters. The data revealed 11 conopeptide superfamilies representing a total of 53 new isoforms (full length or nearly full-length sequences). Considerable isoform diversity and major differences in transcription level could be noted between superfamilies. A, O and M superfamilies are the most diverse. The A family isoforms account for more than 70% of the conopeptide cocktail (considering all ESTs before clustering step). In addition to traditional superfamilies and families, minor transcripts including both cysteine free and cysteine-rich peptides could be detected, some of them representing new clades of conopeptides. Finally, several sets of transcripts corresponding to proteins commonly recruited in venom function could be identified for the first time in cone snail venom duct. This work provides one of the first large-scale EST projects for a cone snail venom duct using next-generation sequencing, allowing a detailed overview of the venom duct transcripts. 
This leads to an expanded definition of the overall cone snail venom duct transcriptomic activity, which goes beyond the cysteine-rich conopeptides. For instance, this study enabled the detection of proteins involved in common post-translational maturation and folding, and revealed compounds classically involved in hemolysis and mechanical penetration of the venom into the prey. Further comparison with proteomic and genomic data will lead to a better understanding of conopeptide diversity and the underlying mechanisms involved in conopeptide evolution. Copyright © 2011 Elsevier Ltd. All rights reserved.
Enhancer Sharing Promotes Neighborhoods of Transcriptional Regulation Across Eukaryotes

PubMed Central

Quintero-Cadena, Porfirio; Sternberg, Paul W.

2016-01-01

Enhancers physically interact with transcriptional promoters, looping over distances that can span multiple regulatory elements. Given that enhancer–promoter (EP) interactions generally occur via common protein complexes, it is unclear whether EP pairing is predominantly deterministic or proximity guided. Here, we present cross-organismic evidence suggesting that most EP pairs are compatible, largely determined by physical proximity rather than specific interactions. By reanalyzing transcriptome datasets, we find that the transcription of gene neighbors is correlated over distances that scale with genome size. We experimentally show that nonspecific EP interactions can explain such correlation, and that EP distance acts as a scaling factor for the transcriptional influence of an enhancer. We propose that enhancer sharing is commonplace among eukaryotes, and that EP distance is an important layer of information in gene regulation. PMID:27799341
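The observation that transcription of gene neighbors is correlated over genomic distance can be illustrated by correlating the expression profiles of adjacent genes and recording how far apart they lie. The sketch below is a simplified illustration, not the authors' reanalysis pipeline; the data layout, function names and toy values are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant expression")
    return cov / (sd_x * sd_y)


def neighbor_correlations(genes):
    """Correlate each gene with its next neighbor along the chromosome.

    `genes` is a list of (name, start_position, expression_values).
    Returns (gene, neighbor, distance_bp, r) for each adjacent pair.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    pairs = []
    for (n1, p1, e1), (n2, p2, e2) in zip(ordered, ordered[1:]):
        pairs.append((n1, n2, p2 - p1, pearson(e1, e2)))
    return pairs


# Hypothetical genes on one chromosome, four conditions each.
toy_genes = [
    ("g1", 1000, [1.0, 2.0, 3.0, 4.0]),
    ("g2", 6000, [2.0, 4.0, 6.0, 8.0]),
    ("g3", 90000, [4.0, 3.0, 2.0, 1.0]),
]
toy_pairs = neighbor_correlations(toy_genes)
```

Binning the resulting (distance, r) pairs by distance would show how quickly the correlation decays, which is the quantity the abstract relates to genome size.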
Transcriptome analysis reveals intermittent fasting-induced genetic changes in ischemic stroke.

PubMed

Kim, Joonki; Kang, Sung-Wook; Mallilankaraman, Karthik; Baik, Sang-Ha; Lim, James C; Balaganapathy, Priyanka; She, David T; Lok, Ker-Zhing; Fann, David Y; Thambiayah, Uma; Tang, Sung-Chun; Stranahan, Alexis M; Dheen, S Thameem; Gelderblom, Mathias; Seet, Raymond C; Karamyan, Vardan T; Vemuganti, Raghu; Sobey, Christopher G; Mattson, Mark P; Jo, Dong-Gyu; Arumugam, Thiruma V

2018-05-01

Genetic changes due to dietary intervention in the form of either calorie restriction (CR) or intermittent fasting (IF) have not been reported in detail until now. However, it is well established that both CR and IF extend the lifespan and protect against neurodegenerative diseases and stroke. The current research aims were first to describe the transcriptomic changes in brains of IF mice and, second, to determine whether IF induces extensive transcriptomic changes following ischemic stroke to protect the brain from injury. Mice were randomly assigned to ad libitum feeding (AL), 12 (IF12) or 16 (IF16) h daily fasting. Each diet group was then subjected to sham surgery or middle cerebral artery occlusion and consecutive reperfusion. Mid-coronal sections of ipsilateral cerebral tissue were harvested at the end of the 1 h ischemic period or at 3, 12, 24 or 72 h of reperfusion, and genome-wide mRNA expression was quantified by RNA sequencing. The cerebral transcriptome of mice in the AL group exhibited robust, sustained up-regulation of detrimental genetic pathways under ischemic stroke, but activation of these pathways was suppressed in the IF16 group. Interestingly, the cerebral transcriptome of AL mice was largely unchanged during the 1 h of ischemia, whereas mice in the IF16 group exhibited extensive up-regulation of genetic pathways involved in neuroplasticity and down-regulation of protein synthesis. Our data provide a genetic molecular framework for understanding how IF protects brain cells against damage caused by ischemic stroke, and reveal cellular signaling and bioenergetic pathways to target in the development of clinical interventions.
Transcriptome architecture across tissues in the pig

PubMed Central

Ferraz, André LJ; Ojeda, Ana; López-Béjar, Manel; Fernandes, Lana T; Castelló, Anna; Folch, Josep M; Pérez-Enciso, Miguel

2008-01-01

Background Artificial selection has resulted in animal breeds with extreme phenotypes. As an organism is made up of many different tissues and organs, each with its own genetic programme, it is pertinent to ask: How relevant is tissue in terms of total transcriptome variability? Which are the genes most distinctly expressed between tissues? Does breed or sex equally affect the transcriptome across tissues? Results In order to gain insight into these issues, we conducted microarray expression profiling of 16 different tissues from four animals of two extreme pig breeds, Large White and Iberian, two males and two females. Mixed model analysis and neighbor-joining trees showed that tissues with similar developmental origin clustered closer than those with different embryonic origins. Often a sound biological interpretation was possible for overrepresented gene ontology categories within differentially expressed genes between groups of tissues. For instance, an excess of nervous system or muscle development genes was found among tissues of ectoderm or mesoderm origins, respectively. Tissue accounted for ~11 times more variability than sex or breed. Nevertheless, we were able to confidently identify genes with differential expression across tissues between breeds (33 genes) and between sexes (19 genes). The genes primarily affected by sex were overall different from those affected by breed or tissue. Interaction with tissue can be important for differentially expressed genes between breeds but not so much for genes whose expression differs between sexes. Conclusion Embryonic development leaves an enduring footprint on the transcriptome. The interaction in gene × tissue for differentially expressed genes between breeds suggests that animal breeding has targeted differentially each tissue's transcriptome. PMID:18416811

Transcriptome analysis of the response of Burmese python to digestion

PubMed Central

Sanggaard, Kristian Wejse; Schauser, Leif; Lauridsen, Sanne Enok; Enghild, Jan J.

2017-01-01

Abstract Exceptional and extreme feeding behaviour makes the Burmese python (Python bivittatus) an interesting model to study physiological remodelling and metabolic adaptation in response to refeeding after prolonged starvation. In this study, we used transcriptome sequencing of 5 visceral organs during fasting as well as 24 hours and 48 hours after ingestion of a large meal to unravel the postprandial changes in Burmese pythons. We first used the pooled data to perform a de novo assembly of the transcriptome and supplemented this with a proteomic survey of enzymes in the plasma and gastric fluid. We constructed a high-quality transcriptome with 34 423 transcripts, of which 19 713 (57%) were annotated. Among highly expressed genes (fragments per kilo base per million sequenced reads > 100 in 1 tissue), we found that the transition from fasting to digestion was associated with differential expression of 43 genes in the heart, 206 genes in the liver, 114 genes in the stomach, 89 genes in the pancreas, and 158 genes in the intestine. We interrogated the function of these genes to test previous hypotheses on the response to feeding. We also used the transcriptome to identify 314 secreted proteins in the gastric fluid of the python. Digestion was associated with an upregulation of genes related to metabolic processes, and translational changes therefore appear to support the postprandial rise in metabolism. We identify stomach-related proteins from a digesting individual and demonstrate that the sensitivity of modern liquid chromatography/tandem mass spectrometry equipment allows the identification of gastric juice proteins that are present during digestion. PMID:28873961
A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

PubMed Central

2012-01-01

Background The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using next generation sequencing approach. Results The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave ways for bulk production of environment friendly biopesticides. PMID:22958331
Transcriptome analysis of the response of Burmese python to digestion.

PubMed

Duan, Jinjie; Sanggaard, Kristian Wejse; Schauser, Leif; Lauridsen, Sanne Enok; Enghild, Jan J; Schierup, Mikkel Heide; Wang, Tobias

2017-08-01

Exceptional and extreme feeding behaviour makes the Burmese python (Python bivittatus) an interesting model to study physiological remodelling and metabolic adaptation in response to refeeding after prolonged starvation. In this study, we used transcriptome sequencing of 5 visceral organs during fasting as well as 24 hours and 48 hours after ingestion of a large meal to unravel the postprandial changes in Burmese pythons. We first used the pooled data to perform a de novo assembly of the transcriptome and supplemented this with a proteomic survey of enzymes in the plasma and gastric fluid. We constructed a high-quality transcriptome with 34 423 transcripts, of which 19 713 (57%) were annotated. Among highly expressed genes (fragments per kilo base per million sequenced reads > 100 in 1 tissue), we found that the transition from fasting to digestion was associated with differential expression of 43 genes in the heart, 206 genes in the liver, 114 genes in the stomach, 89 genes in the pancreas, and 158 genes in the intestine. We interrogated the function of these genes to test previous hypotheses on the response to feeding. We also used the transcriptome to identify 314 secreted proteins in the gastric fluid of the python. Digestion was associated with an upregulation of genes related to metabolic processes, and translational changes therefore appear to support the postprandial rise in metabolism. We identify stomach-related proteins from a digesting individual and demonstrate that the sensitivity of modern liquid chromatography/tandem mass spectrometry equipment allows the identification of gastric juice proteins that are present during digestion. © The Authors 2017. Published by Oxford University Press.
De novo characterization of Larimichthys crocea transcriptome for growth-/immune-related gene identification and massive microsatellite (SSR) marker development

NASA Astrophysics Data System (ADS)

Han, Zhaofang; Xiao, Shijun; Liu, Xiande; Liu, Yang; Li, Jiakai; Xie, Yangjie; Wang, Zhiyong

2017-03-01

The large yellow croaker, Larimichthys crocea, is an important marine fish in China with a high economic value. In the last decade, the stock conservation and aquaculture industry of this species have been facing severe challenges because of wild population collapse and degeneration of important economic traits. However, genes contributing to growth and immunity in L. crocea have not been thoroughly analyzed, and available molecular markers are still not sufficient for genetic resource management and molecular selection. In this work, we sequenced the transcriptome in L. crocea liver tissue with a Roche 454 sequencing platform and assembled the transcriptome into 93 801 transcripts. Of them, 38 856 transcripts were successfully annotated in nt, nr, Swiss-Prot, InterPro, COG, GO and KEGG databases. Based on the annotation information, 3 165 unigenes related to growth and immunity were identified. Additionally, a total of 6 391 simple sequence repeats (SSRs) were identified from the transcriptome, among which 4 498 SSRs had enough flanking regions to design primers for polymerase chain reactions (PCR). To assess the polymorphism of these markers, 30 primer pairs were randomly selected for PCR amplification and validation in 30 individuals, and 12 primer pairs (40.0%) exhibited obvious length polymorphisms. This work applied RNA-Seq to assemble and analyze a liver transcriptome in L. crocea. With gene annotation and sequence information, genes related to growth and immunity were identified and massive SSR markers were developed, providing valuable genetic resources for future gene functional analysis and selective breeding of L. crocea.
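The SSR identification step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which tool or thresholds the authors used, so the function name `find_ssrs` and the minimum repeat counts below are assumptions chosen for illustration only. The sketch finds perfect di-, tri- and tetranucleotide repeats with a regular expression and skips motifs that are themselves repeats of a shorter unit.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    min_repeats maps motif length -> minimum number of tandem copies.
    The defaults are illustrative assumptions, not the study's settings.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # One captured motif of length k followed by at least n - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # A motif built from a shorter repeated unit (e.g. "AA", "ACAC")
            # occurs inside its own doubling with both ends trimmed; skip it
            # so each repeat is reported once under its shortest motif.
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy transcript: an (AC)7 repeat and an (ATG)5 repeat.
toy = "TTG" + "AC" * 7 + "GGT" + "ATG" * 5 + "C"
print(find_ssrs(toy))
```

Primer design would additionally need sufficient flanking sequence on both sides of each hit, which this sketch does not check.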
Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex.

PubMed

Pollen, Alex A; Nowakowski, Tomasz J; Shuga, Joe; Wang, Xiaohui; Leyrat, Anne A; Lui, Jan H; Li, Nianzhen; Szpankowski, Lukasz; Fowler, Brian; Chen, Peilin; Ramalingam, Naveen; Sun, Gang; Thu, Myo; Norris, Michael; Lebofsky, Ronald; Toppani, Dominique; Kemp, Darnell W; Wong, Michael; Clerkson, Barry; Jones, Brittnee N; Wu, Shiquan; Knutsson, Lawrence; Alvarado, Beatriz; Wang, Jing; Weaver, Lesley S; May, Andrew P; Jones, Robert C; Unger, Marc A; Kriegstein, Arnold R; West, Jay A A

2014-10-01

Large-scale surveys of single-cell gene expression have the potential to reveal rare cell populations and lineage relationships but require efficient methods for cell capture and mRNA sequencing. Although cellular barcoding strategies allow parallel sequencing of single cells at ultra-low depths, the limitations of shallow sequencing have not been investigated directly. By capturing 301 single cells from 11 populations using microfluidics and analyzing single-cell transcriptomes across downsampled sequencing depths, we demonstrate that shallow single-cell mRNA sequencing (~50,000 reads per cell) is sufficient for unbiased cell-type classification and biomarker identification. In the developing cortex, we identify diverse cell types, including multiple progenitor and neuronal subtypes, and we identify EGR1 and FOS as previously unreported candidate targets of Notch signaling in human but not mouse radial glia. Our strategy establishes an efficient method for unbiased analysis and comparison of cell populations from heterogeneous tissue by microfluidic single-cell capture and low-coverage sequencing of many cells.
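The idea of analyzing one cell at several downsampled sequencing depths can be sketched as follows. This is not the authors' pipeline: the function name `downsample_counts`, the toy per-gene read counts, and the choice to subsample reads uniformly without replacement are all assumptions made for illustration.

```python
import random
from collections import Counter

def downsample_counts(counts, depth, seed=0):
    """Subsample a cell's per-gene read counts to a total of `depth` reads.

    Reads are drawn uniformly without replacement. If the cell already has
    `depth` reads or fewer, the counts are returned unchanged.
    """
    total = sum(counts.values())
    if depth >= total:
        return dict(counts)
    rng = random.Random(seed)
    # Expand to one gene label per read, then sample `depth` of them.
    reads = [gene for gene, n in counts.items() for _ in range(n)]
    sampled = Counter(rng.sample(reads, depth))
    return {gene: sampled.get(gene, 0) for gene in counts}

# Toy cell with hypothetical counts (gene names taken from the abstract).
cell = {"EGR1": 400, "FOS": 250, "GENE_X": 50, "GENE_Y": 300}
shallow = downsample_counts(cell, depth=100)
print(shallow, sum(shallow.values()))
```

Repeating this at a series of depths and re-running the classification at each depth would show where cell-type assignments stop changing.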
Systems Biology of Metabolic Regulation by Estrogen Receptor Signaling in Breast Cancer.

PubMed

Zhao, Yiru Chen; Madak Erdogan, Zeynep

2016-03-17

With the advent of the -omics approaches, our understanding of chronic diseases such as cancer and metabolic syndrome has improved. However, effective mining of the information in the large-scale datasets that are obtained from gene expression microarrays, deep sequencing experiments or metabolic profiling is essential to uncover and then effectively target the critical regulators of diseased cell phenotypes. Estrogen Receptor α (ERα) is one of the master transcription factors regulating the gene programs that are important for estrogen responsive breast cancers. In order to understand the role of ERα signaling in breast cancer metabolism, we utilized transcriptomic, cistromic and metabolomic data from MCF-7 cells treated with estradiol. In this report we describe the generation of samples for RNA-Seq, ChIP-Seq and metabolomics experiments and the integrative computational analysis of the obtained data. This approach is useful in delineating novel molecular mechanisms and gene regulatory circuits that are regulated by a particular transcription factor which impacts metabolism of normal or diseased cells.
Interfacing cellular networks of S. cerevisiae and E. coli: Connecting dynamic and genetic information

PubMed Central

2013-01-01

Background In recent years, various types of cellular networks have penetrated biology and are nowadays used omnipresently for studying eukaryote and prokaryote organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, is largely unexplored. Results We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms by exploiting the available interaction structure among the genes. Conclusions Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes. PMID:23663484
Novel genomic resources for a climate change sensitive mammal: characterization of the American pika transcriptome.

PubMed

Lemay, Matthew A; Henry, Philippe; Lamb, Clayton T; Robson, Kelsey M; Russello, Michael A

2013-05-10

When faced with climate change, species must either shift their home range or adapt in situ in order to maintain optimal physiological balance with their environment. The American pika (Ochotona princeps) is a small alpine mammal with limited dispersal capacity and low tolerance for thermal stress. As a result, pikas have become an important system for examining biotic responses to changing climatic conditions. Previous research using amplified fragment length polymorphisms (AFLPs) has revealed evidence for environmental-mediated selection in O. princeps populations distributed along elevation gradients, yet the anonymity of AFLP loci and lack of available genomic resources precluded the identification of associated gene regions. Here, we harnessed next-generation sequencing technology in order to characterize the American pika transcriptome and identify a large suite of single nucleotide polymorphisms (SNPs), which can be used to elucidate elevation- and site-specific patterns of sequence variation. We constructed pooled cDNA libraries of O. princeps from high (1400 m) and low (300 m) elevation sites along a previously established transect in British Columbia. Transcriptome sequencing using the Roche 454 GS FLX titanium platform generated 780 million base pairs of data, which were assembled into 7,325 high coverage contigs. These contigs were used to identify 24,261 novel SNP loci. Using high resolution melt analysis, we developed 17 of these SNPs into genotyping assays, which were validated with independent DNA samples from British Columbia, Canada, and Oregon State, USA. In addition, we detected haplotypes in the NADH dehydrogenase subunit 5 of the mitochondrial genome that were fixed and different among elevations, suggesting that this may be an informative target gene for studying the role of cellular respiration in local adaptation.
We also identified contigs that were unique to each elevation, including a high elevation-specific contig that was a positive match with the hemoglobin alpha chain from the plateau pika, a species restricted to high elevation steppes in Asia. Elevation-specific contigs may represent candidate regions subject to differential levels of gene expression along this elevation gradient. To our knowledge, this is the first broad-scale, transcriptome-level study conducted within the Ochotonidae, providing novel genomic resources for studying pika ecology, behaviour and population history.
Transcriptomic Analysis of Paulownia Infected by Paulownia Witches'-Broom Phytoplasma

PubMed Central

Zhu, Shui-Fang; Lin, Cai-Li; Tian, Guo-Zhong; Xu, Xia; Zhao, Wen-Jun

2013-01-01

Phytoplasmas are plant pathogenic bacteria that have no cell wall and are responsible for major crop losses throughout the world. Phytoplasma-infected plants show a variety of symptoms and the mechanisms they use to physiologically alter the host plants are of considerable interest, but poorly understood. In this study we undertook a detailed analysis of Paulownia infected by Paulownia witches’-broom (PaWB) Phytoplasma using high-throughput mRNA sequencing (RNA-Seq) and digital gene expression (DGE). RNA-Seq analysis identified 74,831 unigenes, which were subsequently used as reference sequences for DGE analysis of diseased and healthy Paulownia in field grown and tissue cultured plants. Our study revealed that dramatic changes occurred in the gene expression profile of Paulownia after PaWB Phytoplasma infection. Genes encoding key enzymes in cytokinin biosynthesis, such as isopentenyl diphosphate isomerase and isopentenyltransferase, were significantly induced in the infected Paulownia. Genes involved in cell wall biosynthesis and degradation were largely up-regulated and genes related to photosynthesis were down-regulated after PaWB Phytoplasma infection. Our systematic analysis provides comprehensive transcriptomic data about plants infected by Phytoplasma. This information will help further our understanding of the detailed interaction mechanisms between plants and Phytoplasma. PMID:24130859
Quantitative RNA-seq analysis of the Campylobacter jejuni transcriptome

PubMed Central

Chaudhuri, Roy R.; Yu, Lu; Kanji, Alpa; Perkins, Timothy T.; Gardner, Paul P.; Choudhary, Jyoti; Maskell, Duncan J.

2011-01-01

Campylobacter jejuni is the most common bacterial cause of foodborne disease in the developed world. Its general physiology and biochemistry, as well as the mechanisms enabling it to colonize and cause disease in various hosts, are not well understood, and new approaches are required to understand its basic biology. High-throughput sequencing technologies provide unprecedented opportunities for functional genomic research. Recent studies have shown that direct Illumina sequencing of cDNA (RNA-seq) is a useful technique for the quantitative and qualitative examination of transcriptomes. In this study we report RNA-seq analyses of the transcriptomes of C. jejuni (NCTC11168) and its rpoN mutant. This has allowed the identification of hitherto unknown transcriptional units, and further defines the regulon that is dependent on rpoN for expression. The analysis of the NCTC11168 transcriptome was supplemented by additional proteomic analysis using liquid chromatography-MS. The transcriptomic and proteomic datasets represent an important resource for the Campylobacter research community. PMID:21816880
The transcriptome of Bathymodiolus azoricus gill reveals expression of genes from endosymbionts and free-living deep-sea bacteria.

PubMed

Egas, Conceição; Pinheiro, Miguel; Gomes, Paula; Barroso, Cristina; Bettencourt, Raul

2012-08-01

Deep-sea environments are largely unexplored habitats where a surprising number of species may be found in large communities, thriving regardless of the darkness, extreme cold, and high pressure. Their unique geochemical features result in reducing environments rich in methane and sulfides, sustaining complex chemosynthetic ecosystems that represent one of the most surprising findings in oceans in the last 40 years. The deep-sea Lucky Strike hydrothermal vent field, located in the Mid Atlantic Ridge, is home to large vent mussel communities where Bathymodiolus azoricus represents the dominant faunal biomass, owing its survival to symbiotic associations with methylotrophic or methanotrophic and thiotrophic bacteria. The recent transcriptome sequencing and analysis of gill tissues from B. azoricus revealed a number of genes of bacterial origin, hereby analyzed to provide a functional insight into the gill microbial community. The transcripts supported a metabolically active microbiome and a variety of mechanisms and pathways, evidencing also the sulfur and methane metabolisms. Taxonomic affiliation of transcripts and 16S rRNA community profiling revealed a microbial community dominated by thiotrophic and methanotrophic endosymbionts of B. azoricus and the presence of a Sulfurovum-like epsilonbacterium.
Analysis of Transcriptomic Dose Response Data in the ...

EPA Pesticide Factsheets

Slide presentation at the HESI-HEALTH Canada-McGill Workshop on Transcriptomic Dose Response Data in the Context of Chemical Risk Assessment
Developmental Transcriptome for a Facultatively Eusocial Bee, Megalopta genalis

PubMed Central

Jones, Beryl M.; Wcislo, William T.; Robinson, Gene E.

2015-01-01

Transcriptomes provide excellent foundational resources for mechanistic and evolutionary analyses of complex traits. We present a developmental transcriptome for the facultatively eusocial bee Megalopta genalis, which represents a potential transition point in the evolution of eusociality. A de novo transcriptome assembly of Megalopta genalis was generated using paired-end Illumina sequencing and the Trinity assembler. Males and females of all life stages were aligned to this transcriptome for analysis of gene expression profiles throughout development. Gene Ontology analysis indicates that stage-specific genes are involved in ion transport, cell–cell signaling, and metabolism. A number of distinct biological processes are upregulated in each life stage, and transitions between life stages involve shifts in dominant functional processes, including shifts from transcriptional regulation in embryos to metabolism in larvae, and increased lipid metabolism in adults. We expect that this transcriptome will provide a useful resource for future analyses to better understand the molecular basis of the evolution of eusociality and, more generally, phenotypic plasticity. PMID:26276382
Developmental Transcriptome for a Facultatively Eusocial Bee, Megalopta genalis.

PubMed

Jones, Beryl M; Wcislo, William T; Robinson, Gene E

2015-08-14

Transcriptomes provide excellent foundational resources for mechanistic and evolutionary analyses of complex traits. We present a developmental transcriptome for the facultatively eusocial bee Megalopta genalis, which represents a potential transition point in the evolution of eusociality. A de novo transcriptome assembly of Megalopta genalis was generated using paired-end Illumina sequencing and the Trinity assembler. Males and females of all life stages were aligned to this transcriptome for analysis of gene expression profiles throughout development. Gene Ontology analysis indicates that stage-specific genes are involved in ion transport, cell-cell signaling, and metabolism. A number of distinct biological processes are upregulated in each life stage, and transitions between life stages involve shifts in dominant functional processes, including shifts from transcriptional regulation in embryos to metabolism in larvae, and increased lipid metabolism in adults. We expect that this transcriptome will provide a useful resource for future analyses to better understand the molecular basis of the evolution of eusociality and, more generally, phenotypic plasticity. Copyright © 2015 Jones et al.
Transcriptome In Vivo Analysis (TIVA) of spatially defined single cells in intact live mouse and human brain tissue

PubMed Central

Lovatt, Ditte; Ruble, Brittani K.; Lee, Jaehee; Dueck, Hannah; Kim, Tae Kyung; Fisher, Stephen; Francis, Chantal; Spaethling, Jennifer M.; Wolf, John A.; Grady, M. Sean; Ulyanova, Alexandra V.; Yeldell, Sean B.; Griepenburg, Julianne C.; Buckley, Peter T.; Kim, Junhyong; Sul, Jai-Yoon; Dmochowski, Ivan J.; Eberwine, James

2014-01-01

Transcriptome profiling is an indispensable tool in advancing the understanding of single cell biology, but depends upon methods capable of isolating mRNA at the spatial resolution of a single cell. Current capture methods lack sufficient spatial resolution to isolate mRNA from individual in vivo resident cells without damaging adjacent tissue. Because of this limitation, it has been difficult to assess the influence of the microenvironment on the transcriptome of individual neurons. Here, we engineered a Transcriptome In Vivo Analysis (TIVA)-tag, which upon photoactivation enables mRNA capture from single cells in live tissue. Using the TIVA-tag in combination with RNA-seq to analyze transcriptome variance among single dispersed cells and in vivo resident mouse and human neurons, we show that the tissue microenvironment shapes the transcriptomic landscape of individual cells. The TIVA methodology provides the first noninvasive approach for capturing mRNA from single cells in their natural microenvironment. PMID:24412976
A sense of life: computational and experimental investigations with models of biochemical and evolutionary processes.

PubMed

Mishra, Bud; Daruwala, Raoul-Sam; Zhou, Yi; Ugel, Nadia; Policriti, Alberto; Antoniotti, Marco; Paxia, Salvatore; Rejali, Marc; Rudra, Archisman; Cherepinsky, Vera; Silver, Naomi; Casey, William; Piazza, Carla; Simeoni, Marta; Barbano, Paolo; Spivak, Marina; Feng, Jiawu; Gill, Ofer; Venkatesh, Mysore; Cheng, Fang; Sun, Bing; Ioniata, Iuliana; Anantharaman, Thomas; Hubbard, E Jane Albert; Pnueli, Amir; Harel, David; Chandru, Vijay; Hariharan, Ramesh; Wigler, Michael; Park, Frank; Lin, Shih-Chieh; Lazebnik, Yuri; Winkler, Franz; Cantor, Charles R; Carbone, Alessandra; Gromov, Mikhael

2003-01-01

We collaborate in a research program aimed at creating a rigorous framework, experimental infrastructure, and computational environment for understanding, experimenting with, manipulating, and modifying a diverse set of fundamental biological processes at multiple scales and spatio-temporal modes. The novelty of our research is based on an approach that (i) requires coevolution of experimental science and theoretical techniques and (ii) exploits a certain universality in biology guided by a parsimonious model of evolutionary mechanisms operating at the genomic level and manifesting at the proteomic, transcriptomic, phylogenic, and other higher levels. Our current program in "systems biology" endeavors to marry large-scale biological experiments with the tools to ponder and reason about large, complex, and subtle natural systems. To achieve this ambitious goal, ideas and concepts are combined from many different fields: biological experimentation, applied mathematical modeling, computational reasoning schemes, and large-scale numerical and symbolic simulations. From a biological viewpoint, the basic issues are many: (i) understanding common and shared structural motifs among biological processes; (ii) modeling biological noise due to interactions among a small number of key molecules or loss of synchrony; (iii) explaining the robustness of these systems in spite of such noise; and (iv) cataloging multistatic behavior and adaptation exhibited by many biological processes.
Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to Mud crab reovirus (MCRV).

PubMed

Liu, Shanshan; Chen, Guanxing; Xu, Haidong; Zou, Weibin; Yan, Wenrui; Wang, Qianqian; Deng, Hengwei; Zhang, Heqian; Yu, Guojiao; He, Jianguo; Weng, Shaoping

2017-01-01

Mud crab (Scylla paramamosain) is an economically important marine cultured species in China's coastal area. Mud crab reovirus (MCRV) is the most important pathogen of mud crab, resulting in large economic losses in crab farming. In this paper, next-generation sequencing technology and bioinformatics analysis are used to study transcriptome differences between MCRV-infected mud crab and normal control. A total of 104.3 million clean reads were obtained, including 52.7 million and 51.6 million clean reads from MCRV-infected (CA) and control (HA) mud crabs, respectively. A total of 81,901, 70,059 and 67,279 unigenes were obtained from HA reads, CA reads and HA&CA reads, respectively. A total of 32,547 unigenes from HA&CA reads, called All-Unigenes, were matched to at least one database among the Nr, Nt, Swiss-prot, COG, GO and KEGG databases. Among these, 13,039, 20,260 and 11,866 unigenes belonged to the 3, 258 and 25 categories of the GO, KEGG pathway, and COG databases, respectively. Solexa/Illumina's DGE platform was also used, and about 13,856 differentially expressed genes (DEGs), including 4444 significantly upregulated and 9412 downregulated DEGs, were detected in diseased crabs compared with the control. KEGG pathway analysis revealed that DEGs were obviously enriched in the pathways related to different diseases or infections. This transcriptome analysis provided valuable information on gene functions associated with the response to MCRV in mud crab, as well as detailed information for identifying novel genes in the absence of the mud crab genome database. Copyright © 2016. Published by Elsevier Ltd.
Comparative transcriptome analysis of pepper (Capsicum annuum) revealed common regulons in multiple stress conditions and hormone treatments.

PubMed

Lee, Sanghyeob; Choi, Doil

2013-09-01

Global transcriptome analysis revealed common regulons for biotic/abiotic stresses, and some of these regulons encoding signaling components in both stresses were newly identified in this study. In this study, we aimed to identify plant responses to multiple stress conditions and discover the common regulons activated under a variety of stress conditions. Global transcriptome analysis revealed that salicylic acid (SA) may affect the activation of abiotic stress-responsive genes in pepper. Our data indicate that methyl jasmonate (MeJA) and ethylene (ET)-responsive genes were primarily activated by biotic stress, while abscisic acid (ABA)-responsive genes were activated under both types of stresses. We also identified differentially expressed gene (DEG) responses to specific stress conditions. Biotic stress induces more DEGs than those induced by abiotic and hormone applications. The clustering analysis using DEGs indicates that there are common regulons for biotic or abiotic stress conditions. Although SA and MeJA have an antagonistic effect on gene expression levels, SA and MeJA show a largely common regulation as compared to the regulation at the DEG expression level induced by other hormones. We also monitored the expression profiles of DEG encoding signaling components. Twenty-two percent of these were commonly expressed in both stress conditions. The importance of this study is that several genes commonly regulated by both stress conditions may have future applications for creating broadly stress-tolerant pepper plants. This study revealed that there are complex regulons in pepper plant to both biotic and abiotic stress conditions.
PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms.

PubMed

Gan, Ruei-Chi; Chen, Ting-Wen; Wu, Timothy H; Huang, Po-Jung; Lee, Chi-Ching; Yeh, Yuan-Ming; Chiu, Cheng-Hsun; Huang, Hsien-Da; Tang, Petrus

2016-12-22

Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interests. However, there are only a few organisms having reference genomic sequences and even fewer having well-defined or curated annotations. For transcriptome studies focusing on organisms lacking proper reference genomes, the common strategy is de novo assembly followed by functional annotation. However, things become even more complicated when multiple transcriptomes are compared. Here, we propose a new analysis strategy and quantification methods for quantifying expression level which not only generate a virtual reference from sequencing data, but also provide comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled together for de novo assembly. The assembled contigs are searched against NCBI NR databases to find potential homolog sequences. Based on the searched result, a set of virtual transcripts are generated and served as a reference transcriptome. By using the same reference, normalized quantification values including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM) can be obtained that are comparable across transcriptome datasets. In order to demonstrate the feasibility of our strategy, we implement it in the web service PARRoT. PARRoT stands for Pipeline for Analyzing RNA Reads of Transcriptomes. It analyzes gene expression profiles for two transcriptome sequencing datasets. For better understanding of the biological meaning from the comparison among transcriptomes, PARRoT further provides linkage between these virtual transcripts and their potential function through showing best hits in SwissProt, NR database, assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within just three hours. 
In this study, we proposed and implemented a strategy to analyze transcriptomes from non-reference organisms, which offers the opportunity to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy effectively avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on the field of comparative genomics for non-model organisms. We have implemented PARRoT as a web service which is freely available at http://parrot.cgu.edu.tw .
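The normalized values named in the abstract build on the standard RPKM and TPM definitions. The following is a minimal sketch of those standard formulas only. The "estimated" variants (eRPKM, eTPM) that PARRoT reports are not specified in the abstract, and the function names and toy numbers below are illustrative assumptions, not PARRoT code.

```python
def rpkm(counts, lengths):
    """Standard RPKM: reads per kilobase of transcript per million mapped reads.

    counts  -- read counts per transcript
    lengths -- transcript lengths in bases (same order as counts)
    """
    total_reads = sum(counts)
    return [c * 1e9 / (total_reads * length) for c, length in zip(counts, lengths)]


def tpm(counts, lengths):
    """Standard TPM: length-normalized rates rescaled to sum to one million."""
    rates = [c / length for c, length in zip(counts, lengths)]
    rate_sum = sum(rates)
    return [r * 1e6 / rate_sum for r in rates]


# Hypothetical toy data: three virtual transcripts from one dataset.
example_counts = [100, 200, 700]
example_lengths = [1000, 2000, 1000]
example_rpkm = rpkm(example_counts, example_lengths)
example_tpm = tpm(example_counts, example_lengths)
```

Because TPM values always sum to one million within a dataset, they are often easier to compare across datasets than RPKM values, provided both datasets are quantified against the same reference.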
WormBase ParaSite - a comprehensive resource for helminth genomics.

PubMed

Howe, Kevin L; Bolt, Bruce J; Shafie, Myriam; Kersey, Paul; Berriman, Matthew

2017-07-01

The number of publicly available parasitic worm genome sequences has increased dramatically in the past three years, and research interest in helminth functional genomics is now quickly gathering pace in response to the foundation that has been laid by these collective efforts. A systematic approach to the organisation, curation, analysis and presentation of these data is clearly vital for maximising the utility of these data to researchers. We have developed a portal called WormBase ParaSite (http://parasite.wormbase.org) for interrogating helminth genomes on a large scale. Data from over 100 nematode and platyhelminth species are integrated, adding value by way of systematic and consistent functional annotation (e.g. protein domains and Gene Ontology terms), gene expression analysis (e.g. alignment of life-stage specific transcriptome data sets), and comparative analysis (e.g. orthologues and paralogues). We provide several ways of exploring the data, including genome browsers, genome and gene summary pages, text search, sequence search, a query wizard, bulk downloads, and programmatic interfaces. In this review, we provide an overview of the back-end infrastructure and analysis behind WormBase ParaSite, and the displays and tools available to users for interrogating helminth genomic data. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

Spatial transcriptomic survey of human embryonic cerebral cortex by single-cell RNA-seq analysis.

PubMed

Fan, Xiaoying; Dong, Ji; Zhong, Suijuan; Wei, Yuan; Wu, Qian; Yan, Liying; Yong, Jun; Sun, Le; Wang, Xiaoye; Zhao, Yangyu; Wang, Wei; Yan, Jie; Wang, Xiaoqun; Qiao, Jie; Tang, Fuchou

2018-06-04

The cellular complexity of human brain development has been intensively investigated, although a regional characterization of the entire human cerebral cortex based on single-cell transcriptome analysis has not been reported. Here, we performed RNA-seq on over 4,000 individual cells from 22 brain regions of human mid-gestation embryos. We identified 29 cell sub-clusters, which showed different proportions in each region; the pons showed an especially high percentage of astrocytes. Embryonic neurons were not as diverse as adult neurons, although they possessed important features of their destinies in adults. Neuron development was unsynchronized in the cerebral cortex, as dorsal regions appeared to be more mature than ventral regions at this stage. Region-specific genes were comprehensively identified in each neuronal sub-cluster, and a large proportion of these genes were neural disease related. Our results present a systematic landscape of the regionalized gene expression and neuron maturation of the human cerebral cortex.
Germination Potential of Dormant and Nondormant Arabidopsis Seeds Is Driven by Distinct Recruitment of Messenger RNAs to Polysomes

PubMed Central

Basbouss-Serhal, Isabelle; Soubigou-Taconnat, Ludivine; Bailly, Christophe; Leymarie, Juliette

2015-01-01

Dormancy is a complex evolutionary trait that temporally prevents seed germination, thus allowing seedling growth at a favorable season. High-throughput analyses of transcriptomes have led to significant progress in understanding the molecular regulation of this process, but the role of posttranscriptional mechanisms has received little attention. In this work, we have studied the dynamics of messenger RNA association with polysomes and compared the transcriptome with the translatome in dormant and nondormant seeds of Arabidopsis (Arabidopsis thaliana) during their imbibition at 25°C in darkness, a temperature preventing germination of dormant seeds only. DNA microarray analysis revealed that 4,670 and 7,028 transcripts were differentially abundant in dormant and nondormant seeds in the transcriptome and the translatome, respectively. We show that there is no correlation between transcriptome and translatome and that germination regulation is also largely translational, implying a selective and dynamic recruitment of messenger RNAs to polysomes in both dormant and nondormant seeds. The study of 5′ untranslated region features revealed that GC content and the number of upstream open reading frames could play a role in selective translation occurring during germination. Gene Ontology clustering showed that the functions of polysome-associated transcripts differed between dormant and nondormant seeds and revealed actors in seed dormancy and germination. In conclusion, our results demonstrate the essential role of selective polysome loading in this biological process. PMID:26019300
Transcriptomics Profiling of Alzheimer’s Disease Reveal Neurovascular Defects, Altered Amyloid-β Homeostasis, and Deregulated Expression of Long Noncoding RNAs

PubMed Central

Magistri, Marco; Velmeshev, Dmitry; Makhmutova, Madina; Faghihi, Mohammad Ali

2015-01-01

Abstract The underlying genetic variations of late-onset Alzheimer’s disease (LOAD) cases remain largely unknown. A combination of genetic variations with variable penetrance and lifetime epigenetic factors may converge on transcriptomic alterations that drive the LOAD pathological process. Transcriptome profiling using deep sequencing technology offers insight into common altered pathways regardless of underpinning genetic or epigenetic factors and thus represents an ideal tool to investigate molecular mechanisms related to the pathophysiology of LOAD. We performed directional RNA sequencing on high-quality RNA samples extracted from hippocampi of LOAD cases and age-matched controls. We further validated our data using qRT-PCR on a larger set of postmortem brain tissues, confirming downregulation of the gene encoding substance P (TAC1) and upregulation of the gene encoding the plasminogen activator inhibitor-1 (SERPINE1). Pathway analysis indicates dysregulation in neural communication, cerebral vasculature, and amyloid-β clearance. Besides protein-coding genes, we identified several annotated and non-annotated long noncoding RNAs that are differentially expressed in LOAD brain tissues; three of them are regulated in an activity-dependent manner and one is induced by Aβ1-42 exposure of human neural cells. Our data provide a comprehensive list of transcriptomic alterations in LOAD hippocampi and warrant a holistic approach including both coding and non-coding RNAs in functional studies aimed at understanding the pathophysiology of LOAD. PMID:26402107
Reptilian-transcriptome v1.0, a glimpse in the brain transcriptome of five divergent Sauropsida lineages and the phylogenetic position of turtles

PubMed Central

2011-01-01

Background Reptiles are largely under-represented in comparative genomics despite the fact that they are substantially more diverse in many respects than mammals. Given the high divergence of reptiles from classical model species, next-generation sequencing of their transcriptomes is an approach of choice for gene identification and annotation. Results Here, we use 454 technology to sequence the brain transcriptome of four divergent reptilian and one reference avian species: the Nile crocodile, the corn snake, the bearded dragon, the red-eared turtle, and the chicken. Using an in-house pipeline for recursive similarity searches of >3,000,000 reads against multiple databases from 7 reference vertebrates, we compile a reptilian comparative transcriptomics dataset, with homology assignment for 20,000 to 31,000 transcripts per species and a cumulated non-redundant sequence length of 248.6 Mbases. Our approach identifies the majority (87%) of chicken brain transcripts and about 50% of de novo assembled reptilian transcripts. In addition to 57,502 microsatellite loci, we identify thousands of SNP and indel polymorphisms for population genetic and linkage analyses. We also build very large multiple alignments for Sauropsida and mammals (two million residues per species) and perform extensive phylogenetic analyses suggesting that turtles are not basal living reptiles but are rather associated with Archosaurians, hence, potentially answering a long-standing question in the phylogeny of Amniotes. Conclusions The reptilian transcriptome (freely available at http://www.reptilian-transcriptomes.org) should prove a useful new resource as reptiles are becoming important new models for comparative genomics, ecology, and evolutionary developmental genetics. PMID:21943375
An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies

PubMed Central

2012-01-01

Background The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG di-nucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and study design are all different from transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. Particularly, the main challenge of the normalization of "regulation microarrays" is (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate. Results We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures. Conclusion T-quantile normalization applied separately on the channels and Tukey's biweight scaling outperform other methods in terms of the conservation of enriched and un-enriched signal separation, as well as in identification of genomic regions known to be enriched. T-quantile normalization is preferable as it additionally improves comparability between microarrays. 
In contrast, popular normalization approaches like quantile, LOWESS, Peng's method and VSN normalization alter the data distributions of regulation microarrays to such an extent that using these approaches will impact the reliability of the downstream analysis substantially. PMID:22276688
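To make concrete what plain quantile normalization does to a set of arrays, here is a minimal NumPy sketch. It implements only the generic quantile method, not the T-quantile variant that the abstract recommends, whose details the abstract does not give. The function name, the naive handling of ties, and the toy matrix are assumptions for illustration.

```python
import numpy as np


def quantile_normalize(x):
    """Force every column (array) of x to share one empirical distribution.

    x is a probes-by-arrays matrix. Each value is replaced by the mean of
    the values holding the same rank across all columns. Ties are broken
    by sort order, which is a simplification.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)          # rank order within each column
    mean_quantiles = np.sort(x, axis=0).mean(axis=1)
    out = np.empty_like(x)
    for j in range(x.shape[1]):
        out[order[:, j], j] = mean_quantiles
    return out


# Hypothetical 4-probe, 3-array example.
toy = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 4.0, 6.0],
                [4.0, 2.0, 8.0]])
normalized = quantile_normalize(toy)
```

After this step all columns contain exactly the same set of values, which illustrates how an array with a genuinely larger enriched fraction can lose that difference, consistent with the concern about altered data distributions raised in the abstract.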
High dimensional biological data retrieval optimization with NoSQL technology.

PubMed

Wang, Shicai; Pandis, Ioannis; Wu, Chao; He, Sijin; Johnson, David; Emam, Ibrahim; Guitton, Florian; Guo, Yike

2014-01-01

High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. 
We aim to use this new data model as a basis for migrating tranSMART's implementation to a more scalable solution for Big Data.
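The idea of a key-value layout for expression data can be sketched with an in-memory stand-in. The abstract does not describe the actual HBase row-key or column design, so the "study:patient" row key, the probe-as-column layout, and every name below are hypothetical. A plain Python dictionary plays the role of the store.

```python
from collections import defaultdict


class KeyValueExpressionStore:
    """Toy key-value store: one row per (study, patient), one column per probe.

    Hypothetical layout for illustration; not the schema from the paper.
    """

    def __init__(self):
        self._rows = defaultdict(dict)

    @staticmethod
    def _row_key(study, patient):
        return f"{study}:{patient}"

    def put(self, study, patient, probe, value):
        self._rows[self._row_key(study, patient)][probe] = value

    def get_patients(self, study, patients, probes=None):
        """Fetch whole expression rows for many patients by direct key lookup."""
        result = {}
        for patient in patients:
            row = self._rows.get(self._row_key(study, patient), {})
            if probes is None:
                result[patient] = dict(row)
            else:
                result[patient] = {p: row[p] for p in probes if p in row}
        return result


# Hypothetical identifiers, used only to exercise the sketch.
store = KeyValueExpressionStore()
store.put("study_A", "patient_1", "probe_x", 7.2)
store.put("study_A", "patient_1", "probe_y", 5.1)
store.put("study_A", "patient_2", "probe_x", 6.8)
rows = store.get_patients("study_A", ["patient_1", "patient_2"])
```

The design point is that a query for hundreds of patients becomes a set of direct row lookups, each returning all probes for one patient, instead of a join over a table with one record per probe measurement.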
High dimensional biological data retrieval optimization with NoSQL technology

PubMed Central

2014-01-01

Background High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. Results In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. Conclusions The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. 
We aim to use this new data model as a basis for migrating tranSMART's implementation to a more scalable solution for Big Data. PMID:25435347
Transcriptome Profiling of the Abdominal Skin of Larimichthys crocea in Light Stress

NASA Astrophysics Data System (ADS)

Han, Zhaofang; Lv, Changhuan; Xiao, Shijun; Ye, Kun; Zhang, Dongling; Tsai, Huai Jen; Wang, Zhiyong

2018-04-01

Large yellow croaker (Larimichthys crocea), one of the most important marine fish species in China, can change its abdominal skin color when it is shifted from light to dark or from dark to light, providing an opportunity to investigate the molecular response mechanisms of teleosts to light stress. The gene expression profile of fish under light stress is rarely documented. In this research, the transcriptome profiles of the abdominal skin of L. crocea exposed to light or dark for 0 h, 0.5 h and 2 h were produced by next-generation sequencing (NGS). The cluster results demonstrated that stress period, rather than light intensity (e.g., light or dark), is the major influencing factor. Differentially expressed genes (DEGs) were identified between 0 h and 0.5 h groups, between 0 h and 2 h groups, between 0.5 h light and 0.5 h dark, and between 2 h light and 2 h dark. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation revealed that the genes relating to immunity, energy metabolism, and cytoskeletal protein binding were significantly enriched. The detailed analysis of transcriptome profiles also revealed regular gene expression trends, indicating that elaborate gene regulation networks underlie the molecular responses of the fish to light stress. This transcriptome analysis suggested that systematic and complicated regulatory cascades were functionally activated in response to external stress, and coloration change caused by light stress was mainly attributed to the change in the density of chromatophores for L. crocea. This study also provided valuable information for skin coloration or light stress research on other marine fish species.
Gene expression signature of cerebellar hypoplasia in a mouse model of Down syndrome during postnatal development

PubMed Central

Laffaire, Julien; Rivals, Isabelle; Dauphinot, Luce; Pasteau, Fabien; Wehrle, Rosine; Larrat, Benoit; Vitalis, Tania; Moldrich, Randal X; Rossier, Jean; Sinkus, Ralph; Herault, Yann; Dusart, Isabelle; Potier, Marie-Claude

2009-01-01

Background Down syndrome is a chromosomal disorder caused by the presence of three copies of chromosome 21. The mechanisms by which this aneuploidy produces the complex and variable phenotype observed in people with Down syndrome are still under discussion. Recent studies have demonstrated an increased transcript level of the three-copy genes with some dosage compensation or amplification for a subset of them. The impact of this gene dosage effect on the whole transcriptome is still debated and longitudinal studies assessing the variability among samples, tissues and developmental stages are needed. Results We thus designed a large scale gene expression study in mice (the Ts1Cje Down syndrome mouse model) in which we could measure the effects of trisomy 21 on a large number of samples (74 in total) in a tissue that is affected in Down syndrome (the cerebellum) and where we could quantify the defect during postnatal development in order to correlate gene expression changes to the phenotype observed. Statistical analysis of microarray data revealed a major gene dosage effect: for the three-copy genes as well as for a 2 Mb segment from mouse chromosome 12 that we show for the first time as being deleted in the Ts1Cje mice. This gene dosage effect impacts moderately on the expression of euploid genes (2.4 to 7.5% differentially expressed). Only 13 genes were significantly dysregulated in Ts1Cje mice at all four postnatal development stages studied from birth to 10 days after birth, and among them are 6 three-copy genes. The decrease in granule cell proliferation demonstrated in newborn Ts1Cje cerebellum was correlated with a major gene dosage effect on the transcriptome in dissected cerebellar external granule cell layer. Conclusion High throughput gene expression analysis in the cerebellum of a large number of samples of Ts1Cje and euploid mice has revealed a prevailing gene dosage effect on triplicated genes. 
Moreover using an enriched cell population that is thought responsible for the cerebellar hypoplasia in Down syndrome, a global destabilization of gene expression was not detected. Altogether these results strongly suggest that the three-copy genes are directly responsible for the phenotype present in cerebellum. We provide here a short list of candidate genes. PMID:19331679
Ultra-low input transcriptomics reveal the spore functional content and phylogenetic affiliations of poorly studied arbuscular mycorrhizal fungi.

PubMed

Beaudet, Denis; Chen, Eric C H; Mathieu, Stephanie; Yildirir, Gokalp; Ndikumana, Steve; Dalpé, Yolande; Séguin, Sylvie; Farinelli, Laurent; Stajich, Jason E; Corradi, Nicolas

2017-12-02

Arbuscular mycorrhizal fungi (AMF) are a group of soil microorganisms that establish symbioses with the vast majority of land plants. To date, generation of AMF coding information has been limited to model genera that grow well axenically: Rhizoglomus and Gigaspora. Meanwhile, data on the functional gene repertoire of most AMF families are non-existent. Here, we provide primary large-scale transcriptome data from eight poorly studied AMF species (Acaulospora morrowiae, Diversispora versiforme, Scutellospora calospora, Racocetra castanea, Paraglomus brasilianum, Ambispora leptoticha, Claroideoglomus claroideum and Funneliformis mosseae) using ultra-low input ribonucleic acid (RNA)-seq approaches. Our analyses reveal that quiescent spores of many AMF species harbour high functional diversity, and they solidify known evolutionary relationships within the group. Our findings demonstrate that RNA-seq data obtained from low-input RNA are reliable in comparison to conventional RNA-seq experiments. Thus, our methodology can potentially be used to deepen our understanding of fungal microbial function and phylogeny using minute amounts of RNA material. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Construction of a Species-Level Tree of Life for the Insects and Utility in Taxonomic Profiling

PubMed Central

Chesters, Douglas

2017-01-01

Abstract Although comprehensive phylogenies have proven an invaluable tool in ecology and evolution, their construction is made increasingly challenging both by the scale and structure of publicly available sequences. The distinct partition between gene-rich (genomic) and species-rich (DNA barcode) data is a feature of data that has been largely overlooked, yet presents a key obstacle to scaling supermatrix analysis. I present a phyloinformatics framework for draft construction of a species-level phylogeny of insects (Class Insecta). Matrix-building requires separately optimized pipelines for nuclear transcriptomic, mitochondrial genomic, and species-rich markers, whereas tree-building requires hierarchical inference in order to capture species-breadth while retaining deep-level resolution. The phylogeny of insects contains 49,358 species, 13,865 genera, and 760 families. Deep-level splits largely reflected previous findings for sections of the tree that are data rich or unambiguous, such as inter-ordinal Endopterygota and Dictyoptera, the recently evolved and relatively homogeneous Lepidoptera, Hymenoptera, Brachycera (Diptera), and Cucujiformia (Coleoptera). However, analysis of bias, matrix construction and gene-tree variation suggests confidence in some relationships (such as in Polyneoptera) is less than has been indicated by the matrix bootstrap method. To assess the utility of the insect tree as a tool in query profiling, several tree-based taxonomic assignment methods are compared. Using test data sets with existing taxonomic annotations, a tendency is observed for greater accuracy of species-level assignments when using a fixed comprehensive tree of life in contrast to methods generating smaller de novo reference trees. Described herein is a solution to the discrepancy in the way data are fit into supermatrices. 
The resulting tree facilitates wider studies of insect diversification and application of advanced descriptions of diversity in community studies, among other presumed applications. PMID:27798407
A detailed gene expression study of the Miscanthus genus reveals changes in the transcriptome associated with the rejuvenation of spring rhizomes.

PubMed

Barling, Adam; Swaminathan, Kankshita; Mitros, Therese; James, Brandon T; Morris, Juliette; Ngamboma, Ornella; Hall, Megan C; Kirkpatrick, Jessica; Alabady, Magdy; Spence, Ashley K; Hudson, Matthew E; Rokhsar, Daniel S; Moose, Stephen P

2013-12-09

The Miscanthus genus of perennial C4 grasses contains promising biofuel crops for temperate climates. However, few genomic resources exist for Miscanthus, which limits understanding of its interesting biology and future genetic improvement. A comprehensive catalog of expressed sequences was generated from a variety of Miscanthus species and tissue types, with an emphasis on characterizing gene expression changes in spring compared to fall rhizomes. Illumina short read sequencing technology was used to produce transcriptome sequences from different tissues and organs during distinct developmental stages for multiple Miscanthus species, including Miscanthus sinensis, Miscanthus sacchariflorus, and their interspecific hybrid Miscanthus × giganteus. More than fifty billion base-pairs of Miscanthus transcript sequence were produced. Overall, 26,230 Sorghum gene models (i.e., ~96% of predicted Sorghum genes) had at least five Miscanthus reads mapped to them, suggesting that a large portion of the Miscanthus transcriptome is represented in this dataset. The Miscanthus × giganteus data were used to identify genes preferentially expressed in a single tissue, such as the spring rhizome, using Sorghum bicolor as a reference. Quantitative real-time PCR was used to verify examples of preferential expression predicted via RNA-Seq. Contiguous consensus transcript sequences were assembled for each species and annotated using InterProScan. Sequences from the assembled transcriptome were used to amplify genomic segments from a doubled haploid Miscanthus sinensis and from Miscanthus × giganteus to further disentangle the allelic and paralogous variations in genes. This large expressed sequence tag collection creates a valuable resource for the study of Miscanthus biology by providing detailed gene sequence information and tissue-preferred expression patterns. 
We have successfully generated a database of transcriptome assemblies and demonstrated its use in the study of genes of interest. Analysis of gene expression profiles revealed biological pathways that exhibit altered regulation in spring compared to fall rhizomes, which are consistent with their different physiological functions. The expression profiles of the subterranean rhizome provide a better understanding of the biological activities of the underground stem structures that are essential for perenniality and the storage or remobilization of carbon and nutrient resources.
Insights into transcriptomes of Big and Low sagebrush

Treesearch

Mark D. Huynh; Justin T. Page; Bryce A. Richardson; Joshua A. Udall

2015-01-01

We report the sequencing and assembly of three transcriptomes from Big (Artemisia tridentata ssp. wyomingensis and A. tridentata ssp. tridentata) and Low (A. arbuscula ssp. arbuscula) sagebrush. The sequence reads are available in the Sequence Read Archive of NCBI. We demonstrate the utilities of these transcriptomes for gene discovery and phylogenomic analysis. An...
Molecular characterization of pyrethroid resistance in the olive fruit fly Bactrocera oleae.

PubMed

Pavlidi, Nena; Kampouraki, Anastasia; Tseliou, Vasilis; Wybouw, Nicky; Dermauw, Wannes; Roditakis, Emmanouil; Nauen, Ralf; Van Leeuwen, Thomas; Vontas, John

2018-06-01

A reduction of pyrethroid efficacy has been recently recorded in Bactrocera oleae, the most destructive insect pest of olives. The resistance levels of field populations collected from Crete, Greece, reached up to 22-fold compared to reference laboratory strains. Sequence analysis of the IIS4-IIS6 region of the para sodium channel gene in a large number of resistant flies indicated that resistance may not be associated with target site mutations, in line with previous studies in other Tephritidae species. We analyzed the transcriptomic differences between two resistant populations versus an almost susceptible field population and two laboratory strains. A large number of genes was found to be significantly differentially transcribed across the pairwise comparisons. Interestingly, gene set analysis revealed that genes of the 'electron carrier activity' GO group were enriched in one specific comparison, which might suggest a P450-mediated resistance mechanism. The up-regulation of several transcripts encoding detoxification enzymes was validated by qPCR, focusing on transcripts coding for P450s. Of note, the expression of contig00436 and contig02103, encoding CYP6 P450s, was significantly higher in all resistant populations, compared to susceptible ones. These results suggest that an increase in the amount of the CYP6 P450s might be an important mechanism of pyrethroid resistance in B. oleae. Copyright © 2018 Elsevier Inc. All rights reserved.
Using scale and feather traits for module construction provides a functional approach to chicken epidermal development.

PubMed

Bao, Weier; Greenwold, Matthew J; Sawyer, Roger H

2017-11-01

Gene co-expression network analysis has been a research method widely used in systematically exploring gene function and interaction. Using the Weighted Gene Co-expression Network Analysis (WGCNA) approach to construct a gene co-expression network using data from a customized 44K microarray transcriptome of chicken epidermal embryogenesis, we have identified two distinct modules that are highly correlated with scale or feather development traits. Signaling pathways related to feather development were enriched in the traditional KEGG pathway analysis and functional terms relating specifically to embryonic epidermal development were also enriched in the Gene Ontology analysis. Significant enrichment annotations were discovered from customized enrichment tools such as Modular Single-Set Enrichment Test (MSET) and Medical Subject Headings (MeSH). Hub genes in both trait-correlated modules showed strong specific functional enrichment toward epidermal development. Also, regulatory elements, such as transcription factors and miRNAs, were targeted in the significant enrichment result. This work highlights the advantage of this methodology for functional prediction of genes not previously associated with scale- and feather trait-related modules.
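The first step of a weighted co-expression network can be sketched in a few lines. This is a simplified NumPy illustration of the unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, followed by per-gene connectivity. It omits the topological overlap, clustering, and module-trait correlation steps of a full WGCNA analysis, and the power beta = 6 and the toy matrix are assumptions, not values taken from the study.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency from a samples-by-genes expression matrix.

    Raising |correlation| to the power beta suppresses weak correlations
    while keeping the network weighted rather than hard-thresholded.
    """
    corr = np.corrcoef(np.asarray(expr, dtype=float), rowvar=False)
    adjacency = np.abs(corr) ** beta
    np.fill_diagonal(adjacency, 0.0)   # ignore self-connections
    return adjacency


def connectivity(adjacency):
    """Sum of connection weights per gene; high values mark hub candidates."""
    return adjacency.sum(axis=0)


# Hypothetical toy data: 4 samples, 3 genes; genes 0 and 1 are co-expressed.
toy_expr = np.array([[1.0, 2.0, 5.0],
                     [2.0, 4.0, 1.0],
                     [3.0, 6.0, 4.0],
                     [4.0, 8.0, 2.0]])
toy_adj = soft_threshold_adjacency(toy_expr)
toy_k = connectivity(toy_adj)
```

In this sketch the hub genes mentioned in the abstract would correspond to the genes with the highest connectivity within a module.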
Quantitative phenotyping via deep barcode sequencing.

PubMed

Smith, Andrew M; Heisler, Lawrence E; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J; Chee, Mark; Roth, Frederick P; Giaever, Guri; Nislow, Corey

2009-10-01

Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or "Bar-seq," outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that approximately 20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene-environment interactions on a genome-wide scale.
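The core of a barcode-counting fitness assay can be sketched as follows: count reads per known barcode in a control and a treated pool, then compare relative abundances. This is a simplified illustration, not the published Bar-seq analysis routine. It assumes reads are already trimmed to the barcode and match exactly, and the function names, the pseudocount, and the toy barcodes are hypothetical.

```python
import math
from collections import Counter


def count_barcodes(reads, known_barcodes):
    """Count reads that exactly match a barcode from the reference list."""
    known = set(known_barcodes)
    return Counter(read for read in reads if read in known)


def log2_ratios(control, treated, known_barcodes, pseudocount=1.0):
    """Log2 of treated vs. control abundance, normalized by library size.

    A negative value means the strain was depleted under treatment.
    The pseudocount avoids taking the log of zero for absent barcodes.
    """
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    if control_total == 0 or treated_total == 0:
        raise ValueError("both pools need at least one matching read")
    ratios = {}
    for barcode in known_barcodes:
        c = (control.get(barcode, 0) + pseudocount) / control_total
        t = (treated.get(barcode, 0) + pseudocount) / treated_total
        ratios[barcode] = math.log2(t / c)
    return ratios


# Hypothetical toy pools with two strains; "NNNN" is an unrecognized read.
barcodes = ["ACGT", "TTGA"]
control_counts = count_barcodes(["ACGT"] * 4 + ["TTGA"] * 4 + ["NNNN"], barcodes)
treated_counts = count_barcodes(["ACGT"] * 1 + ["TTGA"] * 7, barcodes)
fitness = log2_ratios(control_counts, treated_counts, barcodes)
```

Given the abstract's finding that roughly 20% of barcodes and priming sequences differed from expectation, the reference barcode list passed to such a routine would need to be the revised, re-sequenced list for exact matching to work well.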
Combined analysis of DNA methylome and transcriptome reveal novel candidate genes with susceptibility to bovine Staphylococcus aureus subclinical mastitis.

PubMed

Song, Minyan; He, Yanghua; Zhou, Huangkai; Zhang, Yi; Li, Xizhi; Yu, Ying

2016-07-14

Subclinical mastitis is a widely spread disease of lactating cows. Its major pathogen is Staphylococcus aureus (S. aureus). In this study, we performed genome-wide integrative analysis of DNA methylation and transcriptional expression to identify candidate genes and pathways relevant to bovine S. aureus subclinical mastitis. The genome-scale DNA methylation profiles of peripheral blood lymphocytes in cows with S. aureus subclinical mastitis (SA group) and healthy controls (CK) were generated by methylated DNA immunoprecipitation combined with microarrays. We identified 1078 differentially methylated genes in SA cows compared with the controls. By integrating DNA methylation and transcriptome data, 58 differentially methylated genes were shared with differently expressed genes, in which 20.7% distinctly hypermethylated genes showed down-regulated expression in SA versus CK, whereas 14.3% dramatically hypomethylated genes showed up-regulated expression. Integrated pathway analysis suggested that these genes were related to inflammation, ErbB signalling pathway and mismatch repair. Further functional analysis revealed that three genes, NRG1, MST1 and NAT9, were strongly correlated with the progression of S. aureus subclinical mastitis and could be used as powerful biomarkers for the improvement of bovine mastitis resistance. Our studies lay the groundwork for epigenetic modification and mechanistic studies on susceptibility of bovine mastitis.
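The integration step described in this abstract, finding genes that are both differentially methylated and differentially expressed and then checking for the inverse pattern, can be sketched with plain set operations. The gene identifiers and values below are hypothetical placeholders (positive means higher in SA than in CK); they are not data from the study.

```python
def integrate(methylation, expression):
    """Return shared genes plus the hypermethylated/down and hypomethylated/up subsets."""
    shared = sorted(methylation.keys() & expression.keys())
    hyper_down = [g for g in shared if methylation[g] > 0 and expression[g] < 0]
    hypo_up = [g for g in shared if methylation[g] < 0 and expression[g] > 0]
    return shared, hyper_down, hypo_up


# Hypothetical log-ratios, SA versus CK (illustration only).
methylation = {"gene1": 1.2, "gene2": -0.8, "gene3": 0.9, "gene4": -1.1}
expression = {"gene1": -1.5, "gene2": 2.0, "gene3": 0.7, "gene5": -0.4}

shared, hyper_down, hypo_up = integrate(methylation, expression)
print(shared, hyper_down, hypo_up)
```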
Combined analysis of DNA methylome and transcriptome reveal novel candidate genes with susceptibility to bovine Staphylococcus aureus subclinical mastitis

PubMed Central

Song, Minyan; He, Yanghua; Zhou, Huangkai; Zhang, Yi; Li, Xizhi; Yu, Ying

2016-01-01

Subclinical mastitis is a widely spread disease of lactating cows. Its major pathogen is Staphylococcus aureus (S. aureus). In this study, we performed genome-wide integrative analysis of DNA methylation and transcriptional expression to identify candidate genes and pathways relevant to bovine S. aureus subclinical mastitis. The genome-scale DNA methylation profiles of peripheral blood lymphocytes in cows with S. aureus subclinical mastitis (SA group) and healthy controls (CK) were generated by methylated DNA immunoprecipitation combined with microarrays. We identified 1078 differentially methylated genes in SA cows compared with the controls. By integrating DNA methylation and transcriptome data, 58 differentially methylated genes were shared with differently expressed genes, in which 20.7% distinctly hypermethylated genes showed down-regulated expression in SA versus CK, whereas 14.3% dramatically hypomethylated genes showed up-regulated expression. Integrated pathway analysis suggested that these genes were related to inflammation, ErbB signalling pathway and mismatch repair. Further functional analysis revealed that three genes, NRG1, MST1 and NAT9, were strongly correlated with the progression of S. aureus subclinical mastitis and could be used as powerful biomarkers for the improvement of bovine mastitis resistance. Our studies lay the groundwork for epigenetic modification and mechanistic studies on susceptibility of bovine mastitis. PMID:27411928
Decoding genes with coexpression networks and metabolomics - 'majority report by precogs'.

PubMed

Saito, Kazuki; Hirai, Masami Y; Yonekura-Sakakibara, Keiko

2008-01-01

Following the sequencing of whole genomes of model plants, high-throughput decoding of gene function is a major challenge in modern plant biology. In view of remarkable technical advances in transcriptomics and metabolomics, integrated analysis of these 'omics' by data-mining informatics is an excellent tool for prediction and identification of gene function, particularly for genes involved in complicated metabolic pathways. The availability of Arabidopsis public transcriptome datasets containing data of >1000 microarrays reinforces the potential for prediction of gene function by transcriptome coexpression analysis. Here, we review the strategy of combining transcriptome and metabolome as a powerful technology for studying the functional genomics of model plants and also crop and medicinal plants.
Horizontal gene transfer is a significant driver of gene innovation in dinoflagellates.

PubMed

Wisecaver, Jennifer H; Brosnahan, Michael L; Hackett, Jeremiah D

2013-01-01

The dinoflagellates are an evolutionarily and ecologically important group of microbial eukaryotes. Previous work suggests that horizontal gene transfer (HGT) is an important source of gene innovation in these organisms. However, dinoflagellate genomes are notoriously large and complex, making genomic investigation of this phenomenon impractical with currently available sequencing technology. Fortunately, de novo transcriptome sequencing and assembly provides an alternative approach for investigating HGT. We sequenced the transcriptome of the dinoflagellate Alexandrium tamarense Group IV to investigate how HGT has contributed to gene innovation in this group. Our comprehensive A. tamarense Group IV gene set was compared with those of 16 other eukaryotic genomes. Ancestral gene content reconstruction of ortholog groups shows that A. tamarense Group IV has the largest number of gene families gained (314-1,563 depending on inference method) relative to all other organisms in the analysis (0-782). Phylogenomic analysis indicates that genes horizontally acquired from bacteria are a significant proportion of this gene influx, as are genes transferred from other eukaryotes either through HGT or endosymbiosis. The dinoflagellates also display curious cases of gene loss associated with mitochondrial metabolism including the entire Complex I of oxidative phosphorylation. Some of these missing genes have been functionally replaced by bacterial and eukaryotic xenologs. The transcriptome of A. tamarense Group IV lends strong support to a growing body of evidence that dinoflagellate genomes are extraordinarily impacted by HGT.

Horizontal Gene Transfer is a Significant Driver of Gene Innovation in Dinoflagellates

PubMed Central

Wisecaver, Jennifer H.; Brosnahan, Michael L.; Hackett, Jeremiah D.

2013-01-01

The dinoflagellates are an evolutionarily and ecologically important group of microbial eukaryotes. Previous work suggests that horizontal gene transfer (HGT) is an important source of gene innovation in these organisms. However, dinoflagellate genomes are notoriously large and complex, making genomic investigation of this phenomenon impractical with currently available sequencing technology. Fortunately, de novo transcriptome sequencing and assembly provides an alternative approach for investigating HGT. We sequenced the transcriptome of the dinoflagellate Alexandrium tamarense Group IV to investigate how HGT has contributed to gene innovation in this group. Our comprehensive A. tamarense Group IV gene set was compared with those of 16 other eukaryotic genomes. Ancestral gene content reconstruction of ortholog groups shows that A. tamarense Group IV has the largest number of gene families gained (314–1,563 depending on inference method) relative to all other organisms in the analysis (0–782). Phylogenomic analysis indicates that genes horizontally acquired from bacteria are a significant proportion of this gene influx, as are genes transferred from other eukaryotes either through HGT or endosymbiosis. The dinoflagellates also display curious cases of gene loss associated with mitochondrial metabolism including the entire Complex I of oxidative phosphorylation. Some of these missing genes have been functionally replaced by bacterial and eukaryotic xenologs. The transcriptome of A. tamarense Group IV lends strong support to a growing body of evidence that dinoflagellate genomes are extraordinarily impacted by HGT. PMID:24259313
Fructose overfeeding in first-degree relatives of type 2 diabetic patients impacts energy metabolism and mitochondrial functions in skeletal muscle.

PubMed

Seyssel, Kevin; Meugnier, Emmanuelle; Lê, Kim-Anne; Durand, Christine; Disse, Emmanuel; Blond, Emilie; Pays, Laurent; Nataf, Serge; Brozek, John; Vidal, Hubert; Tappy, Luc; Laville, Martine

2016-12-01

The aim of the study was to assess the effects of a high-fructose diet (HFrD) on skeletal muscle transcriptomic response in healthy offspring of patients with type 2 diabetes, a subgroup of individuals prone to metabolic disorders. Ten healthy normal weight first-degree relatives of type 2 diabetic patients were submitted to a HFrD (+3.5 g fructose/kg fat-free mass per day) during 7 days. A global transcriptomic analysis was performed on skeletal muscle biopsies combined with in vitro experiments using primary myotubes. Transcriptomic analysis highlighted profound effects on fatty acid oxidation and mitochondrial pathways supporting the whole-body metabolic shift with the preferential use of carbohydrates instead of lipids. Bioinformatics tools pointed out possible transcription factors orchestrating this genomic regulation, such as PPARα and NR4A2. In vitro experiments in human myotubes suggested an indirect action of fructose in skeletal muscle, which seemed to be independent from lactate, uric acid, or nitric oxide. This study shows therefore that a large cluster of genes related to energy metabolism, mitochondrial function, and lipid oxidation was downregulated after 7 days of HFrD, thus supporting the concept that overconsumption of fructose-containing foods could contribute to metabolic deterioration in humans. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Belowground neighbor perception in Arabidopsis thaliana studied by transcriptome analysis: roots of Hieracium pilosella cause biotic stress

PubMed Central

Schmid, Christoph; Bauer, Sibylle; Müller, Benedikt; Bartelheimer, Maik

2013-01-01

Root-root interactions are much more sophisticated than previously thought, yet the mechanisms of belowground neighbor perception remain largely obscure. Genome-wide transcriptome analyses allow detailed insight into plant reactions to environmental cues. A root interaction trial was set up to explore both morphological and whole genome transcriptional responses in roots of Arabidopsis thaliana in the presence or absence of an inferior competitor, Hieracium pilosella. Neighbor perception was indicated by Arabidopsis roots predominantly growing away from the neighbor (segregation), while solitary plants placed more roots toward the middle of the pot. Total biomass remained unaffected. Database comparisons in transcriptome analysis revealed considerable similarity between Arabidopsis root reactions to neighbors and reactions to pathogens. Detailed analyses of the functional category “biotic stress” using MapMan tools found the sub-category “pathogenesis-related proteins” highly significantly induced. A comparison to a study on intraspecific competition brought forward a core of genes consistently involved in reactions to neighbor roots. We conclude that beyond resource depletion roots perceive neighboring roots or their associated microorganisms by a relatively uniform mechanism that involves the strong induction of pathogenesis-related proteins. In an ecological context the findings reveal that belowground neighbor detection may occur independently of resource depletion, allowing for a time advantage for the root to prepare for potential interactions. PMID:23967000
De Novo Assembly and Analysis of Polygonatum sibiricum Transcriptome and Identification of Genes Involved in Polysaccharide Biosynthesis.

PubMed

Wang, Shiqiang; Wang, Bin; Hua, Wenping; Niu, Junfeng; Dang, Kaikai; Qiang, Yi; Wang, Zhezhi

2017-09-12

Polygonatum sibiricum polysaccharides (PSPs) are used to improve immunity, alleviate dryness, promote the secretion of fluids, and quench thirst. However, the PSP biosynthetic pathway is largely unknown. Understanding the genetic background will help delineate that pathway at the molecular level so that researchers can develop better conservation strategies. After comparing the PSP contents among several different P. sibiricum germplasms, we selected two groups with the largest contrasts in contents and subjected them to HiSeq2500 transcriptome sequencing to identify the candidate genes involved in PSP biosynthesis. In all, 20 kinds of enzyme-encoding genes were related to PSP biosynthesis. The polysaccharide content was positively correlated with the expression patterns of β-fructofuranosidase (sacA), fructokinase (scrK), UDP-glucose 4-epimerase (GALE), Mannose-1-phosphate guanylyltransferase (GMPP), and UDP-glucose 6-dehydrogenase (UGDH), but negatively correlated with the expression of Hexokinase (HK). Through qRT-PCR validation and comprehensive analysis, we determined that sacA, HK, and GMPP are key genes for enzymes within the PSP metabolic pathway in P. sibiricum. Our results provide a public transcriptome dataset for this species and an outline of pathways for the production of polysaccharides in medicinal plants. They also present more information about the PSP biosynthesis pathway at the molecular level in P. sibiricum and lay the foundation for subsequent research of gene functions.
Single-nucleus RNA-seq of differentiating human myoblasts reveals the extent of fate heterogeneity

PubMed Central

Zeng, Weihua; Jiang, Shan; Kong, Xiangduo; El-Ali, Nicole; Ball, Alexander R.; Ma, Christopher I-Hsing; Hashimoto, Naohiro; Yokomori, Kyoko; Mortazavi, Ali

2016-01-01

Myoblasts are precursor skeletal muscle cells that differentiate into fused, multinucleated myotubes. Current single-cell microfluidic methods are not optimized for capturing very large, multinucleated cells such as myotubes. To circumvent the problem, we performed single-nucleus transcriptome analysis. Using immortalized human myoblasts, we performed RNA-seq analysis of single cells (scRNA-seq) and single nuclei (snRNA-seq) and found them comparable, with a distinct enrichment for long non-coding RNAs (lncRNAs) in snRNA-seq. We then compared snRNA-seq of myoblasts before and after differentiation. We observed the presence of mononucleated cells (MNCs) that remained unfused and analyzed separately from multi-nucleated myotubes. We found that while the transcriptome profiles of myoblast and myotube nuclei are relatively homogeneous, MNC nuclei exhibited significant heterogeneity, with the majority of them adopting a distinct mesenchymal state. Primary transcripts for microRNAs (miRNAs) that participate in skeletal muscle differentiation were among the most differentially expressed lncRNAs, which we validated using NanoString. Our study demonstrates that snRNA-seq provides reliable transcriptome quantification for cells that are otherwise not amenable to current single-cell platforms. Our results further indicate that snRNA-seq has unique advantage in capturing nucleus-enriched lncRNAs and miRNA precursors that are useful in mapping and monitoring differential miRNA expression during cellular differentiation. PMID:27566152
De Novo Assembly and Analysis of Polygonatum sibiricum Transcriptome and Identification of Genes Involved in Polysaccharide Biosynthesis

PubMed Central

Wang, Shiqiang; Wang, Bin; Hua, Wenping; Niu, Junfeng; Dang, Kaikai; Qiang, Yi; Wang, Zhezhi

2017-01-01

Polygonatum sibiricum polysaccharides (PSPs) are used to improve immunity, alleviate dryness, promote the secretion of fluids, and quench thirst. However, the PSP biosynthetic pathway is largely unknown. Understanding the genetic background will help delineate that pathway at the molecular level so that researchers can develop better conservation strategies. After comparing the PSP contents among several different P. sibiricum germplasms, we selected two groups with the largest contrasts in contents and subjected them to HiSeq2500 transcriptome sequencing to identify the candidate genes involved in PSP biosynthesis. In all, 20 kinds of enzyme-encoding genes were related to PSP biosynthesis. The polysaccharide content was positively correlated with the expression patterns of β-fructofuranosidase (sacA), fructokinase (scrK), UDP-glucose 4-epimerase (GALE), Mannose-1-phosphate guanylyltransferase (GMPP), and UDP-glucose 6-dehydrogenase (UGDH), but negatively correlated with the expression of Hexokinase (HK). Through qRT-PCR validation and comprehensive analysis, we determined that sacA, HK, and GMPP are key genes for enzymes within the PSP metabolic pathway in P. sibiricum. Our results provide a public transcriptome dataset for this species and an outline of pathways for the production of polysaccharides in medicinal plants. They also present more information about the PSP biosynthesis pathway at the molecular level in P. sibiricum and lay the foundation for subsequent research of gene functions. PMID:28895881
Combined Analysis of the Chloroplast Genome and Transcriptome of the Antarctic Vascular Plant Deschampsia antarctica Desv

PubMed Central

Lee, Jungeun; Kang, Yoonjee; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok

2014-01-01

Background: Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. Results: The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5′- or 3′-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. Conclusions: We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers understand the characteristics of the chloroplast transcriptome. PMID:24647560
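The quadripartite structure reported in this abstract can be checked with simple arithmetic: one large single-copy region, one small single-copy region, and two copies of the inverted repeat should add up to the full plastome length, and the three gene classes should add up to the unique gene count. The short sketch below uses only the figures quoted in the abstract; the variable names are illustrative.

```python
# Region lengths in bp, as quoted in the abstract.
lsc = 79_881   # large single-copy region
ssc = 12_519   # small single-copy region
ir = 21_481    # one inverted repeat; the plastome carries two copies

total_length = lsc + ssc + 2 * ir
print(total_length)

# Unique gene counts, as quoted in the abstract.
protein_coding, trna, rrna = 81, 29, 4
unique_genes = protein_coding + trna + rrna
print(unique_genes)
```

Both sums agree with the reported totals of 135,362 bp and 114 unique genes.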
Transcriptome Analysis of the Arabidopsis Megaspore Mother Cell Uncovers the Importance of RNA Helicases for Plant Germline Development

PubMed Central

Schmidt, Anja; Wuest, Samuel E.; Vijverberg, Kitty; Baroux, Célia; Kleen, Daniela; Grossniklaus, Ueli

2011-01-01

Germ line specification is a crucial step in the life cycle of all organisms. For sexual plant reproduction, the megaspore mother cell (MMC) is of crucial importance: it marks the first cell of the plant “germline” lineage that gets committed to undergo meiosis. One of the meiotic products, the functional megaspore, subsequently gives rise to the haploid, multicellular female gametophyte that harbours the female gametes. The MMC is formed by selection and differentiation of a single somatic, sub-epidermal cell in the ovule. The transcriptional network underlying MMC specification and differentiation is largely unknown. We provide the first transcriptome analysis of an MMC using the model plant Arabidopsis thaliana with a combination of laser-assisted microdissection and microarray hybridizations. Statistical analyses identified an over-representation of translational regulation control pathways and a significant enrichment of DEAD/DEAH-box helicases in the MMC transcriptome, paralleling important features of the animal germline. Analysis of two independent T-DNA insertion lines suggests an important role of an enriched helicase, MNEME (MEM), in MMC differentiation and the restriction of the germline fate to only one cell per ovule primordium. In heterozygous mem mutants, additional enlarged MMC-like cells, which sometimes initiate female gametophyte development, were observed at higher frequencies than in the wild type. This closely resembles the phenotype of mutants affected in the small RNA and DNA-methylation pathways important for epigenetic regulation. Importantly, the mem phenotype shows features of apospory, as female gametophytes initiate from two non-sister cells in these mutants. Moreover, in mem gametophytic nuclei, both higher order chromatin structure and the distribution of LIKE HETEROCHROMATIN PROTEIN1 were affected, indicating epigenetic perturbations. 
In summary, the MMC transcriptome sets the stage for future functional characterization as illustrated by the identification of MEM, a novel gene involved in the restriction of germline fate. PMID:21949639
Subgroup-Elimination Transcriptomics Identifies Signaling Proteins that Define Subclasses of TRPV1-Positive Neurons and a Novel Paracrine Circuit

PubMed Central

Isensee, Jörg; Wenzel, Carsten; Buschow, Rene; Weissmann, Robert; Kuss, Andreas W.; Hucho, Tim

2014-01-01

Normal and painful stimuli are detected by specialized subgroups of peripheral sensory neurons. The understanding of the functional differences of each neuronal subgroup would be strongly enhanced by knowledge of the respective subgroup transcriptome. The separation of the subgroup of interest, however, has proven challenging as they can hardly be enriched. Instead of enriching, we now rapidly eliminated the subgroup of neurons expressing the heat-gated cation channel TRPV1 from dissociated rat sensory ganglia. Elimination was accomplished by brief treatment with TRPV1 agonists followed by the removal of compromised TRPV1(+) neurons using density centrifugation. By differential microarray and sequencing (RNA-Seq) based expression profiling we compared the transcriptome of all cells within sensory ganglia versus the same cells lacking TRPV1 expressing neurons, which revealed 240 differentially expressed genes (adj. p<0.05, fold-change>1.5). Corroborating the specificity of the approach, many of these genes have been reported to be involved in noxious heat or pain sensitization. Beyond the expected enrichment of ion channels, we found the TRPV1 transcriptome to be enriched for GPCRs and other signaling proteins involved in adenosine, calcium, and phosphatidylinositol signaling. Quantitative population analysis using a recent High Content Screening (HCS) microscopy approach identified substantial heterogeneity of expressed target proteins even within TRPV1-positive neurons. Signaling components defined distinct further subgroups within the population of TRPV1-positive neurons. Analysis of one such signaling system showed that the pain sensitizing prostaglandin PGD2 activates DP1 receptors expressed predominantly on TRPV1(+) neurons. In contrast, we found the PGD2 producing prostaglandin D synthase to be expressed exclusively in myelinated large-diameter neurons lacking TRPV1, which suggests a novel paracrine neuron-neuron communication. 
Thus, subgroup analysis based on the elimination rather than enrichment of the subgroup of interest revealed proteins that define subclasses of TRPV1-positive neurons and suggests a novel paracrine circuit. PMID:25551770
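The selection criterion quoted in this abstract (adj. p<0.05, fold-change>1.5) can be sketched as a two-step filter. The abstract does not say which multiple-testing correction was used, so the Benjamini-Hochberg procedure below is an assumption, as is applying the fold-change cutoff in both directions; the gene names and values are hypothetical.

```python
from math import log2


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(genes, pvals, fold_changes, alpha=0.05, min_fc=1.5):
    """Keep genes with adjusted p < alpha and fold change beyond min_fc either way."""
    adj = bh_adjust(pvals)
    return [
        g for g, q, fc in zip(genes, adj, fold_changes)
        if q < alpha and abs(log2(fc)) > log2(min_fc)
    ]


# Hypothetical inputs (illustration only).
genes = ["geneA", "geneB", "geneC", "geneD"]
pvals = [0.01, 0.04, 0.03, 0.5]
fold_changes = [2.0, 3.0, 0.4, 1.1]
selected = select_genes(genes, pvals, fold_changes)
print(selected)
```

In this toy input geneB and geneC have large fold changes, but their adjusted p-values rise just above 0.05, so only geneA survives both cutoffs.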
Combined analysis of the chloroplast genome and transcriptome of the Antarctic vascular plant Deschampsia antarctica Desv.

PubMed

Lee, Jungeun; Kang, Yoonjee; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok

2014-01-01

Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5'- or 3'-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers understand the characteristics of the chloroplast transcriptome.
Transcriptomic analysis of flower development in wintersweet (Chimonanthus praecox).

PubMed

Liu, Daofeng; Sui, Shunzhao; Ma, Jing; Li, Zhineng; Guo, Yulong; Luo, Dengpan; Yang, Jianfeng; Li, Mingyang

2014-01-01

Wintersweet (Chimonanthus praecox) is familiar as a garden plant and woody ornamental flower. On account of its unique flowering time and strong fragrance, it has a high ornamental and economic value. Despite a long history of human cultivation, our understanding of wintersweet genetics and molecular biology remains scant, reflecting a lack of basic genomic and transcriptomic data. In this study, we assembled three cDNA libraries, from three successive stages in flower development, designated as the flower bud with displayed petal, open flower and senescing flower stages. Using the Illumina RNA-Seq method, we obtained 21,412,928, 26,950,404, 24,912,954 qualified Illumina reads, respectively, for the three successive stages. The pooled reads from all three libraries were then assembled into 106,995 transcripts, 51,793 of which were annotated in the NCBI non-redundant protein database. Of these annotated sequences, 32,649 and 21,893 transcripts were assigned to gene ontology categories and clusters of orthologous groups, respectively. We could map 15,587 transcripts onto 312 pathways using the Kyoto Encyclopedia of Genes and Genomes pathway database. Based on these transcriptomic data, we obtained a large number of candidate genes that were differentially expressed at the open flower and senescing flower stages. An analysis of differentially expressed genes involved in plant hormone signal transduction pathways indicated that although flower opening and senescence may be independent of the ethylene signaling pathway in wintersweet, salicylic acid may be involved in the regulation of flower senescence. We also succeeded in isolating key genes of floral scent biosynthesis and proposed a biosynthetic pathway for monoterpenes and sesquiterpenes in wintersweet flowers, based on the annotated sequences. This comprehensive transcriptomic analysis presents fundamental information on the genes and pathways which are involved in flower development in wintersweet. 
And our data provided a useful database for further research of wintersweet and other Calycanthaceae family plants.
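The read and annotation counts in this abstract lend themselves to a quick summary calculation. The sketch below uses only the numbers quoted above and derives the pooled read total and the share of assembled transcripts with a hit in the NCBI non-redundant database; the stage labels and variable names are illustrative shorthand.

```python
# Qualified Illumina reads per library, as quoted in the abstract.
reads = {
    "flower bud with displayed petal": 21_412_928,
    "open flower": 26_950_404,
    "senescing flower": 24_912_954,
}
total_reads = sum(reads.values())

# Assembly and annotation counts, as quoted in the abstract.
transcripts = 106_995
annotated_nr = 51_793
mapped_kegg = 15_587

annotated_pct = 100 * annotated_nr / transcripts
kegg_pct_of_annotated = 100 * mapped_kegg / annotated_nr

print(total_reads)
print(round(annotated_pct, 1), round(kegg_pct_of_annotated, 1))
```

Roughly 73.3 million reads were pooled, a little under half of the assembled transcripts received a non-redundant database annotation, and about three in ten of those annotated transcripts were placed on a KEGG pathway.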
Transcriptomic Analysis of Flower Development in Wintersweet (Chimonanthus praecox)

PubMed Central

Liu, Daofeng; Sui, Shunzhao; Ma, Jing; Li, Zhineng; Guo, Yulong; Luo, Dengpan; Yang, Jianfeng; Li, Mingyang

2014-01-01

Wintersweet (Chimonanthus praecox) is familiar as a garden plant and woody ornamental flower. On account of its unique flowering time and strong fragrance, it has a high ornamental and economic value. Despite a long history of human cultivation, our understanding of wintersweet genetics and molecular biology remains scant, reflecting a lack of basic genomic and transcriptomic data. In this study, we assembled three cDNA libraries, from three successive stages in flower development, designated as the flower bud with displayed petal, open flower and senescing flower stages. Using the Illumina RNA-Seq method, we obtained 21,412,928, 26,950,404, 24,912,954 qualified Illumina reads, respectively, for the three successive stages. The pooled reads from all three libraries were then assembled into 106,995 transcripts, 51,793 of which were annotated in the NCBI non-redundant protein database. Of these annotated sequences, 32,649 and 21,893 transcripts were assigned to gene ontology categories and clusters of orthologous groups, respectively. We could map 15,587 transcripts onto 312 pathways using the Kyoto Encyclopedia of Genes and Genomes pathway database. Based on these transcriptomic data, we obtained a large number of candidate genes that were differentially expressed at the open flower and senescing flower stages. An analysis of differentially expressed genes involved in plant hormone signal transduction pathways indicated that although flower opening and senescence may be independent of the ethylene signaling pathway in wintersweet, salicylic acid may be involved in the regulation of flower senescence. We also succeeded in isolating key genes of floral scent biosynthesis and proposed a biosynthetic pathway for monoterpenes and sesquiterpenes in wintersweet flowers, based on the annotated sequences. This comprehensive transcriptomic analysis presents fundamental information on the genes and pathways which are involved in flower development in wintersweet. 
And our data provided a useful database for further research of wintersweet and other Calycanthaceae family plants. PMID:24489818
Differentiation of Symbiotic Cells and Endosymbionts in Medicago truncatula Nodulation Are Coupled to Two Transcriptome-Switches

PubMed Central

Maunoury, Nicolas; Redondo-Nieto, Miguel; Bourcy, Marie; Van de Velde, Willem; Alunni, Benoit; Laporte, Philippe; Durand, Patricia; Agier, Nicolas; Marisa, Laetitia; Vaubert, Danièle; Delacroix, Hervé; Duc, Gérard; Ratet, Pascal; Aggerbeck, Lawrence; Kondorosi, Eva; Mergaert, Peter

2010-01-01

The legume plant Medicago truncatula establishes a symbiosis with the nitrogen-fixing bacterium Sinorhizobium meliloti which takes place in root nodules. The formation of nodules employs a complex developmental program involving organogenesis, specific cellular differentiation of the host cells and the endosymbiotic bacteria, called bacteroids, as well as the specific activation of a large number of plant genes. By using a collection of plant and bacterial mutants inducing non-functional, Fix− nodules, we studied the differentiation processes of the symbiotic partners together with the nodule transcriptome, with the aim of unravelling links between cell differentiation and transcriptome activation. Two waves of transcriptional reprogramming involving the repression and the massive induction of hundreds of genes were observed during wild-type nodule formation. The dominant features of this “nodule-specific transcriptome” were the repression of plant defense-related genes, the transient activation of cell cycle and protein synthesis genes at the early stage of nodule development and the activation of the secretory pathway along with a large number of transmembrane and secretory proteins or peptides throughout organogenesis. The fifteen plant and bacterial mutants that were analyzed fell into four major categories. Members of the first category of mutants formed non-functional nodules although they had differentiated nodule cells and bacteroids. This group passed the two transcriptome switch-points similarly to the wild type. The second category, which formed nodules in which the plant cells were differentiated and infected but the bacteroids did not differentiate, passed the first transcriptome switch but not the second one. Nodules in the third category contained infection threads but were devoid of differentiated symbiotic cells and displayed a root-like transcriptome. 
Nodules in the fourth category were free of bacteria, devoid of differentiated symbiotic cells and also displayed a root-like transcriptome. A correlation thus exists between the differentiation of symbiotic nodule cells and the first wave of nodule specific gene activation and between differentiation of rhizobia to bacteroids and the second transcriptome wave in nodules. The differentiation of symbiotic cells and of bacteroids may therefore constitute signals for the execution of these transcriptome-switches. PMID:20209049
A transcriptome atlas of rabbit revealed by PacBio single-molecule long-read sequencing.

PubMed

Chen, Shi-Yi; Deng, Feilong; Jia, Xianbo; Li, Cao; Lai, Song-Jia

2017-08-09

It is widely acknowledged that transcriptional diversity contributes substantially to biological regulation in eukaryotes. Since the advent of second-generation sequencing technologies, a large number of RNA sequencing studies have considerably improved our understanding of transcriptome complexity. However, obtaining full-length transcripts remains a major challenge because of the difficulties of short read-based assembly. In the present study we employ PacBio single-molecule long-read sequencing technology for whole-transcriptome profiling in rabbit (Oryctolagus cuniculus). In total, we obtain 36,186 high-confidence transcripts from 14,474 genic loci, among which more than 23% of genic loci and 66% of isoforms are not yet annotated in the current reference genome. Furthermore, about 17% of transcripts are computationally predicted to be non-coding RNAs. Up to 24,797 alternative splicing (AS) and 11,184 alternative polyadenylation (APA) events are detected within this de novo constructed transcriptome. The results provide a comprehensive set of reference transcripts and hence contribute to improved annotation of the rabbit genome.
Transcriptome analysis in non-model species: a new method for the analysis of heterologous hybridization on microarrays

PubMed Central

2010-01-01

Background Recent developments in high-throughput methods of analyzing transcriptomic profiles are promising for many areas of biology, including ecophysiology. However, although commercial microarrays are available for most common laboratory models, transcriptome analysis in non-traditional model species still remains a challenge. Indeed, the signal resulting from heterologous hybridization is low and difficult to interpret because of the weak complementarity between probe and target sequences, especially when no microarray dedicated to a genetically close species is available. Results We show here that transcriptome analysis in a species genetically distant from laboratory models is made possible by using MAXRS, a new method of analyzing heterologous hybridization on microarrays. This method takes advantage of the design of several commercial microarrays, with different probes targeting the same transcript. To illustrate and test this method, we analyzed the transcriptome of king penguin pectoralis muscle hybridized to Affymetrix chicken microarrays, two organisms separated by an evolutionary distance of approximately 100 million years. The differential gene expression observed between different physiological situations computed by MAXRS was confirmed by real-time PCR on 10 genes out of 11 tested. Conclusions MAXRS appears to be an appropriate method for gene expression analysis under heterologous hybridization conditions. PMID:20509979
Arkas: Rapid reproducible RNAseq analysis

PubMed Central

Colombo, Anthony R.; J. Triche Jr, Timothy; Ramsingh, Giridharan

2017-01-01

The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. We offer the cloud-scale RNAseq pipelines Arkas-Quantification and Arkas-Analysis, available within Illumina’s BaseSpace cloud application platform, which expedite Kallisto preparatory routines, reliably calculate differential expression, and perform gene-set enrichment of REACTOME pathways. Due to inherent inefficiencies of scale, Illumina's BaseSpace computing platform offers a massively parallel distributive environment improving data management services and data importing. Arkas-Quantification deploys Kallisto for parallel cloud computations and is conveniently integrated downstream from the BaseSpace Sequence Read Archive (SRA) import/conversion application titled SRA Import. Arkas-Analysis annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, and performs differential expression and gene-set enrichment analysis on both coding genes and transcripts. The Arkas cloud pipeline supports ENSEMBL transcriptomes and can be used downstream from the SRA Import, facilitating raw sequencing importing, SRA FASTQ conversion, RNA quantification and analysis steps. PMID:28868134
A deep transcriptomic analysis of pod development in the vanilla orchid (Vanilla planifolia).

PubMed

Rao, Xiaolan; Krom, Nick; Tang, Yuhong; Widiez, Thomas; Havkin-Frenkel, Daphna; Belanger, Faith C; Dixon, Richard A; Chen, Fang

2014-11-07

Pods of the vanilla orchid (Vanilla planifolia) accumulate large amounts of the flavor compound vanillin (3-methoxy, 4-hydroxy-benzaldehyde) as a glucoside during the later stages of their development. At earlier stages, the developing seeds within the pod synthesize a novel lignin polymer, catechyl (C) lignin, in their coats. Genomic resources for determining the biosynthetic routes to these compounds and other flavor components in V. planifolia are currently limited. Using next-generation sequencing technologies, we have generated very large gene sequence datasets from vanilla pods at different times of development, and representing different tissue types, including the seeds, hairs, placental and mesocarp tissues. This developmental series was chosen as being the most informative for interrogation of pathways of vanillin and C-lignin biosynthesis in the pod and seed, respectively. The combined 454/Illumina RNA-seq platforms provide both deep sequence coverage and high quality de novo transcriptome assembly for this non-model crop species. The annotated sequence data provide a foundation for understanding multiple aspects of the biochemistry and development of the vanilla bean, as exemplified by the identification of candidate genes involved in lignin biosynthesis. Our transcriptome data indicate that C-lignin formation in the seed coat involves coordinate expression of monolignol biosynthetic genes with the exception of those encoding the caffeoyl coenzyme A 3-O-methyltransferase for conversion of caffeoyl to feruloyl moieties. This database provides a general resource for further studies on this important flavor species.
Single-cell analysis of the transcriptome and its application in the characterization of stem cells and early embryos.

PubMed

Liu, Na; Liu, Lin; Pan, Xinghua

2014-07-01

Cellular heterogeneity within a cell population is a common phenomenon in multicellular organisms, tissues, cultured cells, and even FACS-sorted subpopulations. Important information may be masked if the cells are studied as a mass. Transcriptome profiling is a parameter that has been intensively studied, and relatively easier to address than protein composition. To understand the basis and importance of heterogeneity and stochastic aspects of the cell function and its mechanisms, it is essential to examine transcriptomes of a panel of single cells. High-throughput technologies, starting from microarrays and now RNA-seq, provide a full view of the expression of transcriptomes but are limited by the amount of RNA for analysis. Recently, several new approaches for amplification and sequencing the transcriptome of single cells or a limited low number of cells have been developed and applied. In this review, we summarize these major strategies, such as PCR-based methods, IVT-based methods, phi29-DNA polymerase-based methods, and several other methods, including their principles, characteristics, advantages, and limitations, with representative applications in cancer stem cells, early development, and embryonic stem cells. The prospects for development of future technology and application of transcriptome analysis in a single cell are also discussed.
A New Model Army: Emerging fish models to study the genomics of vertebrate Evo-Devo

PubMed Central

Braasch, Ingo; Peterson, Samuel M.; Desvignes, Thomas; McCluskey, Braedan M.; Batzel, Peter; Postlethwait, John H.

2014-01-01

Many fields of biology – including vertebrate Evo-Devo research – are facing an explosion of genomic and transcriptomic sequence information and a multitude of fish species are now swimming in this ‘genomic tsunami’. Here, we first give an overview of recent developments in sequencing fish genomes and transcriptomes that identify properties of fish genomes requiring particular attention and propose strategies to overcome common challenges in fish genomics. We suggest that the generation of chromosome-level genome assemblies - for which we introduce the term ‘chromonome’ – should be a key component of genomic investigations in fish because they enable large-scale conserved synteny analyses that inform orthology detection, a process critical for connectivity of genomes. Orthology calls in vertebrates, especially in teleost fish, are complicated by divergent evolution of gene repertoires and functions following two rounds of genome duplication in the ancestor of vertebrates and a third round at the base of teleost fish. Second, using examples of spotted gar, basal teleosts, zebrafish-related cyprinids, cavefish, livebearers, icefish, and lobefin fish, we illustrate how next generation sequencing technologies liberate emerging fish systems from genomic ignorance and transform them into a new model army to answer longstanding questions on the genomic and developmental basis of their biodiversity. Finally, we discuss recent progress in the genetic toolbox for the major fish models for functional analysis, zebrafish and medaka, that can be transferred to many other fish species to study in vivo the functional effect of evolutionary genomic change as Evo-Devo research enters the postgenomic era. PMID:25111899
Transcriptome Analysis of Barbarea vulgaris Infested with Diamondback Moth (Plutella xylostella) Larvae

PubMed Central

Shen, Di; Wang, Haiping; Wu, Qingjun; Lu, Peng; Qiu, Yang; Song, Jiangping; Zhang, Youjun; Li, Xixiang

2013-01-01

Background The diamondback moth (DBM, Plutella xylostella) is a crucifer-specific pest that causes significant crop losses worldwide. Barbarea vulgaris (Brassicaceae) can resist DBM and other herbivorous insects by producing feeding-deterrent triterpenoid saponins. Plant breeders have long aimed to transfer this insect resistance to other crops. However, a lack of knowledge on the biosynthetic pathways and regulatory networks of these insecticidal saponins has hindered their practical application. A pyrosequencing-based transcriptome analysis of B. vulgaris during DBM larval feeding was performed to identify genes and gene networks responsible for saponin biosynthesis and its regulation at the genome level. Principal Findings Approximately 1.22, 1.19, 1.16, 1.23, 1.16, 1.20, and 2.39 giga base pairs of clean nucleotides were generated from B. vulgaris transcriptomes sampled 1, 4, 8, 12, 24, and 48 h after onset of P. xylostella feeding and from non-inoculated controls, respectively. De novo assembly using all data of the seven transcriptomes generated 39,531 unigenes. A total of 37,780 (95.57%) unigenes were annotated, 14,399 of which were assigned to one or more gene ontology terms and 19,620 of which were assigned to 126 known pathways. Expression profiles revealed 2,016–4,685 up-regulated and 557–5,188 down-regulated transcripts. Secondary metabolic pathways, such as those of terpenoids, glucosinolates, and phenylpropanoids, and their related regulators were elevated. Candidate genes for the triterpene saponin pathway were found in the transcriptome. Orthology analysis of the transcriptome with four other crucifer transcriptomes identified 592 B. vulgaris-specific gene families with a P-value cutoff of 1e−5. Conclusion This study presents the first comprehensive transcriptome analysis of B. vulgaris subjected to a series of DBM feedings. The biosynthetic and regulatory pathways of triterpenoid saponins and other DBM deterrent metabolites in this plant were classified. The results of this study will provide useful data for future investigations on pest-resistance phytochemistry and plant breeding. PMID:23696897

Coccidian Merozoite Transcriptome Analysis From Eimeria Maxima In Comparison To Eimeria Tenella And Eimeria Acervulina

USDA-ARS's Scientific Manuscript database

Using the Eimeria spp. population that infect chickens as a model for coccidian biology, we aimed to survey the transcriptome of E. maxima and contrast it to the two other Eimeria spp. for which transcriptome data are available, E. tenella and E. acervulina. Examining specifically the asexual intra...
Structural covariance of brain region volumes is associated with both structural connectivity and transcriptomic similarity.

PubMed

Yee, Yohan; Fernandes, Darren J; French, Leon; Ellegood, Jacob; Cahill, Lindsay S; Vousden, Dulcie A; Spencer Noakes, Leigh; Scholz, Jan; van Eede, Matthijs C; Nieman, Brian J; Sled, John G; Lerch, Jason P

2018-05-18

An organizational pattern seen in the brain, termed structural covariance, is the statistical association of pairs of brain regions in their anatomical properties. These associations, measured across a population as covariances or correlations usually in cortical thickness or volume, are thought to reflect genetic and environmental underpinnings. Here, we examine the biological basis of structural volume covariance in the mouse brain. We first examined large scale associations between brain region volumes using an atlas-based approach that parcellated the entire mouse brain into 318 regions over which correlations in volume were assessed, for volumes obtained from 153 mouse brain images via high-resolution MRI. We then used a seed-based approach and determined, for 108 different seed regions across the brain and using mouse gene expression and connectivity data from the Allen Institute for Brain Science, the variation in structural covariance data that could be explained by distance to seed, transcriptomic similarity to seed, and connectivity to seed. We found that overall, correlations in structure volumes hierarchically clustered into distinct anatomical systems, similar to findings from other studies and similar to other types of networks in the brain, including structural connectivity and transcriptomic similarity networks. Across seeds, this structural covariance was significantly explained by distance (17% of the variation, up to a maximum of 49% for structural covariance to the visceral area of the cortex), transcriptomic similarity (13% of the variation, up to maximum of 28% for structural covariance to the primary visual area) and connectivity (15% of the variation, up to a maximum of 36% for structural covariance to the intermediate reticular nucleus in the medulla) of covarying structures. 
Together, distance, connectivity, and transcriptomic similarity explained 37% of structural covariance, up to a maximum of 63% for structural covariance to the visceral area. Additionally, this pattern of explained variation differed spatially across the brain, with transcriptomic similarity playing a larger role in the cortex than subcortex, while connectivity explains structural covariance best in parts of the cortex, midbrain, and hindbrain. These results suggest that both gene expression and connectivity underlie structural volume covariance, albeit to different extents depending on brain region, and this relationship is modulated by distance. Copyright © 2018. Published by Elsevier Inc.
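The seed-based analysis described in this abstract can be illustrated with a minimal Python sketch. It assumes structural covariance is measured as the Pearson correlation of region volumes across subjects, and it uses the fact that, for a single predictor, the squared Pearson correlation equals the fraction of variance explained by a linear fit. The function names and the data layout are hypothetical; the study's actual statistical models may differ.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def seed_covariance(volumes, seed):
    """Structural covariance of every other region to a seed region.

    volumes: dict mapping region name -> list of volumes, one per subject
    (hypothetical layout chosen for this sketch).
    """
    return {
        region: pearson(volumes[seed], vols)
        for region, vols in volumes.items()
        if region != seed
    }


def variance_explained(response, predictor):
    """Fraction of variance in `response` explained by a linear fit on a
    single `predictor` (equal to the squared Pearson correlation)."""
    return pearson(response, predictor) ** 2
```

In such a sketch, `variance_explained` would be called with the seed's covariance values as the response and, for example, each region's distance to the seed as the predictor.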
Characterizing differential gene expression in polyploid grasses lacking a reference transcriptome

USDA-ARS's Scientific Manuscript database

Basal transcriptome characterization and differential gene expression in response to varying conditions are often addressed through next generation sequencing (NGS) and data analysis techniques. While these strategies are commonly used, there are countless tools, pipelines, data analysis methods an...
Comparative Genomics and Transcriptomics Analyses Reveal Divergent Lifestyle Features of Nematode Endoparasitic Fungus Hirsutella minnesotensis

PubMed Central

Lai, Yiling; Liu, Keke; Zhang, Xinyu; Zhang, Xiaoling; Li, Kuan; Wang, Niuniu; Shu, Chi; Wu, Yunpeng; Wang, Chengshu; Bushley, Kathryn E.; Xiang, Meichun; Liu, Xingzhong

2014-01-01

Hirsutella minnesotensis [Ophiocordycipitaceae (Hypocreales, Ascomycota)] is a dominant endoparasitic fungus whose conidia adhere to and penetrate the second-stage juveniles of soybean cyst nematode. Its genome was sequenced de novo and compared with those of five entomopathogenic fungi in the Hypocreales and three nematode-trapping fungi in the Orbiliales (Ascomycota). The genome of H. minnesotensis is 51.4 Mb, encodes 12,702 genes, and is enriched with transposable elements (up to 32%). Phylogenomic analysis revealed that H. minnesotensis diverged from entomopathogenic fungi in Hypocreales. The genome of H. minnesotensis resembles those of entomopathogenic fungi in having fewer genes encoding lectins for adhesion and glycoside hydrolases for cellulose degradation, but differs from those of nematode-trapping fungi in possessing more genes for protein degradation, signal transduction, and secondary metabolism. These results indicate that H. minnesotensis has evolved a different mechanism for nematode endoparasitism compared with nematode-trapping fungi. Transcriptomic analyses over the time course of parasitism revealed the upregulation of lectins, secreted proteases, and genes for biosynthesis of secondary metabolites that are putatively involved in host surface adhesion, cuticle degradation, and host manipulation. Genome and transcriptome analyses provided a comprehensive understanding of the evolution and lifestyle of nematode endoparasitism. PMID:25359922
Detailed transcriptome description of the neglected cestode Taenia multiceps.

PubMed

Wu, Xuhang; Fu, Yan; Yang, Deying; Zhang, Runhui; Zheng, Wanpeng; Nie, Huaming; Xie, Yue; Yan, Ning; Hao, Guiying; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou

2012-01-01

The larval stage of Taenia multiceps, a global cestode, encysts in the central nervous system (CNS) of sheep and other livestock. This frequently leads to their death and huge socioeconomic losses, especially in developing countries. This parasite can also cause zoonotic infections in humans, but has been largely neglected due to a lack of diagnostic techniques and studies. Recent developments in next-generation sequencing provide an opportunity to explore the transcriptome of T. multiceps. We obtained a total of 31,282 unigenes (mean length 920 bp) using Illumina paired-end sequencing technology and the new Trinity de novo assembler without a reference genome. Individual transcription molecules were determined by sequence-based annotations and/or domain-based annotations against public databases (Nr, UniProtKB/Swiss-Prot, COG, KEGG, UniProtKB/TrEMBL, InterPro and Pfam). We identified 26,110 (83.47%) unigenes and inferred 20,896 (66.8%) coding sequences (CDS). Further comparative transcript analysis with other cestodes (Taenia pisiformis, Taenia solium, Echinococcus granulosus and Echinococcus multilocularis) and intestinal parasites (Trichinella spiralis, Ancylostoma caninum and Ascaris suum) showed that 5,100 common genes were shared among three Taenia tapeworms, 261 conserved genes were detected among five Taeniidae cestodes, and 109 common genes were found in four zoonotic intestinal parasites. Some of the common genes were required for parasite survival and involved in parasite-host interactions. In addition, we amplified two full-length CDS of unigenes from the common genes using RT-PCR. This study provides an extensive transcriptome of the adult stage of T. multiceps, and demonstrates that comparative transcriptomic investigations deserve further study. This transcriptome dataset forms a substantial public information platform to achieve a fundamental understanding of the biology of T. multiceps, and helps in the identification of drug targets and parasite-host interaction studies.
Host plant driven transcriptome plasticity in the salivary glands of the cabbage looper (Trichoplusia ni)

PubMed Central

Galbraith, David A.; Grozinger, Christina M.; Felton, Gary W.

2017-01-01

Generalist herbivores feed on a wide array of plants and need to adapt to varying host qualities and defenses. One of the first insect-derived secretions to come in contact with the plant is the saliva. Insect saliva is potentially involved in both the pre-digestion of the host plant and the induction or suppression of plant defenses, yet how the salivary glands respond to changes in host plant at the transcriptional level is largely unknown. The objective of this study was to determine how the labial salivary gland transcriptome varies according to the host plant on which the insect is feeding. To determine this, cabbage looper (Trichoplusia ni) larvae were reared on cabbage, tomato, and pinto bean artificial diet. Labial glands were dissected from fifth instar larvae and used to extract RNA for RNASeq analysis. Assembly of the resulting sequencing reads produced a transcriptome library for T. ni salivary glands consisting of 14,037 expressed genes. Feeding on different host plant diets resulted in substantial remodeling of the gland transcriptomes, with 4,501 transcripts significantly differentially expressed across the three treatment groups. Gene expression profiles were most similar between cabbage and artificial diet, which corresponded to the two diets on which larvae perform best. Several transcripts involved in detoxification processes were differentially expressed, and transcripts involved in the spliceosome pathway were significantly downregulated in tomato-reared larvae. Overall, this study demonstrates that the transcriptomes of the salivary glands of the cabbage looper are strongly responsive to diet. It also provides a foundation for future functional studies that can help us understand the role of saliva of chewing insects in plant-herbivore interactions. PMID:28792546
Transcriptome sequence analysis of an ornamental plant, Ananas comosus var. bracteatus, revealed the potential unigenes involved in terpenoid and phenylpropanoid biosynthesis.

PubMed

Ma, Jun; Kanakala, S; He, Yehua; Zhang, Junli; Zhong, Xiaolan

2015-01-01

Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in its growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies. The Ananas comosus var. bracteatus transcriptome was sequenced by the Illumina paired-end sequencing technology. We obtained a total of 23.5 million high quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. Of the 41,052 unigenes of Ananas comosus var. bracteatus, 23,275 unigenes were annotated in the NCBI non-redundant protein database and 23,134 unigenes were annotated in the Swiss-Prot database. Out of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered to be unique to Ananas comosus var. bracteatus. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis. The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge, this is the first report on the de novo transcriptome sequencing of Ananas comosus var. bracteatus. Unigenes obtained in this study may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus.
Transcriptome Sequence Analysis of an Ornamental Plant, Ananas comosus var. bracteatus, Revealed the Potential Unigenes Involved in Terpenoid and Phenylpropanoid Biosynthesis

PubMed Central

Ma, Jun; Kanakala, S.; He, Yehua; Zhang, Junli; Zhong, Xiaolan

2015-01-01

Background Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in its growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies. Results The Ananas comosus var. bracteatus transcriptome was sequenced by the Illumina paired-end sequencing technology. We obtained a total of 23.5 million high quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. Of the 41,052 unigenes of Ananas comosus var. bracteatus, 23,275 unigenes were annotated in the NCBI non-redundant protein database and 23,134 unigenes were annotated in the Swiss-Prot database. Out of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered to be unique to Ananas comosus var. bracteatus. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis. Conclusion The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge, this is the first report on the de novo transcriptome sequencing of Ananas comosus var. bracteatus. Unigenes obtained in this study may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus. PMID:25769053
Genomics of Adaptation to Multiple Concurrent Stresses: Insights from Comparative Transcriptomics of a Cichlid Fish from One of Earth's Most Extreme Environments, the Hypersaline Soda Lake Magadi in Kenya, East Africa.

PubMed

Kavembe, Geraldine D; Franchini, Paolo; Irisarri, Iker; Machado-Schiaffino, Gonzalo; Meyer, Axel

2015-10-01

The Magadi tilapia (Alcolapia grahami) is a cichlid fish that inhabits one of the Earth's most extreme aquatic environments, with high pH (~10), salinity (~60% of seawater), high temperatures (~40 °C), and fluctuating oxygen regimes. The Magadi tilapia evolved several unique behavioral, physiological, and anatomical adaptations, some of which are constituent and thus retained in freshwater conditions. We conducted a transcriptomic analysis on A. grahami to study the evolutionary basis of tolerance to multiple stressors. To identify the adaptive regulatory changes associated with stress responses, we massively sequenced gill transcriptomes (RNAseq) from wild and freshwater-acclimated specimens of A. grahami. As a control, corresponding transcriptome data from Oreochromis leucostictus, a closely related freshwater species, were generated. We found expression differences in a large number of genes with known functions related to osmoregulation, energy metabolism, ion transport, and chemical detoxification. Over-representation of metabolism-related gene ontology terms in wild individuals compared to laboratory-acclimated specimens suggested that freshwater conditions greatly decrease the metabolic requirements of this species. Twenty-five genes with diverse physiological functions related to responses to water stress showed signs of divergent natural selection between the Magadi tilapia and its freshwater relative, which shared a most recent common ancestor only about four million years ago. The complete set of genes responsible for urea excretion was identified in the gill transcriptome of A. grahami, making it the only fish species to have a functional ornithine-urea cycle pathway in the gills--a major innovation for increasing nitrogenous waste efficiency.
Identification of Major Signaling Pathways in Prion Disease Progression Using Network Analysis

PubMed Central

Newaz, Khalique; Sriram, K.; Bera, Debajyoti

2015-01-01

Prion diseases are transmissible neurodegenerative diseases that arise due to conformational change of normal, cellular prion protein (PrPC) to a protease-resistant isoform (rPrPSc). Deposition of misfolded PrPSc proteins leads to an alteration of many signaling pathways, including immunological and apoptotic pathways. This culminates in the dysfunction and death of neuronal cells. Earlier transcriptomic studies have revealed some affected pathways, but it is not clear which is (are) the prime network pathway(s) that change during the disease progression and how these pathways are involved in crosstalks with each other from the time of incubation to clinical death. We perform network analysis on large-scale transcriptomic data of differentially expressed genes obtained from whole brain in six different mouse strain-prion strain combination models to determine the pathways involved in prion diseases, and to understand the role of crosstalks in disease propagation. We employ a notion of differential network centrality measures on protein interaction networks to identify the potential biological pathways involved. We also propose a crosstalk ranking method based on dynamic protein interaction networks to identify the core network elements involved in crosstalk with different pathways. We identify 148 DEGs (differentially expressed genes) potentially related to the prion disease progression. Functional association of the identified genes implicates a strong involvement of immunological pathways. We extract a bow-tie structure that is potentially dysregulated in prion disease. We also propose an ODE model for the bow-tie network. Predictions related to the diseased condition suggest the downregulation of the core signaling elements (PI3Ks and AKTs) of the bow-tie network. In this work, we show using transcriptomic data that the neuronal dysfunction in prion disease is strongly related to the immunological pathways. We conclude that these immunological pathways occupy influential positions in the PFNs (protein functional networks) that are related to prion disease. Importantly, this functional network involvement is prevalent in all the five different mouse strain-prion strain combinations that we studied. We also conclude that the dysregulation of the core elements of the bow-tie structure, which belongs to the PI3K-Akt signaling pathway, leads to dysregulation of the downstream components corresponding to other biological pathways. PMID:26646948
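The abstract does not specify which centrality measures were used, so the following Python sketch is only an illustration of the general idea of a differential centrality measure. It assumes normalized degree centrality, an undirected network given as a list of edges with no duplicates, and hypothetical function names. Each node is scored by how much its centrality changes between a control network and a disease network.

```python
def degree_centrality(edges):
    """Normalized degree centrality for an undirected edge list.

    Assumes each undirected edge appears once, as a (node_a, node_b) pair.
    """
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Normalize by the maximum possible degree in this network.
    denom = max(len(degree) - 1, 1)
    return {node: d / denom for node, d in degree.items()}


def differential_centrality(control_edges, disease_edges):
    """Centrality in the disease network minus centrality in the control
    network; nodes absent from one network count as 0 there."""
    control = degree_centrality(control_edges)
    disease = degree_centrality(disease_edges)
    nodes = set(control) | set(disease)
    return {n: disease.get(n, 0.0) - control.get(n, 0.0) for n in nodes}


def rank_by_change(diff):
    """Nodes ordered by absolute change in centrality, largest first."""
    return sorted(diff, key=lambda n: abs(diff[n]), reverse=True)
```

Nodes at the top of such a ranking would be candidates for further pathway-level inspection; other centrality measures (betweenness, closeness) could be substituted without changing the overall structure.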
MetaKTSP: a meta-analytic top scoring pair method for robust cross-study validation of omics prediction analysis.

PubMed

Kim, SungHwan; Lin, Chien-Wei; Tseng, George C

2016-07-01

Supervised machine learning is widely applied to transcriptomic data to predict disease diagnosis, prognosis or survival. Robust and interpretable classifiers with high accuracy are usually favored for their clinical and translational potential. The top scoring pair (TSP) algorithm is an example that applies a simple rank-based algorithm to identify rank-altered gene pairs for classifier construction. Although many classification methods perform well in cross-validation of single expression profile, the performance usually greatly reduces in cross-study validation (i.e. the prediction model is established in the training study and applied to an independent test study) for all machine learning methods, including TSP. The failure of cross-study validation has largely diminished the potential translational and clinical values of the models. The purpose of this article is to develop a meta-analytic top scoring pair (MetaKTSP) framework that combines multiple transcriptomic studies and generates a robust prediction model applicable to independent test studies. We proposed two frameworks, by averaging TSP scores or by combining P-values from individual studies, to select the top gene pairs for model construction. We applied the proposed methods in simulated data sets and three large-scale real applications in breast cancer, idiopathic pulmonary fibrosis and pan-cancer methylation. The result showed superior performance of cross-study validation accuracy and biomarker selection for the new meta-analytic framework. In conclusion, combining multiple omics data sets in the public domain increases robustness and accuracy of the classification model that will ultimately improve disease understanding and clinical treatment decisions to benefit patients. An R package MetaKTSP is available online. (http://tsenglab.biostat.pitt.edu/software.htm). ctseng@pitt.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. 
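The score-averaging idea in the MetaKTSP abstract above can be illustrated with a short Python sketch. It assumes the usual TSP score for a gene pair (i, j), namely the absolute difference between the two classes in the frequency of samples where gene i is expressed below gene j, and it picks the single pair with the highest score averaged over studies. The function names and the toy data are hypothetical, each study is assumed to contain both classes, and this is not the code of the MetaKTSP R package (which also selects K pairs and offers a P-value variant).

```python
from itertools import combinations


def tsp_score(expr, labels, i, j):
    """|P(x_i < x_j | class 0) - P(x_i < x_j | class 1)| within one study.

    expr is a list of samples (each a list of gene expression values),
    labels is a parallel list of 0/1 class labels.
    """
    freq = {}
    for cls in (0, 1):
        samples = [s for s, y in zip(expr, labels) if y == cls]
        freq[cls] = sum(s[i] < s[j] for s in samples) / len(samples)
    return abs(freq[0] - freq[1])


def top_pair_by_mean_score(studies, n_genes):
    """Return the gene pair whose TSP score, averaged over studies, is highest."""
    def mean_score(pair):
        return sum(tsp_score(x, y, *pair) for x, y in studies) / len(studies)
    return max(combinations(range(n_genes), 2), key=mean_score)


# Toy data (hypothetical): rows are samples, columns are three genes.
study_a = ([[1, 5, 3], [2, 6, 1], [7, 2, 4], [8, 1, 5]], [0, 0, 1, 1])
study_b = ([[0, 4, 9], [1, 3, 2], [5, 2, 6], [6, 0, 1]], [0, 0, 1, 1])
best = top_pair_by_mean_score([study_a, study_b], n_genes=3)
print(best)
```

Because the score depends only on within-sample rank order, it needs no cross-study normalization, which is one reason a rank-based pair lends itself to combining studies.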
Mildew-Omics: How Global Analyses Aid the Understanding of Life and Evolution of Powdery Mildews.

PubMed

Bindschedler, Laurence V; Panstruga, Ralph; Spanu, Pietro D

2016-01-01

The common powdery mildew plant diseases are caused by ascomycete fungi of the order Erysiphales. Their characteristic life style as obligate biotrophs renders functional analyses in these species challenging, mainly because of experimental constraints to genetic manipulation. Global large-scale ("-omics") approaches are thus particularly valuable and insightful for the characterisation of the life and evolution of powdery mildews. Here we review the knowledge obtained so far from genomic, transcriptomic and proteomic studies in these fungi. We consider current limitations and challenges regarding these surveys and provide an outlook on desired future investigations on the basis of the various -omics technologies.
BIG: a large-scale data integration tool for renal physiology.

PubMed

Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A

2016-10-01

Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.
An automated method for detecting alternatively spliced protein domains.

PubMed

Coelho, Vitor; Sammeth, Michael

2018-06-01

Alternative splicing (AS) has been demonstrated to play a role in shaping eukaryotic gene diversity at the transcriptional level. However, the impact of AS on the proteome is still controversial. Studies that seek to explore the effect of AS at the proteomic level are hampered by the cumbersome process of converting back and forth between genome, transcriptome and proteome coordinate spaces, and the naïve prediction of protein domains in the presence of AS suffers from many redundant sequence scans that arise from constitutively spliced regions shared between alternative products of a gene. We developed the AstaFunk pipeline, which computes for every generic transcriptome all domains that are altered by AS events in a systematic and efficient manner. In a nutshell, our method employs Viterbi dynamic programming, which is guaranteed to find all score-optimal hits of the domains under consideration, while complementary optimisations at different levels avoid redundant and other irrelevant computations. We evaluate AstaFunk qualitatively and quantitatively using RNAseq in well-studied genes with AS, and at large scale using entire transcriptomes. Our study confirms complementary reports that the effect of most AS events on the proteome seems to be rather limited, but our results also pinpoint several cases where AS could have a major impact on the function of a protein domain. The Java implementation of AstaFunk is available as an open source project at http://astafunk.sammeth.net. Contact: micha@sammeth.net. Supplementary data are available at Bioinformatics online.
Effects of Gene Duplication, Positive Selection, and Shifts in Gene Expression on the Evolution of the Venom Gland Transcriptome in Widow Spiders

PubMed Central

Haney, Robert A.; Clarke, Thomas H.; Gadgil, Rujuta; Fitzpatrick, Ryan; Hayashi, Cheryl Y.; Ayoub, Nadia A.; Garb, Jessica E.

2016-01-01

Gene duplication and positive selection can be important determinants of the evolution of venom, a protein-rich secretion used in prey capture and defense. In a typical model of venom evolution, gene duplicates switch to venom gland expression and change function under the action of positive selection, which together with further duplication produces large gene families encoding diverse toxins. Although these processes have been demonstrated for individual toxin families, high-throughput multitissue sequencing of closely related venomous species can provide insights into evolutionary dynamics at the scale of the entire venom gland transcriptome. By assembling and analyzing multitissue transcriptomes from the Western black widow spider and two closely related species with distinct venom toxicity phenotypes, we do not find that gene duplication and duplicate retention is greater in gene families with venom gland biased expression in comparison with broadly expressed families. Positive selection has acted on some venom toxin families, but does not appear to be in excess for families with venom gland biased expression. Moreover, we find 309 distinct gene families that have single transcripts with venom gland biased expression, suggesting that the switching of genes to venom gland expression in numerous unrelated gene families has been a dominant mode of evolution. We also find ample variation in protein sequences of venom gland–specific transcripts, lineage-specific family sizes, and ortholog expression among species. This variation might contribute to the variable venom toxicity of these species. PMID:26733576
Epigenetics and Proteomics Join Transcriptomics in the Quest for Tuberculosis Biomarkers

PubMed Central

Esterhuyse, Maria M.; Weiner, January; Caron, Etienne; Loxton, Andre G.; Iannaccone, Marco; Wagman, Chandre; Saikali, Philippe; Stanley, Kim; Wolski, Witold E.; Mollenkopf, Hans-Joachim; Schick, Matthias; Aebersold, Ruedi; Linhart, Heinz; Walzl, Gerhard

2015-01-01

An estimated one-third of the world’s population is currently latently infected with Mycobacterium tuberculosis. Latent M. tuberculosis infection (LTBI) progresses into active tuberculosis (TB) disease in ~5 to 10% of infected individuals. Diagnostic and prognostic biomarkers to monitor disease progression are urgently needed to ensure better care for TB patients and to decrease the spread of TB. Biomarker development is primarily based on transcriptomics. Our understanding of biology combined with evolving technical advances in high-throughput techniques led us to investigate the possibility of additional platforms (epigenetics and proteomics) in the quest to (i) understand the biology of the TB host response and (ii) search for multiplatform biosignatures in TB. We engaged in a pilot study to interrogate the DNA methylome, transcriptome, and proteome in selected monocytes and granulocytes from TB patients and healthy LTBI participants. Our study provides first insights into the levels and sources of diversity in the epigenome and proteome among TB patients and LTBI controls, despite limitations due to small sample size. Functionally the differences between the infection phenotypes (LTBI versus active TB) observed in the different platforms were congruent, thereby suggesting regulation of function not only at the transcriptional level but also by DNA methylation and microRNA. Thus, our data argue for the development of a large-scale study of the DNA methylome, with particular attention to study design in accounting for variation based on gender, age, and cell type. PMID:26374119
ALOMYbase, a resource to investigate non-target-site-based resistance to herbicides inhibiting acetolactate-synthase (ALS) in the major grass weed Alopecurus myosuroides (black-grass).

PubMed

Gardin, Jeanne Aude Christiane; Gouzy, Jérôme; Carrère, Sébastien; Délye, Christophe

2015-08-12

Herbicide resistance in agrestal weeds is a global problem threatening food security. Non-target-site resistance (NTSR) endowed by mechanisms neutralising the herbicide or compensating for its action is considered the most agronomically noxious type of resistance. Contrary to target-site resistance, NTSR mechanisms are far from being fully elucidated. A part of weed response to herbicide stress, NTSR is considered to be largely driven by gene regulation. Our purpose was to establish a transcriptome resource allowing investigation of the transcriptomic bases of NTSR in the major grass weed Alopecurus myosuroides L. (Poaceae) for which almost no genomic or transcriptomic data was available. RNA-Seq was performed from plants in one F2 population that were sensitive or expressing NTSR to herbicides inhibiting acetolactate-synthase. Cloned plants were sampled over seven time-points ranging from before until 73 h after herbicide application. Assembly of over 159M high-quality Illumina reads generated a transcriptomic resource (ALOMYbase) containing 65,558 potentially active contigs (N50 = 1240 nucleotides) predicted to encode 32,138 peptides with 74% GO annotation, of which 2017 were assigned to protein families presumably involved in NTSR. Comparison with the fully sequenced grass genomes indicated good coverage and correct representation of A. myosuroides transcriptome in ALOMYbase. The part of the herbicide transcriptomic response common to the resistant and the sensitive plants was consistent with the expected effects of acetolactate-synthase inhibition, with striking similarities observed with published Arabidopsis thaliana data. A. myosuroides plants with NTSR were first affected by herbicide action like sensitive plants, but ultimately overcame it. Analysis of differences in transcriptomic herbicide response between resistant and sensitive plants did not allow identification of processes directly explaining NTSR. 
Five contigs associated with NTSR in the F2 population studied were tentatively identified. They were predicted to encode three cytochromes P450 (CYP71A, CYP71B and CYP81D), one peroxidase and one disease resistance protein. Our data confirmed that gene regulation is at the root of herbicide response and of NTSR. ALOMYbase proved to be a relevant resource to support NTSR transcriptomic studies, and constitutes a valuable tool for future research aiming at elucidating gene regulations involved in NTSR in A. myosuroides.
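The N50 statistic quoted for the ALOMYbase assembly (N50 = 1240 nucleotides) is the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. A minimal Python sketch under that standard definition follows; the function name and the toy contig lengths are made up for illustration and are not ALOMYbase data.

```python
def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to the total to stay in integer arithmetic.
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")


# Hypothetical contig lengths in nucleotides (total 6540).
contigs = [2000, 1500, 1240, 900, 600, 300]
print(n50(contigs))
```

With these toy lengths the two longest contigs already cover more than half of the 6540 nucleotides, so the function returns the second-longest length.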
Transcriptome analysis of Pinus monticola primary needles by RNA-seq provides novel insight into host resistance to Cronartium ribicola

PubMed Central

2013-01-01

Background Five-needle pines are important forest species that have been devastated by white pine blister rust (WPBR, caused by Cronartium ribicola) across North America. Currently little transcriptomic and genomic data are available to understand molecular interactions in the WPBR pathosystem. Results We report here RNA-seq analysis results using Illumina deep sequencing of primary needles of western white pine (Pinus monticola) infected with WPBR. De novo gene assembly was used to generate the first P. monticola consensus transcriptome, which contained 39,439 unique transcripts with an average length of 1,303 bp and a total length of 51.4 Mb. About 23,000 P. monticola unigenes produced orthologous hits in the Pinus gene index (PGI) database (BLASTn with E values < 1e-100) and 6,300 genes were expressed actively (at RPKM ≥ 10) in the healthy tissues. Comparison of transcriptomes from WPBR-susceptible and -resistant genotypes revealed a total of 979 differentially expressed genes (DEGs) with a significant fold change > 1.5 during P. monticola-C. ribicola interactions. Three hundred and ten DEGs were regulated similarly in both susceptible and resistant seedlings and 275 DEGs showed regulatory differences between susceptible and resistant seedlings post infection by C. ribicola. The DEGs up-regulated in resistant seedlings included a set of putative signal receptor genes encoding disease resistance protein homologs, calcineurin B-like (CBL)-interacting protein kinases (CIPK), F-box family proteins (FBP), and abscisic acid (ABA) receptor; transcription factor (TF) genes of multiple families; genes homologous to apoptosis-inducing factor (AIF), flowering locus T-like protein (FT), and subtilisin-like protease. DEGs up-regulated in resistant seedlings also included a wide diversity of downstream genes (encoding enzymes involved in different metabolic pathways, pathogenesis-related (PR) proteins of multiple families, and anti-microbial proteins).
A large proportion of the down-regulated DEGs were related to photosystems, the metabolic pathways of carbon fixation and flavonoid biosynthesis. Conclusions The novel P. monticola transcriptome data provide a basis for future studies of genetic resistance in a non-model, coniferous species. Our global gene expression profiling presents a comprehensive view of transcriptomic regulation in the WPBR pathosystem and yields novel insights on molecular and biochemical mechanisms of disease resistance in conifers. PMID:24341615
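The "expressed actively (at RPKM ≥ 10)" criterion in the record above relies on the reads-per-kilobase-per-million normalization. A minimal Python sketch of that calculation, assuming a raw read count per transcript, the transcript length in bp, and the total number of mapped reads in the library; the function names and the example numbers are illustrative and do not come from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def is_actively_expressed(read_count, transcript_length_bp, total_mapped_reads,
                          threshold=10.0):
    """Apply an RPKM cutoff; the default mirrors the threshold quoted in the abstract."""
    return rpkm(read_count, transcript_length_bp, total_mapped_reads) >= threshold


# Hypothetical transcript: 200 reads on a 1,000 bp transcript in a 10 M read library.
value = rpkm(200, 1000, 10_000_000)
print(value, is_actively_expressed(200, 1000, 10_000_000))
```

Dividing by length and library size makes values comparable across transcripts and samples, which is why a single cutoff can be applied to the whole transcriptome.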
De novo characterization of the Chinese fir (Cunninghamia lanceolata) transcriptome and analysis of candidate genes involved in cellulose and lignin biosynthesis

PubMed Central

2012-01-01

Background Chinese fir (Cunninghamia lanceolata) is an important timber species that accounts for 20–30% of the total commercial timber production in China. However, the available genomic information of Chinese fir is limited, and this severely encumbers functional genomic analysis and molecular breeding in Chinese fir. Recently, major advances in transcriptome sequencing have provided fast and cost-effective approaches to generate large expression datasets that have proven to be powerful tools to profile the transcriptomes of non-model organisms with undetermined genomes. Results In this study, the transcriptomes of nine tissues from Chinese fir were analyzed using the Illumina HiSeq™ 2000 sequencing platform. Approximately 40 million paired-end reads were obtained, generating 3.62 gigabase pairs of sequencing data. These reads were assembled into 83,248 unique sequences (i.e. Unigenes) with an average length of 449 bp, amounting to 37.40 Mb. A total of 73,779 Unigenes were supported by more than 5 reads, 42,663 (57.83%) had homologs in the NCBI non-redundant and Swiss-Prot protein databases, corresponding to 27,224 unique protein entries. Of these Unigenes, 16,750 were assigned to Gene Ontology classes, and 14,877 were clustered into orthologous groups. A total of 21,689 (29.40%) were mapped to 119 pathways by BLAST comparison against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The majority of the genes encoding the enzymes in the biosynthetic pathways of cellulose and lignin were identified in the Unigene dataset by targeted searches of their annotations. A number of candidate Chinese fir genes in the two metabolic pathways were discovered for the first time. Eighteen genes related to cellulose and lignin biosynthesis were cloned for experimental validation of the transcriptome data. Overall 49 Unigenes, covering different regions of these selected genes, were found by alignment.
Their expression patterns in different tissues were analyzed by qRT-PCR to explore their putative functions. Conclusions A substantial fraction of transcript sequences was obtained from the deep sequencing of Chinese fir. The assembled Unigene dataset was used to discover candidate genes of cellulose and lignin biosynthesis. This transcriptome dataset will provide a comprehensive sequence resource for molecular genetics research of C. lanceolata. PMID:23171398
Quantifying whole transcriptome size, a prerequisite for understanding transcriptome evolution across species: an example from a plant allopolyploid.

PubMed

Coate, Jeremy E; Doyle, Jeff J

2010-01-01

Evolutionary biologists are increasingly comparing gene expression patterns across species. Due to the way in which expression assays are normalized, such studies provide no direct information about expression per gene copy (dosage responses) or per cell and can give a misleading picture of genes that are differentially expressed. We describe an assay for estimating relative expression per cell. When used in conjunction with transcript profiling data, it is possible to compare the sizes of whole transcriptomes, which in turn makes it possible to compare expression per cell for each gene in the transcript profiling data set. We applied this approach, using quantitative reverse transcriptase-polymerase chain reaction and high throughput RNA sequencing, to a recently formed allopolyploid and showed that its leaf transcriptome was approximately 1.4-fold larger than either progenitor transcriptome (70% of the sum of the progenitor transcriptomes). In contrast, the allopolyploid genome is 94.3% as large as the sum of its progenitor genomes and retains ≥93.5% of the sum of its progenitor gene complements. Thus, "transcriptome downsizing" is greater than genome downsizing. Using this transcriptome size estimate, we inferred dosage responses for several thousand genes and showed that the majority exhibit partial dosage compensation. Homoeologue silencing is nonrandomly distributed across dosage responses, with genes showing extreme responses in either direction significantly more likely to have a silent homoeologue. This experimental approach will add value to transcript profiling experiments involving interspecies and interploidy comparisons by converting expression per transcriptome to expression per genome, eliminating the need for assumptions about transcriptome size.
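The arithmetic behind "1.4-fold larger than either progenitor transcriptome (70% of the sum of the progenitor transcriptomes)" can be checked with a few lines of Python. The sketch assumes, as the abstract's phrasing implies, that the two progenitor transcriptomes are roughly equal in size; the function and variable names are illustrative only.

```python
def fraction_of_progenitor_sum(fold_vs_each_progenitor):
    """Polyploid size as a fraction of the summed progenitor sizes.

    Assumes both progenitors have the same size T, so the sum is 2T and
    a polyploid of size fold * T is fold / 2 of that sum.
    """
    return fold_vs_each_progenitor / 2


transcriptome_fraction = fraction_of_progenitor_sum(1.4)  # value from the abstract
genome_fraction = 0.943                                   # value from the abstract

transcriptome_downsizing = 1 - transcriptome_fraction
genome_downsizing = 1 - genome_fraction
print(transcriptome_downsizing, genome_downsizing)
```

Under that assumption the transcriptome falls about 30% short of the progenitor sum while the genome falls about 6% short, which is the comparison behind the statement that transcriptome downsizing exceeds genome downsizing.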

Comparative Transcriptome Analysis of Bombyx mori (Lepidoptera) Larval Midgut Response to BmNPV in Susceptible and Near-Isogenic Resistant Strains

PubMed Central

Geng, Lei; Xu, Jia-Ping; Yu, Dong; Zhang, Shang-Zhi; Ma, Yan; Fei, Dong-Qiong

2016-01-01

Bombyx mori nucleopolyhedrovirus (BmNPV) is one of the primary pathogens causing severe economic losses in sericulture. However, the molecular mechanism of silkworm resistance to BmNPV remains largely unknown. Here, the recurrent parent P50 (susceptible strain) and the near-isogenic line BC9 (resistant strain) were used in a comparative transcriptome study examining the response to infection with BmNPV. A total of 14,300 unigenes were obtained from two different resistant strains; of these, 869 differentially expressed genes (DEGs) were identified after comparing the four transcriptomes. Many DEGs associated with protein metabolism, cytoskeleton, and apoptosis may be involved in the host response to BmNPV infection. Moreover, some immunity-related genes were also altered following BmNPV infection. Specifically, after removing genetic background and individual immune stress response genes, 22 genes were found to be potentially involved in repressing BmNPV infection. These genes were related to transport, virus replication, intracellular innate immunity, and apoptosis. Our study provided an overview of the molecular mechanism of silkworm resistance to BmNPV infection and laid a foundation for controlling BmNPV in the future. PMID:27168061
Analysis of phylogenomic datasets reveals conflict, concordance, and gene duplications with examples from animals and plants.

PubMed

Smith, Stephen A; Moore, Michael J; Brown, Joseph W; Yang, Ya

2015-08-05

The use of transcriptomic and genomic datasets for phylogenetic reconstruction has become increasingly common as researchers attempt to resolve recalcitrant nodes with increasing amounts of data. The large size and complexity of these datasets introduce significant phylogenetic noise and conflict into subsequent analyses. The sources of conflict may include hybridization, incomplete lineage sorting, or horizontal gene transfer, and may vary across the phylogeny. For phylogenetic analysis, this noise and conflict has been accommodated in one of several ways: by binning gene regions into subsets to isolate consistent phylogenetic signal; by using gene-tree methods for reconstruction, where conflict is presumed to be explained by incomplete lineage sorting (ILS); or through concatenation, where noise is presumed to be the dominant source of conflict. The results provided herein emphasize that analysis of individual homologous gene regions can greatly improve our understanding of the underlying conflict within these datasets. Here we examined two published transcriptomic datasets, the angiosperm group Caryophyllales and the aculeate Hymenoptera, for the presence of conflict, concordance, and gene duplications in individual homologs across the phylogeny. We found significant conflict throughout the phylogeny in both datasets and in particular along the backbone. While some nodes in each phylogeny showed patterns of conflict similar to what might be expected with ILS alone, the backbone nodes also exhibited low levels of phylogenetic signal. In addition, certain nodes, especially in the Caryophyllales, had highly elevated levels of strongly supported conflict that cannot be explained by ILS alone. This study demonstrates that phylogenetic signal is highly variable in phylogenomic data sampled across related species and poses challenges when conducting species tree analyses on large genomic and transcriptomic datasets. 
Further insight into the conflict and processes underlying these complex datasets is necessary to improve and develop adequate models for sequence analysis and downstream applications. To aid this effort, we developed the open source software phyparts (https://bitbucket.org/blackrim/phyparts), which calculates unique, conflicting, and concordant bipartitions, maps gene duplications, and outputs summary statistics such as internode certainty (ICA) scores and node-specific counts of gene duplications.
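The notions of concordant and conflicting bipartitions mentioned above can be made concrete with a small Python sketch. It uses the textbook compatibility test: two splits of the same taxon set conflict when all four pairwise intersections of their sides are non-empty, and they are concordant when they describe the same split. This is a simplified illustration that assumes every tree contains the same taxa; it is not the phyparts implementation, and the function names and the toy taxa are hypothetical.

```python
def concordant(clade_a, clade_b, taxa):
    """True if the two clades define the same bipartition of taxa."""
    a, b = set(clade_a), set(clade_b)
    return a == b or a == taxa - b


def conflicts(clade_a, clade_b, taxa):
    """True if the two bipartitions cannot occur together in one tree."""
    a, b = set(clade_a), set(clade_b)
    rest_a, rest_b = taxa - a, taxa - b
    # Incompatible splits overlap on every combination of sides.
    return all([a & b, a & rest_b, rest_a & b, rest_a & rest_b])


taxa = {"A", "B", "C", "D", "E"}
species_clade = {"A", "B"}                          # split from a reference tree
gene_clades = [{"A", "B"}, {"B", "C"}, {"A", "B", "C"}]  # one split per gene tree

n_concordant = sum(concordant(species_clade, g, taxa) for g in gene_clades)
n_conflicting = sum(conflicts(species_clade, g, taxa) for g in gene_clades)
print(n_concordant, n_conflicting)
```

The third gene-tree clade is neither concordant nor conflicting with the reference split: it is compatible with it but says nothing about it, which is why per-node tallies distinguish more than two outcomes.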
Comparative transcriptomics of early dipteran development

PubMed Central

2013-01-01

Background Modern sequencing technologies have massively increased the amount of data available for comparative genomics. Whole-transcriptome shotgun sequencing (RNA-seq) provides a powerful basis for comparative studies. In particular, this approach holds great promise for emerging model species in fields such as evolutionary developmental biology (evo-devo). Results We have sequenced early embryonic transcriptomes of two non-drosophilid dipteran species: the moth midge Clogmia albipunctata, and the scuttle fly Megaselia abdita. Our analysis includes a third, published, transcriptome for the hoverfly Episyrphus balteatus. These emerging models for comparative developmental studies close an important phylogenetic gap between Drosophila melanogaster and other insect model systems. In this paper, we provide a comparative analysis of early embryonic transcriptomes across species, and use our data for a phylogenomic re-evaluation of dipteran phylogenetic relationships. Conclusions We show how comparative transcriptomics can be used to create useful resources for evo-devo, and to investigate phylogenetic relationships. Our results demonstrate that de novo assembly of short (Illumina) reads yields high-quality, high-coverage transcriptomic data sets. We use these data to investigate deep dipteran phylogenetic relationships. Our results, based on a concatenation of 160 orthologous genes, provide support for the traditional view of Clogmia being the sister group of Brachycera (Megaselia, Episyrphus, Drosophila), rather than that of Culicomorpha (which includes mosquitoes and blackflies). PMID:23432914
Transcriptome Analysis at the Single-Cell Level Using SMART Technology.

PubMed

Fish, Rachel N; Bostick, Magnolia; Lehman, Alisa; Farmer, Andrew

2016-10-10

RNA sequencing (RNA-seq) is a powerful method for analyzing cell state, with minimal bias, and has broad applications within the biological sciences. However, transcriptome analysis of seemingly homogenous cell populations may in fact overlook significant heterogeneity that can be uncovered at the single-cell level. The ultra-low amount of RNA contained in a single cell requires extraordinarily sensitive and reproducible transcriptome analysis methods. As next-generation sequencing (NGS) technologies mature, transcriptome profiling by RNA-seq is increasingly being used to decipher the molecular signature of individual cells. This unit describes an ultra-sensitive and reproducible protocol to generate cDNA and sequencing libraries directly from single cells or RNA inputs ranging from 10 pg to 10 ng. Important considerations for working with minute RNA inputs are given. © 2016 by John Wiley & Sons, Inc.
The Extent of mRNA Editing Is Limited in Chicken Liver and Adipose, but Impacted by Tissular Context, Genotype, Age, and Feeding as Exemplified with a Conserved Edited Site in COG3.

PubMed

Roux, Pierre-François; Frésard, Laure; Boutin, Morgane; Leroux, Sophie; Klopp, Christophe; Djari, Anis; Esquerré, Diane; Martin, Pascal G P; Zerjal, Tatiana; Gourichon, David; Pitel, Frédérique; Lagarrigue, Sandrine

2015-12-04

RNA editing is a posttranscriptional process leading to differences between genomic DNA and transcript sequences, potentially enhancing transcriptome diversity. With recent advances in high-throughput sequencing, many efforts have been made to describe mRNA editing at the transcriptome scale, especially in mammals, yielding contradictory conclusions regarding the extent of this phenomenon. We show, by detailed description of the 25 studies focusing so far on mRNA editing at the whole-transcriptome scale, that systematic sequencing artifacts are considered in most studies whereas biological replication is often neglected and multi-alignment not properly evaluated, which ultimately impairs the legitimacy of results. We recently developed a rigorous strategy to identify mRNA editing using mRNA and genomic DNA sequencing, taking into account sequencing and mapping artifacts, and biological replicates. We applied this method to screen for mRNA editing in liver and white adipose tissue from eight chickens and confirm the small extent of mRNA recoding in this species. Among the 25 unique edited sites identified, three events were previously described in mammals, attesting that this phenomenon is conserved throughout evolution. Deeper investigations on five sites revealed the impact of tissular context, genotype, age, feeding conditions, and sex on mRNA editing levels. More specifically, this analysis highlighted that the editing level at the site located on COG3 was strongly regulated by four of these factors. By comprehensively characterizing the mRNA editing landscape in chickens, our results highlight how this phenomenon is limited and suggest regulation of editing levels by various genetic and environmental factors. Copyright © 2016 Roux et al. PMID:26637431
20180312 - Application of a Multiplexed High Content Imaging (HCI) Based Cell Viability and Apoptosis Chemical Screening Assay with Results in MCF-7 Cells (SOT)

EPA Science Inventory

The NCCT high throughput transcriptomics (HTTr) screening program uses a whole-transcriptome profiling assay in human-derived cells to collect concentration-response data for large numbers (100s-1000s) of environmental chemicals. To contextualize HTTr data, chemical effects on cell...
Differential Expression Patterns in Chemosensory and Non-Chemosensory Tissues of Putative Chemosensory Genes Identified by Transcriptome Analysis of Insect Pest the Purple Stem Borer Sesamia inferens (Walker)

PubMed Central

Zhang, Ya-Nan; Jin, Jun-Yan; Jin, Rong; Xia, Yi-Han; Zhou, Jing-Jiang; Deng, Jian-Yu; Dong, Shuang-Lin

2013-01-01

Background A large number of insect chemosensory genes from different gene subfamilies have been identified and annotated, but their functional diversity and complexity are largely unknown. A systematic examination of expression patterns in chemosensory organs could provide important information. Methodology/Principal Findings We identified 92 putative chemosensory genes by analysing the transcriptome of the antennae and female sex pheromone gland of the purple stem borer Sesamia inferens, of which 87 are novel in this species, including 24 transcripts encoding odorant binding proteins (OBPs), 24 for chemosensory proteins (CSPs), 2 for sensory neuron membrane proteins (SNMPs), 39 for odorant receptors (ORs) and 3 for ionotropic receptors (IRs). The transcriptome analyses were validated and quantified with a detailed global expression profiling by Reverse Transcription-PCR for all 92 transcripts and by Quantitative Real Time RT-PCR for 16 selected transcripts. Among the chemosensory gene subfamilies, CSP transcripts are most widely and evenly expressed in different tissues and stages, OBP transcripts showed a clear antenna bias and most OR transcripts are only detected in adult antennae. Our results also revealed that some OR transcripts, such as the transcripts of SNMP2 and 2 IRs were expressed in non-chemosensory tissues, and some CSP transcripts showed antenna-biased expression. Furthermore, no chemosensory transcript is specific to the female sex pheromone gland and very few are found in the heads. Conclusion Our study revealed that there are a large number of chemosensory genes expressed in S. inferens, and some of them displayed unusual expression profiles in non-chemosensory tissues. The identification of a large set of putative chemosensory genes of each subfamily from a single insect species, together with their different expression profiles, provides further information for understanding the functions of these chemosensory genes in S. inferens as well as other insects. PMID:23894529
Differential expression patterns in chemosensory and non-chemosensory tissues of putative chemosensory genes identified by transcriptome analysis of insect pest the purple stem borer Sesamia inferens (Walker).

PubMed

Zhang, Ya-Nan; Jin, Jun-Yan; Jin, Rong; Xia, Yi-Han; Zhou, Jing-Jiang; Deng, Jian-Yu; Dong, Shuang-Lin

2013-01-01

A large number of insect chemosensory genes from different gene subfamilies have been identified and annotated, but their functional diversity and complexity are largely unknown. A systemic examination of expression patterns in chemosensory organs could provide important information. We identified 92 putative chemosensory genes by analysing the transcriptome of the antennae and female sex pheromone gland of the purple stem borer Sesamia inferens, 87 of which are novel in this species. The 92 comprise 24 transcripts encoding odorant binding proteins (OBPs), 24 for chemosensory proteins (CSPs), 2 for sensory neuron membrane proteins (SNMPs), 39 for odorant receptors (ORs) and 3 for ionotropic receptors (IRs). The transcriptome analyses were validated and quantified by detailed global expression profiling, using reverse transcription PCR for all 92 transcripts and quantitative real-time RT-PCR for 16 selected ones. Among the chemosensory gene subfamilies, CSP transcripts are most widely and evenly expressed in different tissues and stages, OBP transcripts show a clear antenna bias, and most OR transcripts are detected only in adult antennae. Our results also revealed that some OR transcripts, such as the transcripts of SNMP2 and 2 IRs, were expressed in non-chemosensory tissues, and some CSP transcripts showed antenna-biased expression. Furthermore, no chemosensory transcript is specific to the female sex pheromone gland and very few are found in the heads. Our study revealed that a large number of chemosensory genes are expressed in S. inferens, and some of them displayed unusual expression profiles in non-chemosensory tissues. The identification of a large set of putative chemosensory genes of each subfamily from a single insect species, together with their different expression profiles, provides further information for understanding the functions of these chemosensory genes in S. inferens as well as other insects.
An Alternative Strategy for Trypanosome Survival in the Mammalian Bloodstream Revealed through Genome and Transcriptome Analysis of the Ubiquitous Bovine Parasite Trypanosoma (Megatrypanum) theileri

PubMed Central

Kelly, Steven; Ivens, Alasdair; Mott, G. Adam; O’Neill, Ellis; Emms, David; Macleod, Olivia; Voorheis, Paul; Tyler, Kevin; Clark, Matthew; Matthews, Jacqueline

2017-01-01

There are hundreds of Trypanosoma species that live in the blood and tissue spaces of their vertebrate hosts. The vast majority of these do not have the ornate system of antigenic variation that has evolved in the small number of African trypanosome species, but can still maintain long-term infections in the face of the vertebrate adaptive immune system. Trypanosoma theileri is a typical example: it has a restricted host range of cattle and other Bovinae, and is only occasionally reported to cause patent disease, although no systematic survey of the effect of infection on agricultural productivity has been performed. Here, a detailed genome sequence and a transcriptome analysis of gene expression in bloodstream form T. theileri have been performed. Analysis of the genome sequence and expression showed that T. theileri has a typical kinetoplastid genome structure and allowed a prediction that it is capable of meiotic exchange, gene silencing via RNA interference and, potentially, density-dependent growth control. In particular, the transcriptome analysis has allowed a comparison of two distinct trypanosome cell surfaces, T. brucei and T. theileri, that have each evolved to enable the maintenance of a long-term extracellular infection in cattle. The T. theileri cell surface can be modeled to contain a mixture of proteins encoded by four novel large and divergent gene families and by members of a major surface protease gene family. This surface composition is distinct from the uniform variant surface glycoprotein coat on African trypanosomes, providing an insight into a second mechanism used by trypanosome species that proliferate in an extracellular milieu in vertebrate hosts to avoid the adaptive immune response. PMID:28903536
Transcriptome and proteome analysis of tyrosine kinase inhibitor treated canine mast cell tumour cells identifies potentially kit signaling-dependent genes

PubMed Central

2012-01-01

Background: Canine mast cell tumour proliferation depends to a large extent on the activity of KIT, a tyrosine kinase receptor. Inhibitors of the KIT tyrosine kinase have recently been introduced and successfully applied as therapeutic agents for this tumour type. However, little is known about the downstream target genes of this signaling pathway and the molecular changes after inhibition. Results: Transcriptome analysis of the canine mast cell tumour cell line C2 treated for up to 72 hours with the tyrosine kinase inhibitor masitinib identified significant changes in the expression levels of approximately 3500 genes or 16% of the canine genome. Approximately 40% of these genes had increased mRNA expression levels, including genes associated with the pro-proliferative pathways of B- and T-cell receptors, chemokine receptors, steroid hormone receptors and EPO-, RAS and MAP kinase signaling. Proteome analysis of C2 cells treated for 72 hours identified 24 proteins with changed expression levels, most of which are involved in gene transcription (e.g. EIA3, EIA4, TARDBP), protein folding (e.g. HSP90, UCHL3, PDIA3) and protection from oxidative stress (GSTT3, SELENBP1). Conclusions: Transcriptome and proteome analysis of neoplastic canine mast cells treated with masitinib confirmed the important and complex role of KIT in these cells. Approximately 16% of the total canine genome and thus the majority of the active genes were significantly transcriptionally regulated. Most of these changes were associated with reduced proliferation and metabolism of treated cells. Interestingly, several pro-proliferative pathways were up-regulated, which may represent attempts of masitinib-treated cells to activate alternative pro-proliferative pathways. These pathways may contain hypothetical targets for a combination therapy with masitinib to further improve its therapeutic effect. PMID:22747577
Functional sequencing read annotation for high precision microbiome analysis

PubMed Central

Zhu, Chengsheng; Miller, Maximilian; Marpaka, Srinayani; Vaysberg, Pavel; Rühlemann, Malte C; Wu, Guojun; Heinsen, Femke-Anouska; Tempel, Marie; Zhao, Liping; Lieb, Wolfgang; Franke, Andre; Bromberg, Yana

2018-01-01

The vast majority of microorganisms on Earth reside in often-inseparable environment-specific communities—microbiomes. Meta-genomic/-transcriptomic sequencing could reveal the otherwise inaccessible functionality of microbiomes. However, existing analytical approaches focus on attributing sequencing reads to known genes/genomes, often failing to make maximal use of available data. We created faser (functional annotation of sequencing reads), an algorithm that is optimized to map reads to molecular functions encoded by the read-correspondent genes. The mi-faser microbiome analysis pipeline, combining faser with our manually curated reference database of protein functions, accurately annotates microbiome molecular functionality. mi-faser’s minutes-per-microbiome processing speed is significantly faster than that of other methods, allowing for large scale comparisons. Microbiome function vectors can be compared between different conditions to highlight environment-specific and/or time-dependent changes in functionality. Here, we identified previously unseen oil degradation-specific functions in BP oil-spill data, as well as functional signatures of individual-specific gut microbiome responses to a dietary intervention in children with Prader–Willi syndrome. Our method also revealed variability in Crohn's Disease patient microbiomes and clearly distinguished them from those of related healthy individuals. Our analysis highlighted the microbiome role in CD pathogenicity, demonstrating enrichment of patient microbiomes in functions that promote inflammation and that help bacteria survive it. PMID:29194524
EuPathDB: the eukaryotic pathogen genomics database resource

PubMed Central

Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie

2017-01-01

The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
Determination of male strobilus developmental stages by cytological and gene expression analyses in Japanese cedar (Cryptomeria japonica).

PubMed

Tsubomura, Miyoko; Kurita, Manabu; Watanabe, Atsushi

2016-05-01

The molecular mechanisms that control male strobilus development in conifers are largely unknown because the developmental stages and related genes have not yet been characterized. The determination of male strobilus developmental stages will contribute to genetic research and reproductive biology in conifers. Our objectives in this study were to determine the developmental stages of male strobili by cytological and transcriptome analysis, and to determine the stages at which aberrant morphology is observed in a male-sterile mutant of Cryptomeria japonica D. Don to better understand the molecular mechanisms that control male strobilus and pollen development. Male strobilus development was observed for 8 months, from initiation to pollen dispersal. A set of 19,209 expressed sequence tags (ESTs) collected from a male reproductive library and a pollen library was used for microarray analysis. We divided male strobilus development into 10 stages by cytological and transcriptome analysis. Eight clusters (7324 ESTs) exhibited major changes in transcriptome profiles during male strobilus and pollen development in C. japonica. Two clusters showed a gradual increase and decline in transcript abundance, respectively, while the other six clusters exhibited stage-specific changes. The stages at which the male sterility trait of Sosyun was expressed were identified using information on male strobilus and pollen developmental stages and gene expression profiles. Aberrant morphology was observed cytologically at Stage 6 (microspore stage), and differences in expression patterns compared with wild type were observed at Stage 4 (tetrad stage). © The Author 2016. Published by Oxford University Press. All rights reserved.
Interactome analysis of longitudinal pharyngeal infection of cynomolgus macaques by group A Streptococcus.

PubMed

Shea, Patrick R; Virtaneva, Kimmo; Kupko, John J; Porcella, Stephen F; Barry, William T; Wright, Fred A; Kobayashi, Scott D; Carmody, Aaron; Ireland, Robin M; Sturdevant, Daniel E; Ricklefs, Stacy M; Babar, Imran; Johnson, Claire A; Graham, Morag R; Gardner, Donald J; Bailey, John R; Parnell, Michael J; Deleo, Frank R; Musser, James M

2010-03-09

Relatively little is understood about the dynamics of global host-pathogen transcriptome changes that occur during bacterial infection of mucosal surfaces. To test the hypothesis that group A Streptococcus (GAS) infection of the oropharynx provokes a distinct host transcriptome response, we performed genome-wide transcriptome analysis using a nonhuman primate model of experimental pharyngitis. We also identified host and pathogen biological processes and individual host and pathogen gene pairs with correlated patterns of expression, suggesting interaction. For this study, 509 host genes and seven biological pathways were differentially expressed throughout the entire 32-day infection cycle. GAS infection produced an initial widespread significant decrease in expression of many host genes, including those involved in cytokine production, vesicle formation, metabolism, and signal transduction. This repression lasted until day 4, at which time a large increase in expression of host genes was observed, including those involved in protein translation, antigen presentation, and GTP-mediated signaling. The interactome analysis identified 73 host and pathogen gene pairs with correlated expression levels. We discovered significant correlations between transcripts of GAS genes involved in hyaluronic capsule production and host endocytic vesicle formation, GAS GTPases and host fibrinolytic genes, and GAS response to interaction with neutrophils. We also identified a strong signal, suggesting interaction between host gammadelta T cells and genes in the GAS mevalonic acid synthesis pathway responsible for production of isopentenyl-pyrophosphate, a short-chain phospholipid that stimulates these T cells. Taken together, our results are unique in providing a comprehensive understanding of the host-pathogen interactome during mucosal infection by a bacterial pathogen.
Dinoflagellate phylogeny revisited: Using ribosomal proteins to resolve deep branching dinoflagellate clades

PubMed Central

Bachvaroff, Tsvetan R.; Gornik, Sebastian G.; Concepcion, Gregory T.; Waller, Ross F.; Mendez, Gregory S.; Lippmeier, J. Casey; Delwiche, Charles F.

2014-01-01

The alveolates are composed of three major lineages, the ciliates, dinoflagellates, and apicomplexans. Together these ‘protist’ taxa play key roles in primary production and ecology, as well as in illness of humans and other animals. The interface between the dinoflagellate and apicomplexan clades has been an area of recent discovery, blurring the distinction between these two clades. Moreover, phylogenetic analysis has yet to determine the position of basal dinoflagellate clades; hence the deepest branches of the dinoflagellate tree currently remain unresolved. Large-scale mRNA sequencing was applied to 11 species of dinoflagellates, including strains of the syndinean genera Hematodinium and Amoebophrya, parasites of crustaceans and dinoflagellates, respectively, to optimize and update the dinoflagellate tree. From the transcriptome-scale data a total of 73 ribosomal protein-coding genes were selected for phylogeny. After individual gene orthology assessment, the genes were concatenated into a >15,000 amino acid alignment with 76 taxa from dinoflagellates, apicomplexans, ciliates, and the outgroup heterokonts. Overall the tree was well resolved and supported, when the data was subsampled with gblocks or constraint trees were tested with the approximately unbiased test. The deepest branches of the dinoflagellate tree can now be resolved with strong support, providing a clearer view of the evolution of the distinctive traits of dinoflagellates. PMID:24135237
Hypertranscription in development, stem cells, and regeneration

PubMed Central

Percharde, Michelle; Bulut-Karslioglu, Aydan; Ramalho-Santos, Miguel

2016-01-01

Cells can globally up-regulate their transcriptome during specific transitions, a phenomenon called hypertranscription. Evidence for hypertranscription dates back over 70 years, but it has gone largely ignored in the genomics era until recently. We discuss data supporting the notion that hypertranscription is a unifying theme in embryonic development, stem cell biology, regeneration and cell competition. We review the history, methods for analysis, underlying mechanisms and biological significance of hypertranscription. PMID:27989554
Transcriptomic analysis of persistent infection with foot-and-mouth disease virus in cattle suggests impairment of cell-mediated immunity in the nasopharynx

USDA-ARS's Scientific Manuscript database

In order to investigate the mechanisms of persistent foot-and-mouth disease virus (FMDV) infection in cattle, transcriptome alterations associated with the FMDV carrier state were characterized using a bovine whole-transcriptome microarray. Eighteen cattle (8 vaccinated with a recombinant FMDV A vac...
New approach for the study of mite reproduction: the first transcriptome analysis of a mite, Phytoseiulus persimilis (Acari: Phytoseiidae)

USDA-ARS's Scientific Manuscript database

Many species of mites and ticks are of agricultural and medical importance. Much can be learned from the study of transcriptomes of acarines which can generate DNA-sequence information of potential target genes for the control of acarine pests. High throughput transcriptome sequencing can also yie...
Transcriptional changes of rice in response to rice black-streaked dwarf virus.

PubMed

Ahmed, Mohamed M S; Ji, Wen; Wang, Muyue; Bian, Shiquan; Xu, Meng; Wang, Weiyun; Zhang, Jiangxiang; Xu, Zhihao; Yu, Meimei; Liu, Qiaoquan; Zhang, Changquan; Zhang, Honggen; Tang, Shuzhu; Gu, Minghong; Yu, Hengxiu

2017-09-10

Rice black-streaked dwarf virus (RBSDV), a member of the genus Fijivirus in the family Reoviridae, causes significant economic losses in rice production in China and many other Asian countries. Although a great deal of effort has been made to elucidate the interactions among the virus, insect vectors, host and environmental conditions, few RBSDV proteins involved in pathogenesis have been identified, and the biological basis of disease development in rice remains largely unknown. Transcriptomic information associated with disease development in rice would help to unravel the biological mechanism. To determine how the rice transcriptome changes in response to RBSDV infection, we carried out RNA-Seq to perform a genome-wide gene expression analysis of a susceptible rice cultivar, KTWYJ3. The transcriptomes of RBSDV-infected samples were compared to those of RBSDV-free (healthy) samples at two time points (represented by groups I and II). The differential expression analysis of RBSDV-infected libraries vs. healthy ones in group I revealed that, of a total of 281 significant differentially expressed genes (DEGs), 102 were up-regulated and 179 were down-regulated. Of the 2592 identified DEGs in group II, 1588 DEGs were up-regulated and 1004 DEGs were down-regulated. A total of 66 DEGs were commonly identified in both groups. Of these 66 DEGs, expression patterns for 36 DEGs were similar in both groups. Our analysis demonstrated that some genes related to disease defense and stress resistance were up-regulated while genes associated with chloroplast were down-regulated in response to RBSDV infection. In addition, some genes associated with plant height were differentially expressed. This result indicates that those genes might be involved in dwarf symptoms caused by RBSDV. Taken together, our results provide a genome-wide transcriptome analysis for rice plants in response to RBSDV infection which may contribute to the understanding of the regulatory mechanisms involved in rice-RBSDV interaction and the biological basis of rice black-streaked dwarf disease development in rice. Copyright © 2017 Elsevier B.V. All rights reserved.
