Plant genome and transcriptome annotations: from misconceptions to simple solutions
Bolger, Marie E; Arsova, Borjana; Usadel, Björn
2018-01-01
Abstract Next-generation sequencing has triggered an explosion of available genomic and transcriptomic resources in the plant sciences. Although genome and transcriptome sequencing has become orders of magnitude cheaper and more efficient, the functional annotation process often lags behind. This might be hampered by the lack of a comprehensive enumeration of simple-to-use tools available to the plant researcher. In this comprehensive review, we present (i) typical ontologies to be used in the plant sciences, (ii) useful databases and resources used for functional annotation, (iii) what to expect from an annotated plant genome, (iv) an automated annotation pipeline and (v) a recipe and reference chart outlining typical steps used to annotate plant genomes/transcriptomes using publicly available resources. PMID:28062412
De novo transcriptome assembly of drought tolerant CAM plants, Agave deserti and Agave tequilana.
Gross, Stephen M; Martin, Jeffrey A; Simpson, June; Abraham-Juarez, María Jazmín; Wang, Zhong; Visel, Axel
2013-08-19
Agaves are succulent monocotyledonous plants native to xeric environments of North America. Because of their adaptations to their environment, including crassulacean acid metabolism (CAM, a water-efficient form of photosynthesis), and existing technologies for ethanol production, agaves have gained attention both as potential lignocellulosic bioenergy feedstocks and models for exploring plant responses to abiotic stress. However, the lack of comprehensive Agave sequence datasets limits the scope of investigations into the molecular-genetic basis of Agave traits. Here, we present comprehensive, high quality de novo transcriptome assemblies of two Agave species, A. tequilana and A. deserti, built from short-read RNA-seq data. Our analyses support completeness and accuracy of the de novo transcriptome assemblies, with each species having a minimum of approximately 35,000 protein-coding genes. Comparison of agave proteomes to those of additional plant species identifies biological functions of gene families displaying sequence divergence in agave species. Additionally, a focus on the transcriptomics of the A. deserti juvenile leaf confirms evolutionary conservation of monocotyledonous leaf physiology and development along the proximal-distal axis. Our work presents a comprehensive transcriptome resource for two Agave species and provides insight into their biology and physiology. These resources are a foundation for further investigation of agave biology and their improvement for bioenergy development.
De novo transcriptome assembly of drought tolerant CAM plants, Agave deserti and Agave tequilana
2013-01-01
Background Agaves are succulent monocotyledonous plants native to xeric environments of North America. Because of their adaptations to their environment, including crassulacean acid metabolism (CAM, a water-efficient form of photosynthesis), and existing technologies for ethanol production, agaves have gained attention both as potential lignocellulosic bioenergy feedstocks and models for exploring plant responses to abiotic stress. However, the lack of comprehensive Agave sequence datasets limits the scope of investigations into the molecular-genetic basis of Agave traits. Results Here, we present comprehensive, high quality de novo transcriptome assemblies of two Agave species, A. tequilana and A. deserti, built from short-read RNA-seq data. Our analyses support completeness and accuracy of the de novo transcriptome assemblies, with each species having a minimum of approximately 35,000 protein-coding genes. Comparison of agave proteomes to those of additional plant species identifies biological functions of gene families displaying sequence divergence in agave species. Additionally, a focus on the transcriptomics of the A. deserti juvenile leaf confirms evolutionary conservation of monocotyledonous leaf physiology and development along the proximal-distal axis. Conclusions Our work presents a comprehensive transcriptome resource for two Agave species and provides insight into their biology and physiology. These resources are a foundation for further investigation of agave biology and their improvement for bioenergy development. PMID:23957668
The aquatic animals' transcriptome resource for comparative functional analysis.
Chou, Chih-Hung; Huang, Hsi-Yuan; Huang, Wei-Chih; Hsu, Sheng-Da; Hsiao, Chung-Der; Liu, Chia-Yu; Chen, Yu-Hung; Liu, Yu-Chen; Huang, Wei-Yun; Lee, Meng-Lin; Chen, Yi-Chang; Huang, Hsien-Da
2018-05-09
Aquatic animals have great economic and ecological importance. Among them, non-model organisms have been studied regarding eco-toxicity, stress biology, and environmental adaptation. Due to recent advances in next-generation sequencing techniques, large amounts of RNA-seq data for aquatic animals are publicly available. However, no comprehensive resource currently exists for the analysis, unification, and integration of these datasets. This study utilizes computational approaches to build a new resource of transcriptomic maps for aquatic animals. This aquatic animal transcriptome map database, dbATM, provides de novo transcriptome assembly, gene annotation and comparative analysis for more than twenty aquatic organisms without a draft genome. To improve the assembly quality, three computational tools (Trinity, Oases and SOAPdenovo-Trans) were employed to enhance individual transcriptome assembly, and CAP3 and CD-HIT-EST software were then used to merge these three assembled transcriptomes. In addition, functional annotation analysis provides valuable clues to gene characteristics, including full-length transcript coding regions, conserved domains, gene ontology and KEGG pathways. Furthermore, all aquatic animal genes are essential for comparative genomics tasks such as constructing homologous gene groups and BLAST databases and phylogenetic analysis. In conclusion, we establish a resource for non-model aquatic animals, which are of great economic and ecological importance, and provide transcriptomic information including functional annotation and comparative transcriptome analysis. The database is now publicly accessible through the URL http://dbATM.mbc.nctu.edu.tw/ .
Chauhan, Pallavi; Hansson, Bengt; Kraaijeveld, Ken; de Knijff, Peter; Svensson, Erik I; Wellenreuther, Maren
2014-09-22
There is growing interest in odonates (damselflies and dragonflies) as model organisms in ecology and evolutionary biology, but the development of genomic resources has been slow. So far only one draft genome (Ladona fulva) and one transcriptome assembly (Enallagma hageni) have been published. Odonates have some of the most advanced visual systems among insects and several species are colour polymorphic, and genomic and transcriptomic data would allow studying the genomic architecture of these interesting traits and make detailed comparative studies between related species possible. Here, we present a comprehensive de novo transcriptome assembly for the blue-tailed damselfly Ischnura elegans (Odonata: Coenagrionidae) built from short-read RNA-seq data. The transcriptome analysis in this paper provides a first step towards identifying genes and pathways underlying the visual and colour systems in this insect group. Illumina RNA sequencing performed on tissues from the head, thorax and abdomen generated 428,744,100 paired-end reads amounting to 110 Gb of sequence data, which was assembled de novo with Trinity. A transcriptome was produced after filtering and quality checking, yielding a final set of 60,232 high quality transcripts for analysis. CEGMA software identified 247 out of 248 ultra-conserved core proteins as 'complete' in the transcriptome assembly, yielding a completeness of 99.6%. BLASTX and InterProScan annotated 55% of the assembled transcripts and showed that the three tissue types differed both qualitatively and quantitatively in I. elegans. Differential expression identified 8,625 transcripts to be differentially expressed in head, thorax and abdomen. Targeted analyses of vision and colour functional pathways identified the presence of four different opsin types and three pigmentation pathways. We also identified transcripts involved in temperature sensitivity, thermoregulation and olfaction.
All these traits and their associated transcripts are of considerable ecological and evolutionary interest for this and other insect orders. Our work presents a comprehensive transcriptome resource for the ancient insect order Odonata and provides insight into their biology and physiology. The transcriptomic resource can provide a foundation for future investigations into this diverse group, including the evolution of colour, vision, olfaction and thermal adaptation.
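Note on the completeness figure in the record above: the 99.6% value is the share of core proteins reported as 'complete' (247 of 248). The Python sketch below only reproduces that arithmetic. The function name `completeness_percent` is a hypothetical helper introduced for illustration and is not part of CEGMA or any other tool named in the abstract.

```python
def completeness_percent(found: int, total: int) -> float:
    """Return the percentage of core genes found as complete.

    Hypothetical helper for illustration; it only computes a ratio
    and does not run or parse the output of any completeness tool.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= found <= total:
        raise ValueError("found must lie between 0 and total")
    return 100.0 * found / total


# Counts quoted in the abstract: 247 of 248 core proteins 'complete'.
print(round(completeness_percent(247, 248), 1))  # prints 99.6
```

The same ratio applies to the percentage-style completeness figures quoted in other records in this list, provided the numerator and denominator are defined the same way.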
Improving amphibian genomic resources: a multitissue reference transcriptome of an iconic invader.
Richardson, Mark F; Sequeira, Fernando; Selechnik, Daniel; Carneiro, Miguel; Vallinoto, Marcelo; Reid, Jack G; West, Andrea J; Crossland, Michael R; Shine, Richard; Rollins, Lee A
2018-01-01
Cane toads (Rhinella marina) are an iconic invasive species introduced to 4 continents and well utilized for studies of rapid evolution in introduced environments. Despite the long introduction history of this species, its profound ecological impacts, and its utility for demonstrating evolutionary principles, genetic information is sparse. Here we produce a de novo transcriptome spanning multiple tissues and life stages to enable investigation of the genetic basis of previously identified rapid phenotypic change over the introduced range. Using approximately 1.9 billion reads from developing tadpoles and 6 adult tissue-specific cDNA libraries, as well as a transcriptome assembly pipeline encompassing 100 separate de novo assemblies, we constructed 62 202 transcripts, of which we functionally annotated ∼50%. Our transcriptome assembly exhibits 90% full-length completeness of the Benchmarking Universal Single-Copy Orthologs data set. Robust assembly metrics and comparisons with several available anuran transcriptomes and genomes indicate that our cane toad assembly is one of the most complete anuran genomic resources available. This comprehensive anuran transcriptome will provide a valuable resource for investigation of genes under selection during invasion in cane toads, but will also greatly expand our general knowledge of anuran genomes, which are underrepresented in the literature. The data set is publicly available in NCBI and GigaDB to serve as a resource for other researchers. © The Authors 2017. Published by Oxford University Press.
Improving amphibian genomic resources: a multitissue reference transcriptome of an iconic invader
Reid, Jack G; Crossland, Michael R
2018-01-01
Abstract Background Cane toads (Rhinella marina) are an iconic invasive species introduced to 4 continents and well utilized for studies of rapid evolution in introduced environments. Despite the long introduction history of this species, its profound ecological impacts, and its utility for demonstrating evolutionary principles, genetic information is sparse. Here we produce a de novo transcriptome spanning multiple tissues and life stages to enable investigation of the genetic basis of previously identified rapid phenotypic change over the introduced range. Findings Using approximately 1.9 billion reads from developing tadpoles and 6 adult tissue-specific cDNA libraries, as well as a transcriptome assembly pipeline encompassing 100 separate de novo assemblies, we constructed 62 202 transcripts, of which we functionally annotated ∼50%. Our transcriptome assembly exhibits 90% full-length completeness of the Benchmarking Universal Single-Copy Orthologs data set. Robust assembly metrics and comparisons with several available anuran transcriptomes and genomes indicate that our cane toad assembly is one of the most complete anuran genomic resources available. Conclusions This comprehensive anuran transcriptome will provide a valuable resource for investigation of genes under selection during invasion in cane toads, but will also greatly expand our general knowledge of anuran genomes, which are underrepresented in the literature. The data set is publicly available in NCBI and GigaDB to serve as a resource for other researchers. PMID:29186423
Zhao, Feng; Yan, Chao; Wang, Xuan; Yang, Yang; Wang, Guangyin; Lee, Wenhui; Xiang, Yang; Zhang, Yun
2014-01-01
Amphibians occupy a key phylogenetic position in vertebrates and in the evolution of the immune system, but transcriptome and genome resources for them remain scarce. Bombina maxima possesses a strong ability to survive in very harsh environments and has a more mature immune system. We obtained a comprehensive transcriptome by RNA sequencing. 14.3% of transcripts were identified as skin-specific genes, most of which had not been isolated from skin secretions in previous work or were novel non-coding RNAs. 27.9% of transcripts were mapped to 242 predicted KEGG pathways, and 6.16% of transcripts were related to human disease and cancer. Of 39 448 transcripts with a coding sequence, at least 1501 transcripts (570 genes) were related to immune system processes. Nearly all molecules of the immune signalling pathways were present, with several transcripts highly expressed in skin and stomach. Experiments showed that lipopolysaccharide or bacterial challenge stimulated pro-inflammatory cytokine production and activation of pro-inflammatory caspase-1. These frog data can remarkably expand the existing genome and transcriptome resources of amphibians, especially immunity data. The data set as a whole provides a valuable platform for further investigation of more detailed immune responses in B. maxima and for comparative studies with other amphibians. PMID:23942912
RAID: a comprehensive resource for human RNA-associated (RNA–RNA/RNA–protein) interaction
Zhang, Xiaomeng; Wu, Deng; Chen, Liqun; Li, Xiang; Yang, Jinxurong; Fan, Dandan; Dong, Tingting; Liu, Mingyue; Tan, Puwen; Xu, Jintian; Yi, Ying; Wang, Yuting; Zou, Hua; Hu, Yongfei; Fan, Kaili; Kang, Juanjuan; Huang, Yan; Miao, Zhengqiang; Bi, Miaoman; Jin, Nana; Li, Kongning; Li, Xia; Xu, Jianzhen; Wang, Dong
2014-01-01
Transcriptomic analyses have revealed an unexpected complexity in the eukaryote transcriptome, which includes not only protein-coding transcripts but also an expanding catalog of noncoding RNAs (ncRNAs). Diverse coding and noncoding RNAs (ncRNAs) perform functions through interaction with each other in various cellular processes. In this project, we have developed RAID (http://www.rna-society.org/raid), an RNA-associated (RNA–RNA/RNA–protein) interaction database. RAID intends to provide the scientific community with all-in-one resources for efficient browsing and extraction of the RNA-associated interactions in human. This version of RAID contains more than 6100 RNA-associated interactions obtained by manually reviewing more than 2100 published papers, including 4493 RNA–RNA interactions and 1619 RNA–protein interactions. Each entry contains detailed information on an RNA-associated interaction, including RAID ID, RNA/protein symbol, RNA/protein categories, validated method, expressing tissue, literature references (Pubmed IDs), and detailed functional description. Users can query, browse, analyze, and manipulate RNA-associated (RNA–RNA/RNA–protein) interaction. RAID provides a comprehensive resource of human RNA-associated (RNA–RNA/RNA–protein) interaction network. Furthermore, this resource will help in uncovering the generic organizing principles of cellular function network. PMID:24803509
Novel transcriptome resources for three scleractinian coral species from the Indo-Pacific
Kenkel, Carly D.; Bay, Line K
2017-01-01
Abstract Transcriptomic resources for coral species can provide insight into coral evolutionary history and stress-response physiology. Goniopora columna, Galaxea astreata, and Galaxea acrhelia are scleractinian corals of the Indo-Pacific, representing a diversity of morphologies and life-history traits. G. columna and G. astreata are common and cosmopolitan, while G. acrhelia is largely restricted to the coral triangle and Great Barrier Reef. Reference transcriptomes for these species were assembled from replicate colony fragments exposed to elevated (31°C) and ambient (27°C) temperatures. Trinity was used to create de novo assemblies for each species from 92–102 million raw Illumina Hiseq 2 × 150 bp reads. Host-specific assemblies contained 65 460–72 405 contigs, representing 26 693–37 894 isogroups (∼genes) with an average N50 of 2254. Gene name and/or gene ontology annotations were possible for 58% of isogroups on average. Transcriptomes contained 93.1–94.3% of EuKaryotic Orthologous Groups comprising the core eukaryotic gene set, and 89.98–91.92% of the single-copy metazoan core gene set orthologs were complete, indicating fairly comprehensive assemblies. This work expands the complement of transcriptomic resources available for scleractinian coral species, including the first reference for a representative of Goniopora spp. as well as species with novel morphology. PMID:28938722
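Note on the N50 statistic quoted in the record above: N50 is commonly defined as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The Python sketch below illustrates that common definition on made-up contig lengths. The function name `n50` and the example values are assumptions for illustration, and this is not necessarily the script the authors used to obtain their figure.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the
    length of the contig at which the running total first reaches
    half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Unreachable for non-empty input with positive lengths.
    raise RuntimeError("N50 could not be determined")


# Made-up contig lengths (total 32, half 16): 8 + 8 reaches 16.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # prints 8
```

Because N50 depends only on the length distribution, it is usually read alongside the ortholog-based completeness percentages reported in the same abstract rather than on its own.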
Novel transcriptome resources for three scleractinian coral species from the Indo-Pacific.
Kenkel, Carly D; Bay, Line K
2017-09-01
Transcriptomic resources for coral species can provide insight into coral evolutionary history and stress-response physiology. Goniopora columna, Galaxea astreata, and Galaxea acrhelia are scleractinian corals of the Indo-Pacific, representing a diversity of morphologies and life-history traits. G. columna and G. astreata are common and cosmopolitan, while G. acrhelia is largely restricted to the coral triangle and Great Barrier Reef. Reference transcriptomes for these species were assembled from replicate colony fragments exposed to elevated (31°C) and ambient (27°C) temperatures. Trinity was used to create de novo assemblies for each species from 92-102 million raw Illumina Hiseq 2 × 150 bp reads. Host-specific assemblies contained 65 460-72 405 contigs, representing 26 693-37 894 isogroups (∼genes) with an average N50 of 2254. Gene name and/or gene ontology annotations were possible for 58% of isogroups on average. Transcriptomes contained 93.1-94.3% of EuKaryotic Orthologous Groups comprising the core eukaryotic gene set, and 89.98-91.92% of the single-copy metazoan core gene set orthologs were complete, indicating fairly comprehensive assemblies. This work expands the complement of transcriptomic resources available for scleractinian coral species, including the first reference for a representative of Goniopora spp. as well as species with novel morphology. © The Authors 2017. Published by Oxford University Press.
Transcriptome sequencing and de novo analysis of the copepod Calanus sinicus using 454 GS FLX.
Ning, Juan; Wang, Minxiao; Li, Chaolun; Sun, Song
2013-01-01
Despite their species abundance and primary economic importance, genomic information about copepods is still limited. In particular, genomic resources are lacking for the copepod Calanus sinicus, which is a dominant species in the coastal waters of East Asia. In this study, we performed de novo transcriptome sequencing to produce a large number of expressed sequence tags for the copepod C. sinicus. Copepodid larvae and adults were used as the basic material for transcriptome sequencing. Using 454 pyrosequencing, a total of 1,470,799 reads were obtained, which were assembled into 56,809 high quality expressed sequence tags. Based on their sequence similarity to known proteins, about 14,000 different genes were identified, including members of all major conserved signaling pathways. Transcripts that were putatively involved with growth, lipid metabolism, molting, and diapause were also identified among these genes. Differentially expressed genes related to several processes were found in C. sinicus copepodid larvae and adults. We detected 284,154 single nucleotide polymorphisms (SNPs) that provide a resource for gene function studies. Our data provide the most comprehensive transcriptome resource available for C. sinicus. This resource allowed us to identify genes associated with primary physiological processes and SNPs in coding regions, which facilitated the quantitative analysis of differential gene expression. These data should provide a foundation for future genetic and genomic studies of this and related species.
Ochsner, Scott A; Watkins, Christopher M; LaGrone, Benjamin S; Steffen, David L; McKenna, Neil J
2010-10-01
Nuclear receptors (NRs) are ligand-regulated transcription factors that recruit coregulators and other transcription factors to gene promoters to effect regulation of tissue-specific transcriptomes. The prodigious rate at which the NR signaling field has generated high content gene expression and, more recently, genome-wide location analysis datasets has not been matched by a committed effort to archiving this information for routine access by bench and clinical scientists. As a first step towards this goal, we searched the MEDLINE database for studies, which referenced either expression microarray and/or genome-wide location analysis datasets in which a NR or NR ligand was an experimental variable. A total of 1122 studies encompassing 325 unique organs, tissues, primary cells, and cell lines, 35 NRs, and 91 NR ligands were retrieved and annotated. The data were incorporated into a new section of the Nuclear Receptor Signaling Atlas Molecule Pages, Transcriptomics and Cistromics, for which we designed an intuitive, freely accessible user interface to browse the studies. Each study links to an abstract, the MEDLINE record, and, where available, Gene Expression Omnibus and ArrayExpress records. The resource will be updated on a regular basis to provide a current and comprehensive entrez into the sum of transcriptomic and cistromic research in this field.
RAID: a comprehensive resource for human RNA-associated (RNA-RNA/RNA-protein) interaction.
Zhang, Xiaomeng; Wu, Deng; Chen, Liqun; Li, Xiang; Yang, Jinxurong; Fan, Dandan; Dong, Tingting; Liu, Mingyue; Tan, Puwen; Xu, Jintian; Yi, Ying; Wang, Yuting; Zou, Hua; Hu, Yongfei; Fan, Kaili; Kang, Juanjuan; Huang, Yan; Miao, Zhengqiang; Bi, Miaoman; Jin, Nana; Li, Kongning; Li, Xia; Xu, Jianzhen; Wang, Dong
2014-07-01
Transcriptomic analyses have revealed an unexpected complexity in the eukaryote transcriptome, which includes not only protein-coding transcripts but also an expanding catalog of noncoding RNAs (ncRNAs). Diverse coding and noncoding RNAs (ncRNAs) perform functions through interaction with each other in various cellular processes. In this project, we have developed RAID (http://www.rna-society.org/raid), an RNA-associated (RNA-RNA/RNA-protein) interaction database. RAID intends to provide the scientific community with all-in-one resources for efficient browsing and extraction of the RNA-associated interactions in human. This version of RAID contains more than 6100 RNA-associated interactions obtained by manually reviewing more than 2100 published papers, including 4493 RNA-RNA interactions and 1619 RNA-protein interactions. Each entry contains detailed information on an RNA-associated interaction, including RAID ID, RNA/protein symbol, RNA/protein categories, validated method, expressing tissue, literature references (Pubmed IDs), and detailed functional description. Users can query, browse, analyze, and manipulate RNA-associated (RNA-RNA/RNA-protein) interaction. RAID provides a comprehensive resource of human RNA-associated (RNA-RNA/RNA-protein) interaction network. Furthermore, this resource will help in uncovering the generic organizing principles of cellular function network. © 2014 Zhang et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Transcriptome sequencing reveals high isoform diversity in the ant Formica exsecta
Paviala, Jenni; Morandin, Claire; Wheat, Christopher; Sundström, Liselotte; Helanterä, Heikki
2017-01-01
Transcriptome resources for social insects have the potential to provide new insight into polyphenism, i.e., how divergent phenotypes arise from the same genome. Here we present a transcriptome based on paired-end RNA sequencing data for the ant Formica exsecta (Formicidae, Hymenoptera). The RNA sequencing libraries were constructed from samples of several life stages of both sexes and female castes of queens and workers, in order to maximize representation of expressed genes. We first compare the performance of common assembly and scaffolding software (Trinity, Velvet-Oases, and SOAPdenovo-trans), in producing de novo assemblies. Second, we annotate the resulting expressed contigs to the currently published genomes of ants, and other insects, including the honeybee, to filter genes that have annotation evidence of being true genes. Our pipeline resulted in a final assembly of altogether 39,262 mRNA transcripts, with an average coverage of >300X, belonging to 17,496 unique genes with annotation in the related ant species. From these genes, 536 genes were unique to one caste or sex only, highlighting the importance of comprehensive sampling. Our final assembly also showed expression of several splice variants in 6,975 genes, and we show that accounting for splice variants affects the outcome of downstream analyses such as gene ontologies. Our transcriptome provides an outstanding resource for future genetic studies on F. exsecta and other ant species, and the presented transcriptome assembly can be adapted to any non-model species that has genomic resources available from a related taxon. PMID:29177112
Jeon, Jin; Kim, Jae Kwang; Kim, HyeRan; Kim, Yeon Jeong; Park, Yun Ji; Kim, Sun Ju; Kim, Changsoo; Park, Sang Un
2018-02-15
Kale (Brassica oleracea var. acephala) is a rich source of numerous health-benefiting compounds, including vitamins, glucosinolates, phenolic compounds, and carotenoids. However, the genetic resources for exploiting the phyto-nutritional traits of kales are limited. To acquire precise information on secondary metabolites in kales, we performed a comprehensive analysis of the transcriptome and metabolome of green and red kale seedlings. Kale transcriptome datasets revealed 37,149 annotated genes and several secondary metabolite biosynthetic genes. HPLC analysis revealed 14 glucosinolates, 20 anthocyanins, 3 phenylpropanoids, and 6 carotenoids in the kale seedlings that were examined. Red kale contained more glucosinolates, anthocyanins, and phenylpropanoids than green kale, whereas the carotenoid contents were much higher in green kale than in red kale. Ultimately, our data will be a valuable resource for future research on kale bio-engineering and will provide basic information to define gene-to-metabolite networks in kale. Copyright © 2017 Elsevier Ltd. All rights reserved.
García, C Fernando; Pedrini, Nicolas; Sánchez-Paz, Arturo; Reyna-Blanco, Carlos S; Lavarias, Sabrina; Muhlia-Almazán, Adriana; Fernández-Giménez, Analía; Laino, Aldana; de-la-Re-Vega, Enrique; Lukaszewicz, German; López-Zavala, Alonso A; Brieba, Luis G; Criscitello, Michael F; Carrasco-Miranda, Jesús S; García-Orozco, Karina D; Ochoa-Leyva, Adrian; Rudiño-Piñera, Enrique; Sanchez-Flores, Alejandro; Sotelo-Mundo, Rogerio R
2018-02-01
Palaemonetes argentinus, an abundant freshwater prawn species in the northern and central region of Argentina, has been used as a bioindicator of environmental pollutants as it displays a very high sensitivity to pollutant exposure. Despite its extraordinary ecological relevance, a lack of genomic information has hindered a more thorough understanding of the molecular mechanisms potentially involved in detoxification processes of this species. Thus, transcriptomic profiling studies represent a promising approach to overcome the limitations imposed by the lack of extensive genomic resources for P. argentinus, and may improve the understanding of its physiological and molecular response triggered by pollutants. This work represents the first comprehensive transcriptome-based characterization of the non-model species P. argentinus to generate functional genomic annotations and provides valuable resources for future genetic studies. Trinity de novo assembly consisted of 24,738 transcripts with high representation of detoxification (phase I and II), anti-oxidation, osmoregulation pathways and DNA replication and bioenergetics. This crustacean transcriptome provides valuable molecular information about detoxification and biochemical processes that could be applied as biomarkers in further ecotoxicology studies. Copyright © 2017 Elsevier B.V. All rights reserved.
Assessing the gene content of the megagenome: sugar pine (Pinus lambertiana)
Daniel Gonzalez-Ibeas; Pedro J. Martinez-Garcia; Randi A. Famula; Annette Delfino-Mix; Kristian A. Stevens; Carol A. Loopstra; Charles H. Langley; David B. Neale; Jill L. Wegrzyn
2016-01-01
Sugar pine (Pinus lambertiana Douglas) is within the subgenus Strobus with an estimated genome size of 31 Gbp. Transcriptomic resources are of particular interest in conifers due to the challenges presented in their megagenomes for gene identification. In this study, we present the first comprehensive survey of the P. lambertiana...
Transcriptome of interstitial cells of Cajal reveals unique and selective gene signatures
Park, Paul J.; Fuchs, Robert; Wei, Lai; Jorgensen, Brian G.; Redelman, Doug; Ward, Sean M.; Sanders, Kenton M.
2017-01-01
Transcriptome-scale data can reveal essential clues into understanding the underlying molecular mechanisms behind specific cellular functions and biological processes. Transcriptomics is a continually growing field of research utilized in biomarker discovery. The transcriptomic profile of interstitial cells of Cajal (ICC), which serve as slow-wave electrical pacemakers for gastrointestinal (GI) smooth muscle, has yet to be uncovered. Using copGFP-labeled ICC mice and flow cytometry, we isolated ICC populations from the murine small intestine and colon and obtained their transcriptomes. In analyzing the transcriptome, we identified a unique set of ICC-restricted markers including transcription factors, epigenetic enzymes/regulators, growth factors, receptors, protein kinases/phosphatases, and ion channels/transporters. This analysis provides new and unique insights into the cellular and biological functions of ICC in GI physiology. Additionally, we constructed an interactive ICC genome browser (http://med.unr.edu/physio/transcriptome) based on the UCSC genome database. To our knowledge, this is the first online resource that provides a comprehensive library of all known genetic transcripts expressed in primary ICC. Our genome browser offers a new perspective into the alternative expression of genes in ICC and provides a valuable reference for future functional studies. PMID:28426719
Annadurai, Ramasamy S; Neethiraj, Ramprasad; Jayakumar, Vasanthan; Damodaran, Anand C; Rao, Sudha Narayana; Katta, Mohan A V S K; Gopinathan, Sreeja; Sarma, Santosh Prasad; Senthilkumar, Vanitha; Niranjan, Vidya; Gopinath, Ashok; Mugasimangalam, Raja C
2013-01-01
Herbal remedies have been increasingly recognised in recent years as alternative medicine for a number of diseases, including cancer. Curcuma longa L., commonly known as turmeric, is used as a culinary spice in India and many other Asian countries, and its use has been linked to lower incidences of gastrointestinal cancers. Curcumin, a secondary metabolite isolated from the rhizomes of this plant, has been shown to have significant anticancer properties, in addition to antimalarial and antioxidant effects. We sequenced the rhizome transcriptome of 3 varieties of Curcuma longa L. using Illumina reversible dye terminator sequencing followed by de novo transcriptome assembly. Multiple databases were used to obtain a comprehensive annotation and the transcripts were functionally classified using GO, KOG and PlantCyc. Special emphasis was given to annotating the secondary metabolite pathways and terpenoid biosynthesis pathways. We report, for the first time, the presence of transcripts related to biosynthetic pathways of several anti-cancer compounds like taxol, curcumin, and vinblastine in addition to anti-malarial compounds like artemisinin and acridone alkaloids, emphasizing turmeric's importance as a highly potent phytochemical. Our data not only provide molecular signatures for several terpenoids but also a comprehensive molecular resource for facilitating deeper insights into the transcriptome of C. longa.
Jayakumar, Vasanthan; Damodaran, Anand C.; Rao, Sudha Narayana; Katta, Mohan A. V. S. K.; Gopinathan, Sreeja; Sarma, Santosh Prasad; Senthilkumar, Vanitha; Niranjan, Vidya; Gopinath, Ashok; Mugasimangalam, Raja C.
2013-01-01
Herbal remedies have been increasingly recognised in recent years as alternative medicine for a number of diseases, including cancer. Curcuma longa L., commonly known as turmeric, is used as a culinary spice in India and many other Asian countries, and its consumption has been associated with lower incidences of gastrointestinal cancers. Curcumin, a secondary metabolite isolated from the rhizomes of this plant, has been shown to have significant anticancer properties, in addition to antimalarial and antioxidant effects. We sequenced the rhizome transcriptomes of 3 varieties of Curcuma longa L. using Illumina reversible dye terminator sequencing followed by de novo transcriptome assembly. Multiple databases were used to obtain a comprehensive annotation, and the transcripts were functionally classified using GO, KOG and PlantCyc. Special emphasis was given to annotating the secondary metabolite and terpenoid biosynthesis pathways. We report, for the first time, the presence of transcripts related to the biosynthetic pathways of several anti-cancer compounds such as taxol, curcumin, and vinblastine, in addition to anti-malarial compounds such as artemisinin and acridone alkaloids, emphasizing turmeric's importance as a source of highly potent phytochemicals. Our data provide not only molecular signatures for several terpenoids but also a comprehensive molecular resource for facilitating deeper insights into the transcriptome of C. longa. PMID:23468859
Fukushima, Atsushi; Nakamura, Michimi; Suzuki, Hideyuki; Yamazaki, Mami; Knoch, Eva; Mori, Tetsuya; Umemoto, Naoyuki; Morita, Masaki; Hirai, Go; Sodeoka, Mikiko; Saito, Kazuki
2016-01-01
The genus Physalis in the Solanaceae family contains several species of benefit to humans. Examples include P. alkekengi (Chinese-lantern plant, hôzuki in Japanese), used for medicinal and decorative purposes, and P. peruviana, also known as Cape gooseberry, which bears an edible, vitamin-rich fruit. Members of the Physalis genus are a valuable resource for phytochemicals needed for the development of medicines and functional foods. To fully utilize the potential of these phytochemicals we need to understand their biosynthesis, and for this we need genomic data, especially comprehensive transcriptome datasets for gene discovery. We report the de novo assembly of the transcriptome from leaves of P. alkekengi and P. peruviana using Illumina RNA-seq technologies. We identified 75,221 unigenes in P. alkekengi and 54,513 in P. peruviana. All unigenes were annotated with gene ontology (GO), Enzyme Commission (EC) numbers, and pathway information from the Kyoto Encyclopedia of Genes and Genomes (KEGG). We classified unigenes encoding enzyme candidates putatively involved in secondary metabolism and identified more than one unigene for each step in terpenoid backbone and steroid biosynthesis in P. alkekengi and P. peruviana. To measure the variability of the withanolides, including physalins, and provide insights into their chemical diversity in Physalis, we also analyzed the metabolite content in leaves of P. alkekengi and P. peruviana at five different developmental stages by liquid chromatography-mass spectrometry. We discuss how comprehensive transcriptome approaches within a family can yield clues for gene discovery in Physalis and provide insights into their complex chemical diversity. The transcriptome information we submit here will serve as an important public resource for further studies of the specialized metabolism of Physalis species. PMID:28066454
Necklace: combining reference and assembled transcriptomes for more comprehensive RNA-Seq analysis.
Davidson, Nadia M; Oshlack, Alicia
2018-05-01
RNA sequencing (RNA-seq) analyses can benefit from performing a genome-guided and de novo assembly, in particular for species where the reference genome or the annotation is incomplete. However, tools for integrating an assembled transcriptome with reference annotation are lacking. Necklace is a software pipeline that runs genome-guided and de novo assembly and combines the resulting transcriptomes with reference genome annotations. Necklace constructs a compact but comprehensive superTranscriptome out of the assembled and reference data. Reads are subsequently aligned and counted in preparation for differential expression testing. Necklace allows a comprehensive transcriptome to be built from a combination of assembled and annotated transcripts, which results in a more comprehensive transcriptome for the majority of organisms. In addition RNA-seq data are mapped back to this newly created superTranscript reference to enable differential expression testing with standard methods.
Isoform Sequencing Provides a More Comprehensive View of the Panax ginseng Transcriptome.
Jo, Ick-Hyun; Lee, Jinsu; Hong, Chi Eun; Lee, Dong Jin; Bae, Wonsil; Park, Sin-Gi; Ahn, Yong Ju; Kim, Young Chang; Kim, Jang Uk; Lee, Jung Woo; Hyun, Dong Yun; Rhee, Sung-Keun; Hong, Chang Pyo; Bang, Kyong Hwan; Ryu, Hojin
2017-09-15
Korean ginseng (Panax ginseng C.A. Meyer) has been widely used for medicinal purposes and contains potent plant secondary metabolites, including ginsenosides. To obtain transcriptomic data that offers a more comprehensive view of functional genomics in P. ginseng, we generated genome-wide transcriptome data from four different P. ginseng tissues using PacBio isoform sequencing (Iso-Seq) technology. A total of 135,317 assembled transcripts were generated with an average length of 3.2 kb and high assembly completeness. Of those unigenes, 67.5% were predicted to be complete full-length (FL) open reading frames (ORFs) and exhibited a high gene annotation rate. Furthermore, we successfully identified unique full-length genes involved in triterpenoid saponin synthesis and plant hormonal signaling pathways, including auxin and cytokinin. Studies on the functional genomics of P. ginseng seedlings have confirmed the rapid upregulation of negative feed-back loops by auxin and cytokinin signaling cues. The conserved evolutionary mechanisms in the auxin and cytokinin canonical signaling pathways of P. ginseng are more complex than those in Arabidopsis thaliana. Our analysis also revealed a more detailed view of transcriptome-wide alternative isoforms for 88 genes. Finally, transposable elements (TEs) were also identified, suggesting transcriptional activity of TEs in P. ginseng. In conclusion, our results suggest that long-read, full-length or partial-unigene data with high-quality assemblies are invaluable resources as transcriptomic references in P. ginseng and can be used for comparative analyses in closely related medicinal plants.
KONAGAbase: a genomic and transcriptomic database for the diamondback moth, Plutella xylostella.
Jouraku, Akiya; Yamamoto, Kimiko; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Narukawa, Junko; Miyamoto, Kazuhisa; Kurita, Kanako; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Noda, Hiroaki
2013-07-09
The diamondback moth (DBM), Plutella xylostella, is one of the most harmful insect pests for crucifer crops worldwide. DBM has rapidly evolved high resistance to most conventional insecticides such as pyrethroids, organophosphates, fipronil, spinosad, Bacillus thuringiensis, and diamides. Therefore, it is important to develop genomic and transcriptomic DBM resources for analysis of genes related to insecticide resistance, both to clarify the mechanism of resistance of DBM and to facilitate the development of insecticides with a novel mode of action for more effective and environmentally less harmful insecticide rotation. To contribute to this goal, we developed KONAGAbase, a genomic and transcriptomic database for DBM (KONAGA is the Japanese word for DBM). KONAGAbase provides (1) transcriptomic sequences of 37,340 ESTs/mRNAs and 147,370 RNA-seq contigs which were clustered and assembled into 84,570 unigenes (30,695 contigs, 50,548 pseudo singletons, and 3,327 singletons); and (2) genomic sequences of 88,530 WGS contigs with 246,244 degenerate contigs and 106,455 singletons from which 6,310 de novo identified repeat sequences and 34,890 predicted gene-coding sequences were extracted. The unigenes and predicted gene-coding sequences were clustered and 32,800 representative sequences were extracted as a comprehensive putative gene set. These sequences were annotated with BLAST descriptions, Gene Ontology (GO) terms, and Pfam descriptions, respectively. KONAGAbase contains rich graphical user interface (GUI)-based web interfaces for easy and efficient searching, browsing, and downloading sequences and annotation data. Five useful search interfaces consisting of BLAST search, keyword search, BLAST result-based search, GO tree-based search, and genome browser are provided. KONAGAbase is publicly available from our website (http://dbm.dna.affrc.go.jp/px/) through standard web browsers. 
KONAGAbase provides DBM comprehensive transcriptomic and draft genomic sequences with useful annotation information with easy-to-use web interfaces, which helps researchers to efficiently search for target sequences such as insect resistance-related genes. KONAGAbase will be continuously updated and additional genomic/transcriptomic resources and analysis tools will be provided for further efficient analysis of the mechanism of insecticide resistance and the development of effective insecticides with a novel mode of action for DBM.
Assessing the Gene Content of the Megagenome: Sugar Pine (Pinus lambertiana)
Gonzalez-Ibeas, Daniel; Martinez-Garcia, Pedro J.; Famula, Randi A.; Delfino-Mix, Annette; Stevens, Kristian A.; Loopstra, Carol A.; Langley, Charles H.; Neale, David B.; Wegrzyn, Jill L.
2016-01-01
Sugar pine (Pinus lambertiana Douglas) is within the subgenus Strobus with an estimated genome size of 31 Gbp. Transcriptomic resources are of particular interest in conifers due to the challenges presented in their megagenomes for gene identification. In this study, we present the first comprehensive survey of the P. lambertiana transcriptome through deep sequencing of a variety of tissue types to generate more than 2.5 billion short reads. Third generation, long reads generated through PacBio Iso-Seq have been included for the first time in conifers to combat the challenges associated with de novo transcriptome assembly. A technology comparison is provided here to contribute to the otherwise scarce comparisons of second and third generation transcriptome sequencing approaches in plant species. In addition, the transcriptome reference was essential for gene model identification and quality assessment in the parallel project responsible for sequencing and assembly of the entire genome. In this study, the transcriptomic data were also used to address questions surrounding lineage-specific Dicer-like proteins in conifers. These proteins play a role in the control of transposable element proliferation and the related genome expansion in conifers. PMID:27799338
Gschloessl, B; Dorkeld, F; Berges, H; Beydon, G; Bouchez, O; Branco, M; Bretaudeau, A; Burban, C; Dubois, E; Gauthier, P; Lhuillier, E; Nichols, J; Nidelet, S; Rocha, S; Sauné, L; Streiff, R; Gautier, M; Kerdelhué, C
2018-05-01
The pine processionary moth Thaumetopoea pityocampa (Lepidoptera: Notodontidae) is the main pine defoliator in the Mediterranean region. Its urticating larvae cause severe human and animal health concerns in the invaded areas. This species shows a high phenotypic variability for various traits, such as phenology, fecundity and tolerance to extreme temperatures. This study presents the construction and analysis of extensive genomic and transcriptomic resources, which are an obligate prerequisite to understand their underlying genetic architecture. Using a well-studied population from Portugal with peculiar phenological characteristics, the karyotype was first determined and a first draft genome of 537 Mb total length was assembled into 68,292 scaffolds (N50 = 164 kb). From this genome assembly, 29,415 coding genes were predicted. To circumvent some limitations for fine-scale physical mapping of genomic regions of interest, a 3X coverage BAC library was also developed. In particular, 11 BACs from this library were individually sequenced to assess the assembly quality. Additionally, de novo transcriptomic resources were generated from various developmental stages sequenced with HiSeq and MiSeq Illumina technologies. The reads were de novo assembled into 62,376 and 63,175 transcripts, respectively. Then, a robust subset of the genome-predicted coding genes, the de novo transcriptome assemblies and previously published 454/Sanger data were clustered to obtain a high-quality and comprehensive reference transcriptome consisting of 29,701 bona fide unigenes. These sequences covered 99% of the CEGMA and 88% of the BUSCO highly conserved eukaryotic genes and 84% of the BUSCO arthropod gene set. Moreover, 90% of these transcripts could be localized on the draft genome. The described information is available via a genome annotation portal (http://bipaa.genouest.org/sp/thaumetopoea_pityocampa/). © 2018 John Wiley & Sons Ltd.
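The N50 quoted above for the draft genome (164 kb) is a length-weighted summary statistic: the scaffold or contig length at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly size. A minimal Python sketch of that calculation follows. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the published analysis.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Lengths are sorted from longest to shortest and accumulated; the N50
    is the length of the sequence at which the running total first reaches
    half of the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Toy example (hypothetical lengths, not real scaffold data).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

Because the statistic is weighted by length, a handful of long scaffolds can give a large N50 even when most scaffolds are short, which is why it is usually reported together with the scaffold count and total assembly length, as in the abstract above.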
An anatomically comprehensive atlas of the adult human brain transcriptome
Guillozet-Bongaarts, Angela L.; Shen, Elaine H.; Ng, Lydia; Miller, Jeremy A.; van de Lagemaat, Louie N.; Smith, Kimberly A.; Ebbert, Amanda; Riley, Zackery L.; Abajian, Chris; Beckmann, Christian F.; Bernard, Amy; Bertagnolli, Darren; Boe, Andrew F.; Cartagena, Preston M.; Chakravarty, M. Mallar; Chapin, Mike; Chong, Jimmy; Dalley, Rachel A.; David Daly, Barry; Dang, Chinh; Datta, Suvro; Dee, Nick; Dolbeare, Tim A.; Faber, Vance; Feng, David; Fowler, David R.; Goldy, Jeff; Gregor, Benjamin W.; Haradon, Zeb; Haynor, David R.; Hohmann, John G.; Horvath, Steve; Howard, Robert E.; Jeromin, Andreas; Jochim, Jayson M.; Kinnunen, Marty; Lau, Christopher; Lazarz, Evan T.; Lee, Changkyu; Lemon, Tracy A.; Li, Ling; Li, Yang; Morris, John A.; Overly, Caroline C.; Parker, Patrick D.; Parry, Sheana E.; Reding, Melissa; Royall, Joshua J.; Schulkin, Jay; Sequeira, Pedro Adolfo; Slaughterbeck, Clifford R.; Smith, Simon C.; Sodt, Andy J.; Sunkin, Susan M.; Swanson, Beryl E.; Vawter, Marquis P.; Williams, Derric; Wohnoutka, Paul; Zielke, H. Ronald; Geschwind, Daniel H.; Hof, Patrick R.; Smith, Stephen M.; Koch, Christof; Grant, Seth G. N.; Jones, Allan R.
2014-01-01
Neuroanatomically precise, genome-wide maps of transcript distributions are critical resources to complement genomic sequence data and to correlate functional and genetic brain architecture. Here we describe the generation and analysis of a transcriptional atlas of the adult human brain, comprising extensive histological analysis and comprehensive microarray profiling of ~900 neuroanatomically precise subdivisions in two individuals. Transcriptional regulation varies enormously by anatomical location, with different regions and their constituent cell types displaying robust molecular signatures that are highly conserved between individuals. Analysis of differential gene expression and gene co-expression relationships demonstrates that brain-wide variation strongly reflects the distributions of major cell classes such as neurons, oligodendrocytes, astrocytes and microglia. Local neighbourhood relationships between fine anatomical subdivisions are associated with discrete neuronal subtypes and genes involved with synaptic transmission. The neocortex displays a relatively homogeneous transcriptional pattern, but with distinct features associated selectively with primary sensorimotor cortices and with enriched frontal lobe expression. Notably, the spatial topography of the neocortex is strongly reflected in its molecular topography: the closer two cortical regions, the more similar their transcriptomes. This freely accessible online data resource forms a high-resolution transcriptional baseline for neurogenetic studies of normal and abnormal human brain function. PMID:22996553
The Spatial and Temporal Transcriptomic Landscapes of Ginseng, Panax ginseng C. A. Meyer.
Wang, Kangyu; Jiang, Shicui; Sun, Chunyu; Lin, Yanping; Yin, Rui; Wang, Yi; Zhang, Meiping
2015-12-11
Ginseng, including Asian ginseng (Panax ginseng C. A. Meyer) and American ginseng (P. quinquefolius L.), is one of the most important medicinal herbs in Asia and North America, but significantly understudied. This study sequenced and characterized the transcriptomes and expression profiles of genes expressed in 14 tissues and four different aged roots of Asian ginseng. A total of 265.2 million 100-bp clean reads were generated using the high-throughput sequencing platform HiSeq 2000, representing >8.3x of the 3.2-Gb ginseng genome. From the sequences, 248,993 unigenes were assembled for whole plant, 61,912-113,456 unigenes for each tissue and 54,444-65,412 unigenes for different year-old roots. We comprehensively analyzed the unigene sets and gene expression profiles. We found that the number of genes allocated to each functional category is stable across tissues or developmental stages, while the expression profiles of different genes of a gene family or involved in ginsenoside biosynthesis dramatically diversified spatially and temporally. These results provide an overall insight into the spatial and temporal transcriptome dynamics and landscapes of Asian ginseng, and comprehensive resources for advanced research and breeding of ginseng and related species.
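The depth figure quoted above follows from simple arithmetic: total sequenced bases (number of reads multiplied by read length) divided by genome size. The short Python sketch below reproduces that calculation from the numbers given in the abstract. The function name `fold_coverage` is a hypothetical helper introduced here for illustration only.

```python
def fold_coverage(n_reads, read_length_bp, genome_size_bp):
    """Return sequencing depth as total sequenced bases over genome size."""
    return n_reads * read_length_bp / genome_size_bp


# Values quoted in the abstract: 265.2 million clean reads of 100 bp
# and a genome size of 3.2 Gb.
depth = fold_coverage(265.2e6, 100, 3.2e9)
print(f"{depth:.2f}x")
```

With these inputs the result is just under 8.3, so the quoted depth is best read as roughly 8.3x. Note that for RNA-seq data this ratio describes sequencing effort relative to genome size, not the coverage of any individual transcript, which depends on expression level.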
Jung, Hyungtaek; Yoon, Byung-Ha; Kim, Woo-Jin; Kim, Dong-Wook; Hurwood, David A; Lyons, Russell E; Salin, Krishna R; Kim, Heui-Soo; Baek, Ilseon; Chand, Vincent; Mather, Peter B
2016-05-07
The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean, is currently the world’s most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely to other freshwater prawn species in the genus Macrobrachium. PMID:27164098
Kang, Seunghyun; Kim, Sanghee; Park, Hyun
2015-12-01
Gondogeneia antarctica is widely distributed off the western Antarctic Peninsula and is a key species in the Antarctic food web. In this study, we performed Illumina sequencing to produce a total of 4,599,079,601 (4.6Gb) nucleotides and a comprehensive transcript dataset for G. antarctica. Over 46 million total reads were assembled into 20,749 contigs, and 12,461 annotated genes were predicted by Blastx. The RNA-seq results after exposure to three pollutants showed that 658, 169 and 367 genes that were potential biomarkers of responses to pollutants for this species were specifically upregulated after exposure to PCBs (Polychlorinated biphenyls), PFOS (Perfluorooctanesulfonic acid) and PFOA (Perfluorooctanoic acid), respectively. These data represent the first transcriptome resource for the Antarctic amphipod G. antarctica and provide a useful resource for studying Antarctic marine species. Copyright © 2015 Elsevier B.V. All rights reserved.
Zhang, Yu-Juan; Hao, Youjin; Si, Fengling; Ren, Shuang; Hu, Ganyu; Shen, Li; Chen, Bin
2014-01-01
The onion maggot Delia antiqua is a major insect pest of cultivated vegetables, especially the onion, and a good model to investigate the molecular mechanisms of diapause. To better understand the biology and diapause mechanism of the insect pest species, D. antiqua, the transcriptome was sequenced using Illumina paired-end sequencing technology. Approximately 54 million reads were obtained, trimmed, and assembled into 29,659 unigenes, with an average length of 607 bp and an N50 of 818 bp. Among these unigenes, 21,605 (72.8%) were annotated in the public databases. All unigenes were then compared against Drosophila melanogaster and Anopheles gambiae. Codon usage bias was analyzed and 332 simple sequence repeats (SSRs) were detected in this organism. These data represent the most comprehensive transcriptomic resource currently available for D. antiqua and will facilitate the study of genetics, genomics, diapause, and further pest control of D. antiqua. PMID:24615268
Hyun, Tae Kyung; Lee, Sarah; Kumar, Dhinesh; Rim, Yeonggil; Kumar, Ritesh; Lee, Sang Yeol; Lee, Choong Hwan; Kim, Jae-Yean
2014-10-01
Using Illumina sequencing technology, we have generated large-scale transcriptome sequencing data containing abundant information on genes involved in the metabolic pathways in R. idaeus cv. Nova fruits. Rubus idaeus (red raspberry) is an economically important crop that is rich in nutrients, micronutrients and phytochemicals with essential health benefits for humans. The molecular mechanism underlying the ripening process and phytochemical biosynthesis in red raspberry is attributed to changes in gene expression, but very limited transcriptomic and genomic information is available in public databases. To address this issue, we generated more than 51 million sequencing reads from R. idaeus cv. Nova fruit using Illumina RNA-Seq technology. After de novo assembly, we obtained 42,604 unigenes with an average length of 812 bp. At the protein level, the Nova fruit transcriptome showed 77% and 68% sequence similarity with Rubus coreanus and Fragaria vesca, respectively, indicating the evolutionary relationship between them. In addition, 69% of assembled unigenes were annotated using public databases including the NCBI non-redundant, Cluster of Orthologous Groups and Gene Ontology databases, suggesting that our transcriptome dataset provides a valuable resource for investigating metabolic processes in red raspberry. To analyze the relationship between several novel transcripts and the amounts of metabolites such as γ-aminobutyric acid and anthocyanins, real-time PCR and target metabolite analysis were performed on two different ripening stages of Nova. This is the first attempt to use the Illumina sequencing platform for RNA sequencing and de novo assembly of Nova fruit without a reference genome. Our data provide the most comprehensive transcriptome resource available for Rubus fruits, and will be useful for understanding the ripening process and for breeding R. idaeus cultivars with improved fruit quality.
De novo assembly of maritime pine transcriptome: implications for forest breeding and biotechnology.
Canales, Javier; Bautista, Rocio; Label, Philippe; Gómez-Maldonado, Josefa; Lesur, Isabelle; Fernández-Pozo, Noe; Rueda-López, Marina; Guerrero-Fernández, Dario; Castro-Rodríguez, Vanessa; Benzekri, Hicham; Cañas, Rafael A; Guevara, María-Angeles; Rodrigues, Andreia; Seoane, Pedro; Teyssier, Caroline; Morel, Alexandre; Ehrenmann, François; Le Provost, Grégoire; Lalanne, Céline; Noirot, Céline; Klopp, Christophe; Reymond, Isabelle; García-Gutiérrez, Angel; Trontin, Jean-François; Lelu-Walter, Marie-Anne; Miguel, Celia; Cervera, María Teresa; Cantón, Francisco R; Plomion, Christophe; Harvengt, Luc; Avila, Concepción; Gonzalo Claros, M; Cánovas, Francisco M
2014-04-01
Maritime pine (Pinus pinaster Ait.) is a widely distributed conifer species in Southwestern Europe and one of the most advanced models for conifer research. In the current work, comprehensive characterization of the maritime pine transcriptome was performed using a combination of two different next-generation sequencing platforms, 454 and Illumina. De novo assembly of the transcriptome provided a catalogue of 26 020 unique transcripts in maritime pine trees and a collection of 9641 full-length cDNAs. Quality of the transcriptome assembly was validated by RT-PCR amplification of selected transcripts for structural and regulatory genes. Transcription factors and enzyme-encoding transcripts were annotated. Furthermore, the available sequencing data permitted the identification of polymorphisms and the establishment of robust single nucleotide polymorphism (SNP) and simple-sequence repeat (SSR) databases for genotyping applications and integration of translational genomics in maritime pine breeding programmes. All our data are freely available at SustainpineDB, the P. pinaster expressional database. Results reported here on the maritime pine transcriptome represent a valuable resource for future basic and applied studies on this ecologically and economically important pine species. © 2013 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
Dutta, Sutapa; Kumawat, Giriraj; Singh, Bikram P; Gupta, Deepak K; Singh, Sangeeta; Dogra, Vivek; Gaikwad, Kishor; Sharma, Tilak R; Raje, Ranjeet S; Bandhopadhya, Tapas K; Datta, Subhojit; Singh, Mahendra N; Bashasab, Fakrudin; Kulwal, Pawan; Wanjari, K B; K Varshney, Rajeev; Cook, Douglas R; Singh, Nagendra K
2011-01-20
Background Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
Conclusion We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea. PMID:21251263
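The SSR screen described above (di- to hexanucleotide motifs, homopolymers excluded, repeat length of at least 18 bp for primer testing) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the software or thresholds used for detection, so the function name `find_ssrs`, the `min_len` parameter and the regex-based scan for perfect repeats are all assumptions made for illustration. Compound repeats are not handled here.

```python
import re

# Hypothetical helper (not from the paper): scan one sequence for perfect
# di- to hexanucleotide repeats whose total length is at least min_len bp.
# min_len=18 mirrors the ">= 18 bp" cutoff mentioned for primer testing.
def find_ssrs(seq, min_len=18):
    seq = seq.upper()
    hits = []
    for unit in range(2, 7):
        min_copies = -(-min_len // unit)  # ceiling division
        # one captured motif followed by at least (min_copies - 1) more copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip homopolymers ("AA") and motifs that are themselves repeats
            # of a shorter unit ("ATAT"), so each locus is reported once.
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return hits
```

Each hit is a tuple of (0-based start position, motif, copy number); tallying hits by motif length would give a breakdown comparable in kind to the di-/tri-/tetra-/penta-/hexanucleotide frequencies reported above.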
Comprehensive discovery of noncoding RNAs in acute myeloid leukemia cell transcriptomes.
Zhang, Jin; Griffith, Malachi; Miller, Christopher A; Griffith, Obi L; Spencer, David H; Walker, Jason R; Magrini, Vincent; McGrath, Sean D; Ly, Amy; Helton, Nichole M; Trissal, Maria; Link, Daniel C; Dang, Ha X; Larson, David E; Kulkarni, Shashikant; Cordes, Matthew G; Fronick, Catrina C; Fulton, Robert S; Klco, Jeffery M; Mardis, Elaine R; Ley, Timothy J; Wilson, Richard K; Maher, Christopher A
2017-11-01
To detect diverse and novel RNA species comprehensively, we compared deep small RNA and RNA sequencing (RNA-seq) methods applied to a primary acute myeloid leukemia (AML) sample. We were able to discover previously unannotated small RNAs using deep sequencing of a library method using broader insert size selection. We analyzed the long noncoding RNA (lncRNA) landscape in AML by comparing deep sequencing from multiple RNA-seq library construction methods for the sample that we studied and then integrating RNA-seq data from 179 AML cases. This identified lncRNAs that are completely novel, differentially expressed, and associated with specific AML subtypes. Our study revealed the complexity of the noncoding RNA transcriptome through a combined strategy of strand-specific small RNA and total RNA-seq. This dataset will serve as an invaluable resource for future RNA-based analyses. Copyright © 2017 ISEH – Society for Hematology and Stem Cells. Published by Elsevier Inc. All rights reserved.
2012-01-01
Background Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, despite the high demand for roses, rose breeding programs lack the molecular resources that could greatly enhance and speed breeding efforts. A better understanding of the genes that contribute to floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database that expands current knowledge of the regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. Results We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: ‘Vital’, ‘Maroussia’, and ‘Sympathy’ and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among the identified miRNAs, 25 were novel and 242 were conserved. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. Conclusions In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. 
The database provides a comprehensive genetic resource which can be used to better understand rose flower development and to identify candidate genes for important phenotypes. PMID:23171001
Kim, Jungeun; Park, June Hyun; Lim, Chan Ju; Lim, Jae Yun; Ryu, Jee-Youn; Lee, Bong-Woo; Choi, Jae-Pil; Kim, Woong Bom; Lee, Ha Yeon; Choi, Yourim; Kim, Donghyun; Hur, Cheol-Goo; Kim, Sukweon; Noh, Yoo-Sun; Shin, Chanseok; Kwon, Suk-Yoon
2012-11-21
Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, despite the high demand for roses, rose breeding programs lack the molecular resources that could greatly enhance and speed breeding efforts. A better understanding of the genes that contribute to floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database that expands current knowledge of the regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: 'Vital', 'Maroussia', and 'Sympathy' and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among the identified miRNAs, 25 were novel and 242 were conserved. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. 
The database provides a comprehensive genetic resource which can be used to better understand rose flower development and to identify candidate genes for important phenotypes.
Genome-Wide Transcriptome and Expression Profile Analysis of Phalaenopsis during Explant Browning
Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei
2015-01-01
Background Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at genome-wide level. Methodology/Principal Findings We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Conclusions/Significance Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect the secondary metabolism, such as: phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity have been found to be changed at transcription level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. 
Our data constitute an important resource for further functional studies to prevent explant browning. PMID:25874455
CRCDA—Comprehensive resources for cancer NGS data analysis
Thangam, Manonanthini; Gopal, Ramesh Kumar
2015-01-01
Next generation sequencing (NGS) innovations have set a landmark in the life sciences and changed the direction of research in clinical oncology through their capacity to help diagnose and treat cancer. The aim of our portal, comprehensive resources for cancer NGS data analysis (CRCDA), is to provide a collection of NGS tools and pipelines under diverse classes, together with cancer pathways, databases and literature information from PubMed. The literature data were restricted to the 18 most common cancer types worldwide, such as breast cancer and colon cancer. For convenience, NGS-cancer tools have been categorized into cancer genomics, cancer transcriptomics, cancer epigenomics, quality control and visualization. Pipelines for variant detection, quality control and data analysis are listed to provide out-of-the-box solutions for NGS data analysis, which may help researchers to overcome challenges in selecting and configuring individual tools for analysing exome, whole genome and transcriptome data. An extensive search page was developed that can be queried by (i) type of data [literature, gene data and sequence read archive (SRA) data] and (ii) type of cancer (selected based on global incidence and accessibility of data). For each category of analysis, a variety of tools is available, and the biggest challenge is finding and using the right tool for the right application. The objective of this work is to collect the tools in each category, which are scattered across various places, and to arrange them and other data in a simple and user-friendly manner so that biologists and oncologists can find information more easily. To the best of our knowledge, we have collected and presented a comprehensive package of most of the resources available for NGS data analysis in cancer. Given these factors, we believe that this website will be a useful resource to the NGS research community working on cancer. 
Database URL: http://bioinfo.au-kbc.org.in/ngs/ngshome.html. PMID:26450948
Pairett, Autum N.; Serb, Jeanne M.
2013-01-01
Background The eye has evolved across 13 separate lineages of molluscs. Yet, there have been very few studies examining the molecular machinery underlying eye function in this group, which is due, in part, to a lack of genomic resources. The scallop (Bivalvia: Pectinidae) represents a compelling molluscan model to study photoreception due to its morphologically novel and separately evolved mirror-type eye. We sequenced the adult eye transcriptome of two scallop species to: 1) identify the phototransduction pathway components; 2) identify any additional light detection functions; and 3) test the hypothesis that molluscs possess genes not found in other animal lineages. Results A total of 3,039 contigs from the bay scallop, Argopecten irradians, and 26,395 contigs from the sea scallop, Placopecten magellanicus, were produced by 454 sequencing. Targeted BLAST searches and functional annotation using Gene Ontology (GO) terms and KEGG pathways identified transcripts from three light detection systems: two phototransduction pathways and the circadian clock, a previously unrecognized function of the scallop eye. By comparing the scallop transcriptomes to molluscan and non-molluscan genomes, we discovered that a large proportion of the transcripts (7,776 sequences) may be specific to the scallop lineage. Nearly one-third of these contain transmembrane protein domains, suggesting these unannotated transcripts may be sensory receptors. Conclusions Our data provide the most comprehensive transcriptomic resource currently available from a single molluscan eye type. Candidate genes potentially involved in sensory reception were identified, and are worthy of further investigation. This resource, combined with recent phylogenetic and genomic data, provides a strong foundation for future investigations of the function and evolution of molluscan photosensory systems in this morphologically and taxonomically diverse phylum. PMID:23922823
Kamphuis, Lars G; Hane, James K; Nelson, Matthew N; Gao, Lingling; Atkins, Craig A; Singh, Karam B
2015-01-01
Narrow-leafed lupin (NLL; Lupinus angustifolius L.) is an important grain legume crop that is valuable for sustainable farming and is becoming recognized as a human health food. NLL breeding is directed at improving grain production, disease resistance, drought tolerance and health benefits. However, genetic and genomic studies have been hindered by a lack of extensive genomic resources for the species. Here, the generation, de novo assembly and annotation of transcriptome datasets derived from five different NLL tissue types of the reference accession cv. Tanjil are described. The Tanjil transcriptome was compared to transcriptomes of an early domesticated cv. Unicrop, a wild accession P27255, as well as accession 83A:476, together being the founding parents of two recombinant inbred line (RIL) populations. In silico predictions for transcriptome-derived gene-based length and SNP polymorphic markers were conducted and corroborated using a survey assembly sequence for NLL cv. Tanjil. This yielded extensive indel and SNP polymorphic markers for the two RIL populations. A total of 335 transcriptome-derived markers and 66 BAC-end sequence-derived markers were evaluated, and 275 polymorphic markers were selected to genotype the reference NLL 83A:476 × P27255 RIL population. This significantly improved the completeness, marker density and quality of the reference NLL genetic map. PMID:25060816
Devi, Kamalakshi; Mishra, Surajit K; Sahu, Jagajjit; Panda, Debashis; Modi, Mahendra K; Sen, Priyabrata
2016-02-15
Advances in transcriptome sequencing provide a fast, cost-effective and reliable approach to generating large expression datasets, especially suitable for non-model species, to identify putative genes, key pathways and regulatory mechanisms. Citronella (Cymbopogon winterianus) is an aromatic medicinal grass used for its anti-tumoral, antibacterial, anti-fungal, antiviral, detoxifying and natural insect repellent properties. Despite its many uses, the genes involved in the terpene biosynthetic pathway have not yet been clearly elucidated. The present study is a pioneering attempt to generate exhaustive molecular information on the secondary metabolite pathway and to increase genomic resources in Citronella. Using high-throughput RNA-Seq technology, the root and leaf transcriptomes were analysed at an unprecedented depth (11.7 Gb). Targeted searches identified the majority of the genes associated with metabolic pathways and other natural product pathways, viz. antibiotic synthesis, along with many novel genes. Comparative expression results for terpenoid biosynthesis genes were validated for 15 unigenes by RT-PCR and qRT-PCR. Thus the coverage of this transcriptome is comprehensive enough to discover all known genes of major metabolic pathways. This transcriptome dataset can serve as important public information for gene expression, genomics and functional genomics studies in Citronella and shall act as a benchmark for future improvement of the crop.
Zhang, Yu-Juan; Hao, Youjin; Si, Fengling; Ren, Shuang; Hu, Ganyu; Shen, Li; Chen, Bin
2014-03-10
The onion maggot Delia antiqua is a major insect pest of cultivated vegetables, especially the onion, and a good model to investigate the molecular mechanisms of diapause. To better understand the biology and diapause mechanism of the insect pest species, D. antiqua, the transcriptome was sequenced using Illumina paired-end sequencing technology. Approximately 54 million reads were obtained, trimmed, and assembled into 29,659 unigenes, with an average length of 607 bp and an N50 of 818 bp. Among these unigenes, 21,605 (72.8%) were annotated in the public databases. All unigenes were then compared against Drosophila melanogaster and Anopheles gambiae. Codon usage bias was analyzed and 332 simple sequence repeats (SSRs) were detected in this organism. These data represent the most comprehensive transcriptomic resource currently available for D. antiqua and will facilitate the study of genetics, genomics, diapause, and further pest control of D. antiqua. Copyright © 2014 Zhang et al.
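Assembly summaries like the one above quote an N50 alongside the mean length. As a reminder of what that statistic measures, here is a minimal sketch using the common definition: the length L such that contigs of length at least L contain at least half of the assembled bases. The function name `n50` is illustrative, and the abstract does not state which tool computed the reported value.

```python
# Minimal N50 sketch under the usual definition; not tied to any assembler.
def n50(lengths):
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")

# The annotated fraction quoted in the abstract can be checked the same way:
annotated_pct = round(21605 / 29659 * 100, 1)  # unigenes with a database hit
```

Because N50 is weighted by sequence length, it is typically larger than the mean contig length (here 818 bp versus 607 bp), which is why both figures are usually reported together.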
Xu, Jiajia; Li, Yuanyuan; Ma, Xiuling; Ding, Jianfeng; Wang, Kai; Wang, Sisi; Tian, Ye; Zhang, Hui; Zhu, Xin-Guang
2013-09-01
Setaria viridis is an emerging model species for genetic studies of C4 photosynthesis. Many basic molecular resources still need to be developed to support work on this species. In this paper, we performed a comprehensive transcriptome analysis of multiple developmental stages and tissues of S. viridis using next-generation sequencing technologies. Sequencing of the transcriptome from multiple tissues across three developmental stages (seed germination, vegetative growth, and reproduction) yielded a total of 71 million single-end 100 bp reads. Reference-based assembly using the Setaria italica genome as a reference generated 42,754 transcripts. De novo assembly generated 60,751 transcripts. In addition, 9,576 and 7,056 potential simple sequence repeats (SSRs) covering the S. viridis genome were identified using the reference-based assembled transcripts and the de novo assembled transcripts, respectively. The transcripts and SSRs identified in this study can be used for both reverse and forward genetic studies in S. viridis.
AmphiBase: A new genomic resource for non-model amphibian species.
Kwon, Taejoon
2017-01-01
More than five thousand genes annotated in the recently published Xenopus laevis and Xenopus tropicalis genomes do not have a candidate orthologous counterpart in other vertebrate species. To determine whether these sequences represent genuine amphibian-specific genes or annotation errors, it is necessary to analyze them alongside sequences from other amphibian species. However, due to large genome sizes and an abundance of repeat sequences, there are limited numbers of gene sequences available from amphibian species other than Xenopus. AmphiBase is a new genomic resource covering non-model amphibian species, based on public domain transcriptome data and computational methods developed during the X. laevis genome project. Here, I review the current status of AmphiBase, including amphibian species with available transcriptome data or biological samples, and describe the challenges of building a comprehensive amphibian genomic resource in the absence of genomes. This mini-review will be informative for researchers interested in functional genomic experiments using amphibian model organisms, such as Xenopus and axolotl, and will assist in interpretation of results implicating "orphan genes." Additionally, this study highlights an opportunity for researchers working on non-model amphibian species to collaborate in their future efforts and develop amphibian genomic resources as a community. © 2017 Wiley Periodicals, Inc.
Kudapa, Himabindu; Bharti, Arvind K; Cannon, Steven B; Farmer, Andrew D; Mulaosmanovic, Benjamin; Kramer, Robin; Bohra, Abhishek; Weeks, Nathan T; Crow, John A; Tuteja, Reetu; Shah, Trushar; Dutta, Sutapa; Gupta, Deepak K; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R; May, Gregory D; Singh, Nagendra K; Varshney, Rajeev K
2012-09-01
A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18 353 Sanger expressed sequence tags from more than 16 genotypes. The resultant transcriptome assembly, referred to as CcTA v2, comprised 21 434 transcript assembly contigs (TACs) with an N50 of 1510 bp, the largest one being ~8 kb. Of the 21 434 TACs, 16 622 (77.5%) could be mapped on to the soybean genome build 1.0.9 under fairly stringent alignment parameters. Based on knowledge of intron junctions, 10 009 primer pairs were designed from 5033 TACs for amplifying intron spanning regions (ISRs). By using in silico mapping of BAC-end-derived SSR loci of pigeonpea on the soybean genome as a reference, putative mapping positions at the chromosome level were predicted for 6284 ISR markers, covering all 11 pigeonpea chromosomes. A subset of 128 ISR markers was analyzed on a set of eight genotypes. While 116 markers were validated, 70 markers showed one to three alleles, with an average polymorphism information content (PIC) value of 0.16. In summary, the CcTA v2 transcript assembly and ISR markers will serve as a useful resource to accelerate genetic research and breeding applications in pigeonpea.
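The polymorphism information content (PIC) quoted above is computed per marker from its allele frequencies. A minimal sketch follows, assuming the classical Botstein-style definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The abstract does not say which formula the authors applied (some marker studies report the simpler 1 - Σ p_i² instead), and the function name `pic` is illustrative.

```python
from itertools import combinations

# PIC for one marker from its allele frequencies (assumed to sum to 1).
# Assumption: the classical definition with the pairwise correction term.
def pic(allele_freqs):
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    squares = [p * p for p in allele_freqs]
    # sum over unordered allele pairs i < j of 2 * p_i^2 * p_j^2
    cross = sum(2 * a * b for a, b in combinations(squares, 2))
    return 1.0 - sum(squares) - cross
```

A monomorphic marker (one allele) scores 0 under this definition, and a marker with two equally frequent alleles scores 0.375, so a low average PIC is what one would expect when many markers show only one to three alleles across a small genotype panel.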
Optimization of De Novo Short Read Assembly of Seabuckthorn (Hippophae rhamnoides L.) Transcriptome
Ghangal, Rajesh; Chaudhary, Saurabh; Jain, Mukesh; Purty, Ram Singh; Chand Sharma, Prakash
2013-01-01
Seabuckthorn (Hippophae rhamnoides L.) has been known for its medicinal, nutritional and environmental importance since ancient times. However, very limited efforts have been made to characterize the genome and transcriptome of this wonder plant. Here, we report the use of next generation massively parallel sequencing technology (Illumina platform) and de novo assembly to gain a comprehensive view of the seabuckthorn transcriptome. We assembled 86,253,874 high quality short reads using six assembly tools. In our hands, assembly of non-redundant short reads following a two-step procedure was found to be the best considering various assembly quality parameters. Initially, the ABySS tool was used following an additive k-mer approach. The assembled transcripts were subsequently subjected to the TGICL suite. Finally, de novo short read assembly yielded 88,297 transcripts (> 100 bp), representing about 53 Mb of the seabuckthorn transcriptome. The average length of transcripts was 610 bp, the N50 length was 1198 bp, and 91% of the short reads uniquely mapped back to the seabuckthorn transcriptome. A total of 41,340 (46.8%) transcripts showed significant similarity with sequences present in the nr protein databases of NCBI (E-value < 1E-06). We also screened the assembled transcripts for the presence of transcription factors and simple sequence repeats. Our strategy involving the use of a short read assembler (ABySS) followed by TGICL will be useful for researchers working with a non-model organism’s transcriptome in terms of saving time and reducing complexity in data management. The seabuckthorn transcriptome data generated here provide a valuable resource for gene discovery and development of functional molecular markers. PMID:23991119
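The similarity criterion above (E-value < 1E-06 against the NCBI nr protein database) amounts to a simple filter over search results. The sketch below assumes the hits are available in BLAST tabular output (format 6), where the query ID is the first column and the E-value the eleventh; the abstract does not specify the output format the authors used, and the function name `annotated_queries` is hypothetical.

```python
# Hypothetical helper: collect query IDs with at least one hit below the
# E-value cutoff. Assumes tab-separated BLAST output in the default
# format-6 column order (qseqid ... evalue bitscore).
def annotated_queries(tabular_lines, max_evalue=1e-6):
    hits = set()
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue < max_evalue:
            hits.add(query_id)
    return hits

# Fraction of transcripts with a significant hit, from the reported counts:
similar_pct = round(41340 / 88297 * 100, 1)
```

Counting distinct query IDs rather than hit lines matters here, because a single transcript usually matches several database entries.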
The developmental transcriptome atlas of the spoon worm Urechis unicinctus (Echiurida: Annelida).
Park, Chungoo; Han, Yong-Hee; Lee, Sung-Gwon; Ry, Kyoung-Bin; Oh, Jooseong; Kern, Elizabeth M A; Park, Joong-Ki; Cho, Sung-Jin
2018-03-01
Echiurida is one of the most intriguing major subgroups of Annelida because, unlike most other annelids, echiurids lack metameric body segmentation as adults. For this reason, transcriptome analyses from various developmental stages of echiurid species can be of substantial value for understanding precise expression levels and the complex regulatory networks during early and larval development. A total of 914 million raw RNA-Seq reads were produced from 14 developmental stages of Urechis unicinctus and were de novo assembled into contigs spanning 63,928,225 bp with an N50 length of 2700 bp. The resulting comprehensive transcriptome database of the early developmental stages of U. unicinctus consists of 20,305 representative functional protein-coding transcripts. Approximately 66% of unigenes were assigned to superphylum-level taxa, including Lophotrochozoa (40%). The completeness of the transcriptome assembly was assessed using benchmarking universal single-copy orthologs; 75.7% of the single-copy orthologs were present in our transcriptome database. We observed 3 distinct patterns of global transcriptome profiles from 14 developmental stages and identified 12,705 genes that showed dynamic regulation patterns during the differentiation and maturation of U. unicinctus cells. We present the first large-scale developmental transcriptome dataset of U. unicinctus and provide a general overview of the dynamics of global gene expression changes during its early developmental stages. The analysis of time-course gene expression data is a first step toward understanding the complex developmental gene regulatory networks in U. unicinctus and will furnish a valuable resource for analyzing the functions of gene repertoires in various developmental phases.
Thanh, Nguyen Minh; Jung, Hyungtaek; Lyons, Russell E; Njaci, Isaac; Yoon, Byoung-Ha; Chand, Vincent; Tuan, Nguyen Viet; Thu, Vo Thi Minh; Mather, Peter
2015-10-01
Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry, however, is facing a significant challenge from saltwater intrusion into many low-lying coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. We obtained 12,177,770 reads after trimming and processing, with an average length of 97 bp. De novo assemblies were generated using CLC Genomics Workbench, Trinity and Velvet/Oases, with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478 bp and an N50 length of 506 bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species. Copyright © 2015 Elsevier B.V. All rights reserved.
2012-01-01
Background We present a comprehensive transcriptome analysis of the fungus Ascosphaera apis, an economically important pathogen of the Western honey bee (Apis mellifera) that causes chalkbrood disease. Our goals were to further annotate the A. apis reference genome and to identify genes that are candidates for being differentially expressed during host infection versus axenic culture. Results We compared A. apis transcriptome sequence from mycelia grown on liquid or solid media with that dissected from host-infected tissue. 454 pyrosequencing provided 252 Mb of filtered sequence reads from both culture types that were assembled into 10,087 contigs. Transcript contigs, protein sequences from multiple fungal species, and ab initio gene predictions were included as evidence sources in the Maker gene prediction pipeline, resulting in 6,992 consensus gene models. A phylogeny based on 12 of these protein-coding loci further supported the taxonomic placement of Ascosphaera as sister to the core Onygenales. Several common protein domains were less abundant in A. apis compared with related ascomycete genomes, particularly cytochrome p450 and protein kinase domains. A novel gene family was identified that has expanded in some ascomycete lineages, but not others. We manually annotated genes with homologs in other fungal genomes that have known relevance to fungal virulence and life history. Functional categories of interest included genes involved in mating-type specification, intracellular signal transduction, and stress response. Computational and manual annotations have been made publicly available on the Bee Pests and Pathogens website. Conclusions This comprehensive transcriptome analysis substantially enhances our understanding of the A. apis genome and its expression during infection of honey bee larvae. It also provides resources for future molecular studies of chalkbrood disease and ultimately improved disease management. PMID:22747707
Barling, Adam; Swaminathan, Kankshita; Mitros, Therese; James, Brandon T; Morris, Juliette; Ngamboma, Ornella; Hall, Megan C; Kirkpatrick, Jessica; Alabady, Magdy; Spence, Ashley K; Hudson, Matthew E; Rokhsar, Daniel S; Moose, Stephen P
2013-12-09
The Miscanthus genus of perennial C4 grasses contains promising biofuel crops for temperate climates. However, few genomic resources exist for Miscanthus, which limits understanding of its interesting biology and future genetic improvement. A comprehensive catalog of expressed sequences was generated from a variety of Miscanthus species and tissue types, with an emphasis on characterizing gene expression changes in spring compared to fall rhizomes. Illumina short read sequencing technology was used to produce transcriptome sequences from different tissues and organs during distinct developmental stages for multiple Miscanthus species, including Miscanthus sinensis, Miscanthus sacchariflorus, and their interspecific hybrid Miscanthus × giganteus. More than fifty billion base-pairs of Miscanthus transcript sequence were produced. Overall, 26,230 Sorghum gene models (i.e., ~96% of predicted Sorghum genes) had at least five Miscanthus reads mapped to them, suggesting that a large portion of the Miscanthus transcriptome is represented in this dataset. The Miscanthus × giganteus data were used to identify genes preferentially expressed in a single tissue, such as the spring rhizome, using Sorghum bicolor as a reference. Quantitative real-time PCR was used to verify examples of preferential expression predicted via RNA-Seq. Contiguous consensus transcript sequences were assembled for each species and annotated using InterProScan. Sequences from the assembled transcriptome were used to amplify genomic segments from a doubled haploid Miscanthus sinensis and from Miscanthus × giganteus to further disentangle the allelic and paralogous variations in genes. This large expressed sequence tag collection creates a valuable resource for the study of Miscanthus biology by providing detailed gene sequence information and tissue-preferred expression patterns.
We have successfully generated a database of transcriptome assemblies and demonstrated its use in the study of genes of interest. Analysis of gene expression profiles revealed biological pathways that exhibit altered regulation in spring compared to fall rhizomes, which is consistent with their different physiological functions. The expression profiles of the subterranean rhizome provide a better understanding of the biological activities of the underground stem structures that are essential for perenniality and the storage or remobilization of carbon and nutrient resources.
A rat RNA-Seq transcriptomic BodyMap across 11 organs and 4 developmental stages
Yu, Ying; Fuscoe, James C.; Zhao, Chen; Guo, Chao; Jia, Meiwen; Qing, Tao; Bannon, Desmond I.; Lancashire, Lee; Bao, Wenjun; Du, Tingting; Luo, Heng; Su, Zhenqiang; Jones, Wendell D.; Moland, Carrie L.; Branham, William S.; Qian, Feng; Ning, Baitang; Li, Yan; Hong, Huixiao; Guo, Lei; Mei, Nan; Shi, Tieliu; Wang, Kevin Y.; Wolfinger, Russell D.; Nikolsky, Yuri; Walker, Stephen J.; Duerksen-Hughes, Penelope; Mason, Christopher E.; Tong, Weida; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Shi, Leming; Wang, Charles
2014-01-01
The rat has been used extensively as a model for evaluating chemical toxicities and for understanding drug mechanisms. However, its transcriptome across multiple organs, or developmental stages, has not yet been reported. Here we show, as part of the SEQC consortium efforts, a comprehensive rat transcriptomic BodyMap created by performing RNA-Seq on 320 samples from 11 organs of both sexes of juvenile, adolescent, adult and aged Fischer 344 rats. We catalogue the expression profiles of 40,064 genes, 65,167 transcripts, 31,909 alternatively spliced transcript variants and 2,367 non-coding genes/non-coding RNAs (ncRNAs) annotated in AceView. We find that organ-enriched, differentially expressed genes reflect the known organ-specific biological activities. A large number of transcripts show organ-specific, age-dependent or sex-specific differential expression patterns. We create a web-based, open-access rat BodyMap database of expression profiles with crosslinks to other widely used databases, anticipating that it will serve as a primary resource for biomedical research using the rat model. PMID:24510058
Floral gene resources from basal angiosperms for comparative genomics research
Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H
2005-01-01
Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. 
These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways. PMID:15799777
Cardiac Endothelial Cell Transcriptome.
Lother, Achim; Bergemann, Stella; Deng, Lisa; Moser, Martin; Bode, Christoph; Hein, Lutz
2018-03-01
Endothelial cells (ECs) are a highly specialized cell type with marked diversity between different organs or vascular beds. Cardiac ECs are an important player in cardiac physiology and pathophysiology but are not sufficiently characterized yet. Thus, the aim of the present study was to analyze the cardiac EC transcriptome. We applied fluorescence-assisted cell sorting to isolate pure ECs from adult mouse hearts. RNAseq revealed 1288 genes predominantly expressed in cardiac ECs versus heart tissue, including several transcription factors. We found an overrepresentation of corresponding transcription factor binding motifs within the promoter region of EC-enriched genes, suggesting that they control the EC transcriptome. Cardiac ECs exhibit a distinct gene expression profile when compared with renal, cerebral, or pulmonary ECs. For example, the Meox2/Tcf15, Fabp4, and Cd36 signaling cascade, a key regulator of fatty acid uptake that is involved in the development of atherosclerosis, was more highly expressed in cardiac ECs. The results from this study provide a comprehensive resource of gene expression and transcriptional control in cardiac ECs. The cardiac EC transcriptome exhibits distinct differences in gene expression compared with other cardiac cell types and ECs from other organs. We identified new candidate genes that have not been investigated in ECs yet as promising targets for future evaluation. © 2018 American Heart Association, Inc.
Bian, Hai-Xu; Ma, Hong-Fang; Zheng, Xi-Xi; Peng, Ming-Hui; Li, Yu-Ping; Su, Jun-Fang; Wang, Huan; Li, Qun; Xia, Run-Xi; Liu, Yan-Qun; Jiang, Xing-Fu
2017-05-24
The oriental armyworm Mythimna separata is an economically important insect with a wide distribution and strong migratory activity. However, knowledge about the molecular mechanisms regulating the physiological and behavioural responses of the oriental armyworm is scarce. In the present study, we took a transcriptomic approach to characterize the gene network in the adult head of M. separata. The sequencing and de novo assembly yielded 63,499 transcripts, which were further assembled into 46,459 unigenes with an N50 of 1,153 bp. In the head transcriptome data, unigenes involved in the 'signal transduction mechanism' are the most abundant. In total, 937 signal transduction unigenes were assigned to 22 signalling pathways. The circadian clock, melanin synthesis, and non-receptor protein of olfactory gene families were then identified, and phylogenetic analyses were performed with these M. separata genes, the model insect Bombyx mori and other insects. Furthermore, 1,372 simple sequence repeats of 2-6 bp in unit length were identified. The transcriptome data represent a comprehensive molecular resource for the adult head of M. separata, and these identified genes can be valid targets for further gene function research to address the molecular mechanisms regulating the migratory and olfaction genes of the oriental armyworm.
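The kind of simple sequence repeat (SSR) scan mentioned above, for repeat units of 2-6 bp, can be sketched with a regular expression. This is only an illustration under stated assumptions: the abstract does not name the tool or thresholds used, so the minimum of four tandem copies, the function name, and the example sequence are all hypothetical.

```python
import re

# Unit of 2-6 bases (shortest unit tried first), followed by at least three
# further copies of the same unit, i.e. four or more tandem copies in total.
# The four-copy minimum is an assumption chosen for this sketch.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")

def find_ssrs(sequence):
    """Return (start, unit, copies) for each tandem repeat found."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# Invented sequence containing five tandem copies of "AC".
print(find_ssrs("TTGACACACACACGGT"))
```

For the invented sequence the scan reports one repeat, starting at index 3, with unit AC present in five copies. Dedicated SSR finders typically apply different minimum copy numbers per unit length, which this sketch does not attempt.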
Quantitative developmental transcriptomes of the Mediterranean sea urchin Paracentrotus lividus.
Gildor, Tsvia; Malik, Assaf; Sher, Noa; Avraham, Linor; Ben-Tabou de-Leon, Smadar
2016-02-01
Embryonic development progresses through the timely activation of thousands of differentially activated genes. Quantitative developmental transcriptomes provide the means to relate global patterns of differentially expressed genes to the emerging body plans they generate. The sea urchin is one of the classic model systems for embryogenesis and the models of its developmental gene regulatory networks are among the most comprehensive of their kind. Thus, the sea urchin embryo is an excellent system for studies of its global developmental transcriptional profiles. Here we produced quantitative developmental transcriptomes of the sea urchin Paracentrotus lividus (P. lividus) at seven developmental stages from the fertilized egg to prism stage. We generated a de novo reference transcriptome and identified 29,817 genes that are expressed during this time period. We annotated and quantified gene expression at the different developmental stages and confirmed the reliability of the expression profiles by QPCR measurement of a subset of genes. The progression of embryo development is reflected in the observed global expression patterns and in our principal component analysis. Our study illuminates the rich patterns of gene expression that participate in sea urchin embryogenesis and provides an essential resource for further studies of the dynamic expression of P. lividus genes. Copyright © 2015 Elsevier B.V. All rights reserved.
Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin
2011-01-01
The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
Yang, Qing; Sun, Fanyue; Yang, Zhi; Li, Hongjun
2014-01-01
Calanus sinicus Brodsky (Copepoda, Crustacea) is a dominant zooplanktonic species widely distributed in the margin seas of the Northwest Pacific Ocean. In this study, we utilized an RNA-Seq-based approach to develop molecular resources for C. sinicus. Adult samples were sequenced using the Illumina HiSeq 2000 platform. The sequencing data generated 69,751 contigs from 58.9 million filtered reads. The assembled contigs had an average length of 928.8 bp. Gene annotation allowed the identification of 43,417 unigene hits against the NCBI database. Gene ontology (GO) and KEGG pathway mapping analysis revealed various functional genes related to diverse biological functions and processes. Transcripts potentially involved in stress response and lipid metabolism were identified among these genes. Furthermore, 4,871 microsatellites and 110,137 single nucleotide polymorphisms (SNPs) were identified in the C. sinicus transcriptome sequences. SNP validation by the melting temperature (Tm)-shift method suggested that 16 primer pairs amplified target products and showed biallelic polymorphism among 30 individuals. The present work demonstrates the power of Illumina-based RNA-Seq for the rapid development of molecular resources in nonmodel species. The validated SNP set from our study is currently being utilized in an ongoing ecological analysis to support a future study of C. sinicus population genetics. PMID:24982883
Hernandez-Prieto, Miguel A; Futschik, Matthias E
2012-01-01
Synechocystis sp. PCC6803 is one of the best studied cyanobacteria and an important model organism for our understanding of photosynthesis. The early availability of its complete genome sequence initiated numerous transcriptome studies, which have generated a wealth of expression data. Analysis of the accumulated data can be a powerful tool to study transcription in a comprehensive manner and to reveal underlying regulatory mechanisms, as well as to annotate genes whose functions are yet unknown. However, use of divergent microarray platforms, as well as distributed data storage make meta-analyses of Synechocystis expression data highly challenging, especially for researchers with limited bioinformatic expertise and resources. To facilitate utilisation of the accumulated expression data for a wider research community, we have developed CyanoEXpress, a web database for interactive exploration and visualisation of transcriptional response patterns in Synechocystis. CyanoEXpress currently comprises expression data for 3073 genes and 178 environmental and genetic perturbations obtained in 31 independent studies. At present, CyanoEXpress constitutes the most comprehensive collection of expression data available for Synechocystis and can be freely accessed. The database is available for free at http://cyanoexpress.sysbiolab.eu.
Hobbs, Matthew; Pavasovic, Ana; King, Andrew G; Prentis, Peter J; Eldridge, Mark D B; Chen, Zhiliang; Colgan, Donald J; Polkinghorne, Adam; Wilkins, Marc R; Flanagan, Cheyne; Gillett, Amber; Hanger, Jon; Johnson, Rebecca N; Timms, Peter
2014-09-11
The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus. RNA-Seq data was generated from a range of tissues from one male and one female koala and assembled de novo into transcripts using Velvet-Oases. Transcript abundance in each tissue was estimated. Transcripts were searched for likely protein-coding regions and a non-redundant set of 117,563 putative protein sequences was produced. In similarity searches there were 84,907 (72%) sequences that aligned to at least one sequence in the NCBI nr protein database. The best alignments were to sequences from other marsupials. After applying a reciprocal best hit requirement of koala sequences to those from tammar wallaby, Tasmanian devil and the gray short-tailed opossum, we estimate that our transcriptome dataset represents approximately 15,000 koala genes. The marsupial alignment information was used to look for potential gene duplications and we report evidence for copy number expansion of the alpha amylase gene, and of an aldehyde reductase gene. Koala retrovirus (KoRV) transcripts were detected in the transcriptomes. These were analysed in detail and the structure of the spliced envelope gene transcript was determined. There was appreciable sequence diversity within KoRV, with 233 sites in the KoRV genome showing small insertions/deletions or single nucleotide polymorphisms. Both koalas had sequences from the KoRV-A subtype, but the male koala transcriptome has, in addition, sequences more closely related to the KoRV-B subtype. This is the first report of a KoRV-B-like sequence in a wild population.
This transcriptomic dataset is a useful resource for molecular genetic studies of the koala, for evolutionary genetic studies of marsupials, for validation and annotation of the koala genome sequence, and for investigation of koala retrovirus. Annotated transcripts can be browsed and queried at http://koalagenome.org.
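The reciprocal best hit requirement mentioned in the koala abstract can be sketched as follows. The sketch assumes that the best hit in each search direction has already been extracted (for example from similarity search output) into a dictionary; the function name and all sequence identifiers are hypothetical, and the study's actual procedure may differ in detail.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    return sorted(
        (a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a
    )

# Hypothetical identifiers: best hit of each koala sequence in another
# marsupial proteome, and the best hit of each of those proteins in koala.
koala_to_other = {"k1": "d1", "k2": "d2", "k3": "d2"}
other_to_koala = {"d1": "k1", "d2": "k3"}

print(reciprocal_best_hits(koala_to_other, other_to_koala))
```

Here k2 is dropped because its best hit d2 points back to k3 rather than to k2, which is how the requirement collapses redundant transcripts toward a one-to-one gene estimate.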
Neuropathological and transcriptomic characteristics of the aged brain
Miller, Jeremy A; Guillozet-Bongaarts, Angela; Gibbons, Laura E; Postupna, Nadia; Renz, Anne; Beller, Allison E; Sunkin, Susan M; Ng, Lydia; Rose, Shannon E; Smith, Kimberly A; Szafer, Aaron; Barber, Chris; Bertagnolli, Darren; Bickley, Kristopher; Brouner, Krissy; Caldejon, Shiella; Chapin, Mike; Chua, Mindy L; Coleman, Natalie M; Cudaback, Eiron; Cuhaciyan, Christine; Dalley, Rachel A; Dee, Nick; Desta, Tsega; Dolbeare, Tim A; Dotson, Nadezhda I; Fisher, Michael; Gaudreault, Nathalie; Gee, Garrett; Gilbert, Terri L; Goldy, Jeff; Griffin, Fiona; Habel, Caroline; Haradon, Zeb; Hejazinia, Nika; Hellstern, Leanne L; Horvath, Steve; Howard, Kim; Howard, Robert; Johal, Justin; Jorstad, Nikolas L; Josephsen, Samuel R; Kuan, Chihchau L; Lai, Florence; Lee, Eric; Lee, Felix; Lemon, Tracy; Li, Xianwu; Marshall, Desiree A; Melchor, Jose; Mukherjee, Shubhabrata; Nyhus, Julie; Pendergraft, Julie; Potekhina, Lydia; Rha, Elizabeth Y; Rice, Samantha; Rosen, David; Sapru, Abharika; Schantz, Aimee; Shen, Elaine; Sherfield, Emily; Shi, Shu; Sodt, Andy J; Thatra, Nivretta; Tieu, Michael; Wilson, Angela M; Montine, Thomas J; Larson, Eric B; Bernard, Amy; Crane, Paul K; Ellenbogen, Richard G
2017-01-01
As more people live longer, age-related neurodegenerative diseases are an increasingly important societal health issue. Approaches targeting specific pathologies such as amyloid beta in Alzheimer’s disease (AD) have not led to effective treatments, and there is increasing evidence of a disconnect between traditional pathology and cognitive abilities with advancing age, indicative of individual variation in resilience to pathology. Here, we generated a comprehensive neuropathological, molecular, and transcriptomic characterization of hippocampus and two regions of cortex in 107 aged donors (median = 90) from the Adult Changes in Thought (ACT) study as a freely-available resource (http://aging.brain-map.org/). We confirm established associations between AD pathology and dementia, albeit with increased, presumably aging-related variability, and identify sets of co-expressed genes correlated with pathological tau and inflammation markers. Finally, we demonstrate a relationship between dementia and RNA quality, and find common gene signatures, highlighting the importance of properly controlling for RNA quality when studying dementia. PMID:29120328
Hah, Nasun; Danko, Charles G.; Core, Leighton; Waterfall, Joshua J.; Siepel, Adam; Lis, John T.; Kraus, W. Lee
2011-01-01
Summary We report the immediate effects of estrogen signaling on the transcriptome of breast cancer cells using Global Run-On and sequencing (GRO-seq). The data were analyzed using a new bioinformatic approach that allowed us to identify transcripts directly from the GRO-seq data. We found that estrogen signaling directly regulates a strikingly large fraction of the transcriptome in a rapid, robust, and unexpectedly transient manner. In addition to protein coding genes, estrogen regulates the distribution and activity of all three RNA polymerases, and virtually every class of non-coding RNA that has been described to date. We also identified a large number of previously undetected estrogen-regulated intergenic transcripts, many of which are found proximal to estrogen receptor binding sites. Collectively, our results provide the most comprehensive measurement of the primary and immediate estrogen effects to date and a resource for understanding rapid signal-dependent transcription in other systems. PMID:21549415
Transcriptome assembly, gene annotation and tissue gene expression atlas of the rainbow trout
USDA-ARS's Scientific Manuscript database
Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, we reported a transcriptome reference sequence using a 19X coverage of Sanger and 454-pyrosequencing dat...
Roel Verhaak, Ph.D., Presents the Somatic Genomic Landscape of Glioblastoma - TCGA
Diffuse lower grade gliomas (LGGs) are infiltrative neoplasms of the central nervous system that include astrocytoma, oligodendroglioma and oligo-astrocytoma histologies of grades II and III. Roel G.W. Verhaak, Ph.D., presents a comprehensive analysis of 293 LGGs using multiple advanced genomic, transcriptomic and proteomic platforms from The Cancer Genome Atlas to provide a deeper understanding of the molecular features of this group of neoplasms, to classify them in a clinically-relevant manner, and to provide a public resource that identifies potential targets for emerging therapies.
Mykles, Donald L.; Burnett, Karen G.; Durica, David S.; Joyce, Blake L.; McCarthy, Fiona M.; Schmidt, Carl J.; Stillman, Jonathon H.
2016-01-01
High-throughput RNA sequencing (RNA-seq) technology has become an important tool for studying physiological responses of organisms to changes in their environment. De novo assembly of RNA-seq data has allowed researchers to create a comprehensive catalog of genes expressed in a tissue and to quantify their expression without a complete genome sequence. The contributions from the “Tapping the Power of Crustacean Transcriptomics to Address Grand Challenges in Comparative Biology” symposium in this issue show the successes and limitations of using RNA-seq in the study of crustaceans. In conjunction with the symposium, the Animal Genome to Phenome Research Coordination Network collated comments from participants at the meeting regarding the challenges encountered when using transcriptomics in their research. Input came from novices and experts ranging from graduate students to principal investigators. Many were unaware of the bioinformatics analysis resources currently available on the CyVerse platform. Our analysis of community responses led to three recommendations for advancing the field: (1) integration of genomic and RNA-seq sequence assemblies for crustacean gene annotation and comparative expression; (2) development of methodologies for the functional analysis of genes; and (3) information and training exchange among laboratories for transmission of best practices. The field lacks the methods for manipulating tissue-specific gene expression. The decapod crustacean research community should consider the cherry shrimp, Neocaridina denticulata, as a decapod model for the application of transgenic tools for functional genomics. This would require a multi-investigator effort. PMID:27639274
Zimmer, C T; Maiwald, F; Schorn, C; Bass, C; Ott, M-C; Nauen, R
2014-08-01
The pollen beetle Meligethes aeneus is the most important coleopteran pest in European oilseed rape cultivation, annually infesting millions of hectares and responsible for substantial yield losses if not kept under economic damage thresholds. This species is primarily controlled with insecticides but has recently developed high levels of resistance to the pyrethroid class. The aim of the present study was to provide a transcriptomic resource to investigate mechanisms of resistance. cDNA was sequenced on both Roche (Indianapolis, IN, USA) and Illumina (LGC Genomics, Berlin, Germany) platforms, resulting in a total of ~53 million reads which assembled into 43 396 expressed sequence tags (ESTs). Manual annotation revealed good coverage of genes encoding insecticide target sites and detoxification enzymes. A total of 77 nonredundant cytochrome P450 genes were identified. Mapping of Illumina RNAseq sequences (from susceptible and pyrethroid-resistant strains) against the reference transcriptome identified a cytochrome P450 (CYP6BQ23) as highly overexpressed in pyrethroid-resistant strains. Single-nucleotide polymorphism analysis confirmed the presence of a target-site resistance mutation (L1014F) in the voltage-gated sodium channel of one resistant strain. Our results provide new insights into the important genes associated with pyrethroid resistance in M. aeneus. Furthermore, a comprehensive EST resource is provided for future studies on insecticide modes of action and resistance mechanisms in pollen beetle. © 2014 The Royal Entomological Society.
Lee, Jungeun; Noh, Eun Kyeung; Choi, Hyung-Seok; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok
2013-03-01
Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been studied as an extremophile that has successfully adapted to marginal land with the harshest environment for terrestrial plants. However, limited genetic research has focused on this species due to the lack of genomic resources. Here, we present the first de novo assembly of its transcriptome by massively parallel sequencing and its expression profile using D. antarctica grown under various stress conditions. Total sequence reads generated by pyrosequencing were assembled into 60,765 unigenes (28,177 contigs and 32,588 singletons). A total of 29,173 unique protein-coding genes were identified based on sequence similarities to known proteins. The combined results from all three stress conditions indicated differential expression of 3,110 genes. Quantitative reverse transcription polymerase chain reaction showed that several well-known stress-responsive genes encoding late embryogenesis abundant protein, dehydrin 1, and ice recrystallization inhibition protein were induced dramatically and that genes encoding U-box-domain-containing protein, electron transfer flavoprotein-ubiquinone, and F-box-containing protein were induced by abiotic stressors in a manner conserved with other plant species. We identified more than 2,000 simple sequence repeats that can be developed as functional molecular markers. This dataset is the most comprehensive transcriptome resource currently available for D. antarctica and is therefore expected to be an important foundation for future genetic studies of grasses and extremophiles.
Zhao, M; Wang, T; Adamson, K J; Storey, K B; Cummins, S F
2016-02-08
The land snail Theba pisana is native to the Mediterranean region but has become one of the most abundant invasive species worldwide. Here, we present three transcriptomes of this agricultural pest derived from three tissues: the central nervous system, hepatopancreas (digestive gland), and foot muscle. Sequencing of the three tissues produced 339,479,092 high quality reads and a global de novo assembly generated a total of 250,848 unique transcripts (unigenes). BLAST analysis mapped 52,590 unigenes to NCBI non-redundant protein databases and further functional analysis annotated 21,849 unigenes with gene ontology. We report that T. pisana transcripts have representatives in all functional classes and a comparison of differentially expressed transcripts amongst all three tissues demonstrates enormous differences in their potential metabolic activities. The differentially expressed genes include those with sequence similarity to genes associated with multiple bacterial diseases and neurological diseases. To provide a valuable resource that will assist functional genomics studies, we have implemented a user-friendly web interface, ThebaDB (http://thebadb.bioinfo-minzhao.org/). This online database allows for complex text queries, sequence searches, and data browsing by enriched functional terms and KEGG mapping.
SmedGD 2.0: The Schmidtea mediterranea genome database
Robb, Sofia M.C.; Gotting, Kirsten; Ross, Eric; Sánchez Alvarado, Alejandro
2016-01-01
Planarians have emerged as excellent models for the study of key biological processes such as stem cell function and regulation, axial polarity specification, regeneration, and tissue homeostasis among others. The most widely used organism for these studies is the free-living flatworm Schmidtea mediterranea. In 2007, the Schmidtea mediterranea Genome Database (SmedGD) was first released to provide a much needed resource for the small, but growing planarian community. SmedGD 1.0 has been a depository for genome sequence, a draft assembly, and related experimental data (e.g., RNAi phenotypes, in situ hybridization images, and differential gene expression results). We report here a comprehensive update to SmedGD (SmedGD 2.0) that aims to expand its role as an interactive community resource. The new database includes more recent transcription data and provides tools that enhance interconnectivity between different genome assemblies and transcriptomes, including next generation assemblies for both the sexual and asexual biotypes of S. mediterranea. SmedGD 2.0 (http://smedgd.stowers.org) not only provides significantly improved gene annotations, but also tools for data sharing, attributes that will help both the planarian and biomedical communities to more efficiently mine the genomics and transcriptomics of S. mediterranea. PMID:26138588
Waiho, Khor; Fazhan, Hanafiah; Shahreza, Md Sheriff; Moh, Julia Hwei Zhong; Noorbaiduri, Shaibani; Wong, Li Lian; Sinnasamy, Saranya
2017-01-01
Adequate genetic information is essential for sustainable crustacean fisheries and aquaculture management. The commercially important orange mud crab, Scylla olivacea, is prevalent in the Southeast Asia region and is highly sought after. Although it is a suitable aquaculture candidate, full domestication of this species is hampered by the lack of knowledge about the sexual maturation process and the molecular mechanisms behind it, especially in males. To date, whole-genome data have yet to be reported for S. olivacea. The transcriptome data published previously on this species focus primarily on females and the role of the central nervous system in reproductive development. De novo transcriptome sequencing of the testes of S. olivacea from immature, maturing and mature stages was performed. A total of approximately 144 million high-quality reads were generated and de novo assembled into 160,569 transcripts with a total length of 142.2 Mb. Approximately 15–23% of the total assembled transcripts were annotated when compared to public protein sequence databases (i.e. UniProt database, Interpro database, Pfam database and Drosophila melanogaster protein database), and categorised with Gene Ontology (GO) terms. A total of 156,181 high-quality Single-Nucleotide Polymorphisms (SNPs) were mined from the transcriptome data of the present study. Transcriptome comparison among the testes of different maturation stages revealed one gene (beta crystallin like gene) with the most significant differential expression: up-regulated in the immature stage and down-regulated in the maturing and mature stages. This was further validated by qRT-PCR. In conclusion, a comprehensive transcriptome of the testis of orange mud crabs from different maturation stages was obtained. This report provides an invaluable resource for enhancing our understanding of this species’ genome structure and biology, as expressed and controlled by their gonads. PMID:28135340
Chen, Pao-Yang; Montanini, Barbara; Liao, Wen-Wei; Morselli, Marco; Jaroszewicz, Artur; Lopez, David; Ottonello, Simone; Pellegrini, Matteo
2014-01-01
Background Tuber melanosporum, also known in the gastronomic community as “truffle”, features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity. Findings We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody (“truffle”), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes. Conclusions The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles. PMID:25392735
OperomeDB: A Database of Condition-Specific Transcription Units in Prokaryotic Genomes.
Chetal, Kashish; Janga, Sarath Chandra
2015-01-01
Background. In prokaryotic organisms, a substantial fraction of adjacent genes are organized into operons: codirectionally organized genes that share a common promoter and terminator. Although several available operon databases provide information with varying levels of reliability, very few resources provide experimentally supported results. Therefore, we believe that the biological community could benefit from having a new operon prediction database with operons predicted using next-generation RNA-seq datasets. Description. We present operomeDB, a database which provides an ensemble of all the predicted operons for bacterial genomes using available RNA-sequencing datasets across a wide range of experimental conditions. Although several studies have recently confirmed that prokaryotic operon structure is dynamic with significant alterations across environmental and experimental conditions, there are no comprehensive databases for studying such variations across prokaryotic transcriptomes. Currently our database contains nine bacterial organisms and 168 transcriptomes for which we predicted operons. The user interface is simple and easy to use in terms of visualization, downloading, and querying of data. In addition, because of its ability to load custom datasets, users can also compare their datasets with publicly available transcriptomic data of an organism. Conclusion. OperomeDB as a database should not only aid experimental groups working on transcriptome analysis of specific organisms but also enable studies related to computational and comparative operomics.
De novo characterization of Lentinula edodes C(91-3) transcriptome by deep Solexa sequencing.
Zhong, Mintao; Liu, Ben; Wang, Xiaoli; Liu, Lei; Lun, Yongzhi; Li, Xingyun; Ning, Anhong; Cao, Jing; Huang, Min
2013-02-01
Lentinula edodes has been utilized as food as well as in popular medicine; moreover, extracts isolated from its mycelium and fruiting body have shown several therapeutic properties. Yet little is understood about the genes involved in these properties, and the absence of L. edodes genomes has been a barrier to the development of functional genomics research. However, high-throughput sequencing technologies are now being widely applied to non-model species. To facilitate research on L. edodes, we leveraged Solexa sequencing technology for de novo assembly of the L. edodes C(91-3) transcriptome. In a single run, we produced more than 57 million sequencing reads. These reads were assembled into 28,923 unigene sequences (mean size = 689 bp), including 18,120 unigenes with coding sequence (CDS). Based on similarity searches against known proteins, assembled unigene sequences were annotated with gene descriptions, gene ontology (GO) and clusters of orthologous group (COG) terms. Our data provide the first comprehensive sequence resource available for functional genomics studies in L. edodes, and demonstrate the utility of Illumina/Solexa sequencing for de novo transcriptome characterization and gene discovery in a non-model mushroom. Copyright © 2012 Elsevier Inc. All rights reserved.
Rong, Liping; Li, Qianzhong; Li, Shushun; Tang, Ling; Wen, Jing
2016-04-01
Maple (Acer palmatum) is an important species for landscape planting worldwide. Salt stress affects the normal growth of the Maple leaf directly, leading to loss of esthetic value. However, the limited availability of Maple genomic information has hindered research on the mechanisms underlying salt tolerance. In this study, we performed comprehensive analyses of the salt tolerance in two genotypes of Maple using RNA-seq. Approximately 146.4 million paired-end reads, representing 181,769 unigenes, were obtained. The N50 length of the unigenes was 738 bp, and their total length was over 102.66 Mb. A total of 14,090 simple sequence repeats and over 500,000 single nucleotide polymorphisms were identified, which represent useful resources for marker development. Importantly, 181,769 genes were detected in at least one library, and 303 differentially expressed genes (DEGs) were identified between salt-sensitive and salt-tolerant genotypes. Among these DEGs, 125 were upregulated and 178 were downregulated. Two MYB-related proteins and one LEA protein were detected among the 10 most downregulated genes. Moreover, a methyltransferase-related gene was detected among the 10 most upregulated genes. The three most significantly enriched pathways were plant hormone signal transduction, arginine and proline metabolism, and photosynthesis. The transcriptome analysis provided a rich genetic resource for gene discovery related to salt tolerance in Maple and in closely related species. The data will serve as an important public information platform to further our understanding of the molecular mechanisms involved in salt tolerance in Maple.
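The N50 statistic quoted in the abstract above (738 bp) summarizes assembly contiguity. As a minimal sketch of how such a value is commonly computed, assuming only a list of unigene lengths is available (the function name `n50` and the toy lengths are illustrative and are not taken from the study):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths given")


# Toy example (hypothetical lengths, not data from the abstract).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

For a real assembly, the lengths would be read from the unigene FASTA file; that step is omitted here because the abstract does not describe the file layout.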
Jiao, Chen; Gao, Min; Wang, Xiping; Fei, Zhangjun
2015-03-21
Grape is one of the most valuable fruit crops and can serve for both fresh consumption and wine production. Grape cultivars have been selected and evolved to produce high-quality fruits during their domestication over thousands of years. However, current widely planted grape cultivars suffer extensive loss to many diseases while most wild species show resistance to various pathogens. Therefore, a comprehensive evaluation of wild grapes would contribute to the improvement of disease resistance in grape breeding programs. We performed deep transcriptome sequencing of three Chinese wild grapes using the Illumina strand-specific RNA-Seq technology. High quality transcriptomes were assembled de novo and more than 93% of the transcripts were shared with the reference PN40024 genome. Over 1,600 distinct transcripts, which were absent or highly divergent from sequences in the reference PN40024 genome, were identified in each of the three wild grapes, among which more than 1,000 were potential protein-coding genes. Gene Ontology (GO) and pathway annotations of these distinct genes showed that those involved in defense responses and plant secondary metabolism were highly enriched. More than 87,000 single nucleotide polymorphisms (SNPs) and 2,000 small insertions or deletions (indels) were identified between each genotype and PN40024, and approximately 20% of the SNPs caused nonsynonymous mutations. Finally, we discovered 100 to 200 highly confident cis-natural antisense transcript (cis-NAT) pairs in each genotype. These transcripts were significantly enriched with genes involved in secondary metabolism and plant responses to abiotic stresses. The three de novo assembled transcriptomes provide a comprehensive sequence resource for molecular genetic research in grape.
The newly discovered genes from wild Vitis, as well as SNPs and small indels we identified, may facilitate future studies on the molecular mechanisms related to valuable traits possessed by these wild Vitis and contribute to the grape breeding programs. Furthermore, we identified hundreds of cis-NAT pairs which showed their potential regulatory roles in secondary metabolism and abiotic stress responses.
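The abstract above reports that roughly 20% of the SNPs caused nonsynonymous mutations. The sketch below shows one common way to make that distinction for a single substitution, assuming an in-frame coding sequence and the standard genetic code; the function name `classify_snp` and the example sequence are hypothetical and do not reproduce the authors' pipeline.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, position, alt_base):
    """Classify a substitution at a 0-based position of an in-frame CDS.

    Returns "synonymous" if the encoded amino acid is unchanged and
    "nonsynonymous" otherwise (a gained or lost stop codon is counted
    as nonsynonymous in this simplified sketch).
    """
    cds = cds.upper()
    alt_base = alt_base.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    if not 0 <= position < usable:
        raise ValueError("position is outside the complete codons of the CDS")
    start = position - position % 3
    offset = position - start
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "nonsynonymous"


# Hypothetical two-codon CDS: ATG (Met) GCT (Ala).
print(classify_snp("ATGGCT", 5, "C"))  # GCT -> GCC, both encode Ala
print(classify_snp("ATGGCT", 3, "C"))  # GCT -> CCT, Ala -> Pro
```

Real variant annotation also has to handle strand, splicing and overlapping transcripts, which this sketch deliberately leaves out.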
Hu, Zhendi; Chen, Huanyu; Yin, Fei; Li, Zhenyu; Dong, Xiaolin; Zhang, Deyong; Ren, Shunxiang; Feng, Xia
2013-01-01
Background The diamondback moth Plutella xylostella has developed a high level of resistance to the latest insecticide chlorantraniliprole. A better understanding of P. xylostella’s resistance mechanism to chlorantraniliprole is needed to develop effective approaches for insecticide resistance management. Principal Findings To provide a comprehensive insight into the resistance mechanisms of P. xylostella to chlorantraniliprole, transcriptome assembly and tag-based digital gene expression (DGE) analysis were performed using Illumina HiSeq™ 2000. The transcriptome analysis of the susceptible strain (SS) provided 45,231 unigenes (with the size ranging from 200 bp to 13,799 bp), which would be efficient for analyzing the differences among chlorantraniliprole-resistant P. xylostella strains. DGE analysis indicated that a total of 1215 genes (189 up-regulated and 1026 down-regulated) were gradient differentially expressed among the susceptible strain (SS) and different chlorantraniliprole-resistant P. xylostella strains, including low-level resistance (GXA), moderate resistance (LZA) and high resistance strains (HZA). A detailed analysis of gradient differentially expressed genes elucidated the existence of a phase-dependent divergence of biological investment at the molecular level. The genes related to insecticide resistance, such as P450, GST, the ryanodine receptor, and connectin, had different expression profiles in the different chlorantraniliprole-resistant DGE libraries, suggesting that the genes related to insecticide resistance are involved in P. xylostella resistance development against chlorantraniliprole. To confirm the results from the DGE, the expression profiles of 4 genes related to insecticide resistance were further validated by qRT-PCR analysis. Conclusions The obtained transcriptome information provides large gene resources available for further studying the resistance development of P. xylostella to pesticides.
The DGE data provide comprehensive insights into the gene expression profiles of the different chlorantraniliprole-resistant strains. These genes are specifically related to insecticide resistance, with different expression profiles facilitating the study of the role of each gene in chlorantraniliprole resistance development. PMID:23977278
Torre, Sara; Tattini, Massimiliano; Brunetti, Cecilia; Guidi, Lucia; Gori, Antonella; Marzano, Cristina; Landi, Marco; Sebastiani, Federico
2016-01-01
Sweet basil (Ocimum basilicum), one of the most popular cultivated herbs worldwide, displays a number of varieties differing in several characteristics, such as the color of the leaves. The development of a reference transcriptome for sweet basil, and the analysis of differentially expressed genes in acyanic and cyanic cultivars exposed to natural sunlight irradiance, are of interest from both horticultural and biological points of view. There is still great uncertainty about the significance of anthocyanins in photoprotection, and how green and red morphs may perform when exposed to photo-inhibitory light, a condition plants face on a daily and seasonal basis. We sequenced the leaf transcriptome of the green-leaved Tigullio (TIG) and the purple-leaved Red Rubin (RR) exposed to full sunlight over a four-week experimental period. We assembled and annotated 111,007 transcripts. A total of 5,468 and 5,969 potential SSRs were identified in TIG and RR, respectively, out of which 66 were polymorphic in silico. Comparative analysis of the two transcriptomes showed 2,372 differentially expressed genes (DEGs) clustered in 222 enriched Gene Ontology terms. Green and red basil mostly differed in the transcript abundance of genes involved in secondary metabolism. While the biosynthesis of waxes was up-regulated in red basil, the biosynthesis of flavonols and carotenoids was up-regulated in green basil. Data from our study provide a comprehensive transcriptome survey, gene sequence resources and microsatellites that can be used for further investigations in sweet basil. The analysis of DEGs and their functional classification also offers new insights into the functional role of anthocyanins in photoprotection.
NASA Astrophysics Data System (ADS)
Zhang, Hui; Zhai, Yuxiu; Yao, Lin; Jiang, Yanhua; Li, Fengling
2017-05-01
Chlamys farreri is an economically important mollusk that can accumulate excessive amounts of cadmium (Cd). Studying the molecular mechanism of Cd accumulation in bivalves is difficult because of the lack of genome background. Transcriptomic analysis based on high-throughput RNA sequencing has been shown to be an efficient and powerful method for the discovery of relevant genes in non-model and genome reference-free organisms. Here, we constructed two cDNA libraries (control and Cd exposure groups) from the digestive gland of C. farreri and compared the transcriptomic data between them. A total of 227 673 transcripts were assembled into 105 071 unigenes, most of which shared high similarity with sequences in the NCBI non-redundant protein database. For functional classification, 24 493 unigenes were assigned to Gene Ontology terms. Additionally, EuKaryotic Ortholog Groups and Kyoto Encyclopedia of Genes and Genomes analyses assigned 12 028 unigenes to 26 categories and 7 849 unigenes to five pathways, respectively. Comparative transcriptomics analysis identified 3 800 unigenes that were differentially expressed in the Cd-treated group compared with the control group. Among them, genes associated with heavy metal accumulation were screened, including metallothionein, divalent metal transporter, and metal tolerance protein. The functional genes and predicted pathways identified in our study will contribute to a better understanding of the metabolic and immune system in the digestive gland of C. farreri. In addition, the transcriptomic data will provide a comprehensive resource that may contribute to the understanding of molecular mechanisms that respond to marine pollutants in bivalves.
High-confidence coding and noncoding transcriptome maps
2017-01-01
The advent of high-throughput RNA sequencing (RNA-seq) has led to the discovery of unprecedentedly immense transcriptomes encoded by eukaryotic genomes. However, the transcriptome maps are still incomplete partly because they were mostly reconstructed based on RNA-seq reads that lack their orientations (known as unstranded reads) and certain boundary information. Methods to expand the usability of unstranded RNA-seq data by predetermining the orientation of the reads and precisely determining the boundaries of assembled transcripts could significantly benefit the quality of the resulting transcriptome maps. Here, we present a high-performing transcriptome assembly pipeline, called CAFE, that significantly improves the original assemblies, respectively assembled with stranded and/or unstranded RNA-seq data, by orienting unstranded reads using the maximum likelihood estimation and by integrating information about transcription start sites and cleavage and polyadenylation sites. Applying large-scale transcriptomic data comprising 230 billion RNA-seq reads from the ENCODE, Human BodyMap 2.0, The Cancer Genome Atlas, and GTEx projects, CAFE enabled us to predict the directions of about 220 billion unstranded reads, which led to the construction of more accurate transcriptome maps, comparable to the manually curated map, and a comprehensive lncRNA catalog that includes thousands of novel lncRNAs. Our pipeline should not only help to build comprehensive, precise transcriptome maps from complex genomes but also to expand the universe of noncoding genomes. PMID:28396519
The testes transcriptome derived from the New World Screwworm, Cochliomyia hominivorax TSA
USDA-ARS's Scientific Manuscript database
In a collaboration with National Center for Genome Resources researchers, we sequenced and assembled the testes transcriptome derived from the Pacora, Panama, production plant strain of the New World Screwworm, Cochliomyia hominivorax. This transcriptome contains 4,149 unigenes and the Transcriptome...
DBGC: A Database of Human Gastric Cancer
Wang, Chao; Zhang, Jun; Cai, Mingdeng; Zhu, Zhenggang; Gu, Wenjie; Yu, Yingyan; Zhang, Xiaoyan
2015-01-01
The Database of Human Gastric Cancer (DBGC) is a comprehensive database that integrates various human gastric cancer-related data resources. Human gastric cancer-related transcriptomics projects, proteomics projects, mutations, biomarkers and drug-sensitive genes from different sources were collected and unified in this database. Moreover, epidemiological statistics of gastric cancer patients in China and clinicopathological information annotated with gastric cancer cases were also integrated into the DBGC. We believe that this database will greatly facilitate research regarding human gastric cancer in many fields. DBGC is freely available at http://bminfor.tongji.edu.cn/dbgc/index.do PMID:26566288
2012-01-01
Introduction Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. Results cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature. 
Conclusions We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable catalogue of annotated genes (or gene fragments) and protein families for ten newly sequenced non-model organisms, some of commercial importance (i.e., Octopus vulgaris). These comprehensive sets of genes can be readily used for phylogenetic analysis, gene expression profiling, developmental analysis, and can also be a powerful resource for gene discovery. The characterization of the transcriptomes of such a diverse array of animal species permitted the comparison of sequencing depth, functional annotation, and efficiency of genomic sampling using the same pipelines, which proved to be similar for all considered species. In addition, the datasets revealed their potential as a resource for paralogue detection, a recurrent concern in various aspects of biological inquiry, including phylogenetics, molecular evolution, development, and cellular biochemistry. PMID:23190771
2012-01-01
Background Plants are sessile and therefore have to perceive and adjust to changes in their environment. The presence of neighbours leads to a competitive situation where resources and space will be limited. Complex adaptive responses to such situation are poorly understood at the molecular level. Results Using microarrays, we analysed whole-genome expression changes in Arabidopsis thaliana plants subjected to intraspecific competition. The leaf and root transcriptome was strongly altered by competition. Differentially expressed genes were enriched in genes involved in nutrient deficiency (mainly N, P, K), perception of light quality, and responses to abiotic and biotic stresses. Interestingly, performance of the generalist insect Spodoptera littoralis on densely grown plants was significantly reduced, suggesting that plants under competition display enhanced resistance to herbivory. Conclusions This study provides a comprehensive list of genes whose expression is affected by intraspecific competition in Arabidopsis. The outcome is a unique response that involves genes related to light, nutrient deficiency, abiotic stress, and defence responses. PMID:23194435
The root transcriptome for North American ginseng assembled and profiled across seasonal development
2013-01-01
Background Ginseng, including North American ginseng (Panax quinquefolius L.), is one of the most widely used medicinal plants. Its success is thought to be due to a diverse collection of ginsenosides that serve as its major bioactive compounds. However, few genomic resources exist and the details concerning its various biosynthetic pathways remain poorly understood. As the root is the primary tissue harvested commercially for ginsenosides, next generation sequencing was applied to the characterization and assembly of the root transcriptome throughout seasonal development. Transcripts showing homology to ginsenoside biosynthesis enzymes were profiled in greater detail. Results RNA extracts from root samples from seven developmental stages of North American ginseng were subjected to 454 sequencing, filtered for quality and used in the de novo assembly of a collective root reference transcriptome consisting of 41,623 transcripts. Annotation efforts using a number of public databases resulted in detailed annotation information for 34,801 (84%) transcripts. In addition, 3,955 genes were assigned to metabolic pathways using the Kyoto Encyclopedia of Genes and Genomes. Among our results, we found all of the known enzymes involved in the ginsenoside backbone biosynthesis and used co-expression analysis to identify a number of candidate sequences involved in the latter stages of the ginsenoside biosynthesis pathway. Transcript profiles suggest ginsenoside biosynthesis occurs at distinct stages of development. Conclusions The assembly generated provides a comprehensive annotated reference for future transcriptomic study of North American ginseng. A collection of putative ginsenoside biosynthesis genes was identified and candidate genes were predicted from the lesser understood downstream stages of biosynthesis.
Transcript expression profiles across seasonal development suggest a primary dammarane-type ginsenoside biosynthesis occurs just prior to plant senescence, with secondary ginsenoside production occurring throughout development. Data from the study provide a valuable resource for conducting future ginsenoside biosynthesis research in this important medicinal plant. PMID:23957709
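The annotation rate quoted in the abstract above (34,801 of 41,623 transcripts, reported as 84%) is a simple ratio. A minimal sketch of that arithmetic, using only the two counts given in the abstract (the helper name `annotation_rate` is illustrative):

```python
def annotation_rate(annotated, total):
    """Return the percentage of assembled transcripts that were annotated."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * annotated / total


# Counts taken from the abstract: 34,801 annotated out of 41,623 transcripts.
rate = annotation_rate(34_801, 41_623)
print(f"{rate:.1f}% of transcripts annotated")
```

The unrounded value is slightly below 84%, which is consistent with the rounded figure given in the abstract.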
Tan, Xiaoyan; Sun, Junshe; Ning, Huijuan; Qin, Zifang; Miao, Yuxin; Sun, Tian; Zhang, Xiuqing
2018-06-30
Ganoderma lucidum is a valuable basidiomycete with numerous pharmacological compounds, which is widely consumed throughout China. We previously found that the polysaccharide content of Ganoderma lucidum fruiting bodies could be significantly improved by 45.63% with treatment of 42 °C heat stress (HS) for 2 h. To further investigate genes involved in the HS response and explore the mechanisms by which HS regulates carbohydrate metabolism in Ganoderma lucidum, high-throughput RNA-Seq was conducted to analyse the difference between control and heat-treated mycelia at the transcriptome level. We sequenced six cDNA libraries, three from the control group (mycelia cultivated at 28 °C) and three from the heat-treated group (mycelia subjected to 42 °C for 2 h). A total of 99,899 transcripts were generated using the Trinity method and 59,136 unigenes were annotated against seven public databases. Among them, 2790 genes were identified as differentially expressed genes (DEGs) under HS, of which 1991 were up-regulated and 799 down-regulated. 176 DEGs were then manually classified into five main response-related categories according to their putative functions and possible metabolic pathways. These groups are stress resistance-related factors; protein assembly, transportation and degradation; signal transduction; carbohydrate metabolism and energy provision-related processes; and other related functions, suggesting that a series of metabolic pathways in Ganoderma lucidum are activated by HS and that the response mechanism involves a complex molecular network which needs further study. Remarkably, 48 DEGs were found to regulate carbohydrate metabolism, both in carbohydrate hydrolysis for energy provision and in polysaccharide synthesis. In summary, this comprehensive transcriptome analysis will provide an enlarged resource for further investigation into the molecular mechanisms of basidiomycetes under HS. Copyright © 2018 Elsevier B.V. All rights reserved.
Bolisetty, Mohan; Kursawe, Romy; Sun, Lili; Sivakamasundari, V.; Kycia, Ina
2017-01-01
Blood glucose levels are tightly controlled by the coordinated action of at least four cell types constituting pancreatic islets. Changes in the proportion and/or function of these cells are associated with genetic and molecular pathophysiology of monogenic, type 1, and type 2 (T2D) diabetes. Cellular heterogeneity impedes precise understanding of the molecular components of each islet cell type that govern islet (dys)function, particularly the less abundant delta and gamma/pancreatic polypeptide (PP) cells. Here, we report single-cell transcriptomes for 638 cells from nondiabetic (ND) and T2D human islet samples. Analyses of ND single-cell transcriptomes identified distinct alpha, beta, delta, and PP/gamma cell-type signatures. Genes linked to rare and common forms of islet dysfunction and diabetes were expressed in the delta and PP/gamma cell types. Moreover, this study revealed that delta cells specifically express receptors that receive and coordinate systemic cues from the leptin, ghrelin, and dopamine signaling pathways implicating them as integrators of central and peripheral metabolic signals into the pancreatic islet. Finally, single-cell transcriptome profiling revealed genes differentially regulated between T2D and ND alpha, beta, and delta cells that were undetectable in paired whole islet analyses. This study thus identifies fundamental cell-type–specific features of pancreatic islet (dys)function and provides a critical resource for comprehensive understanding of islet biology and diabetes pathogenesis. PMID:27864352
Chen, Yuehong; Cao, Qinghua; Tao, Xiang; Shao, Huanhuan; Zhang, Kun; Zhang, Yizheng; Tan, Xuemei
2017-03-01
The white-rot basidiomycete Coriolopsis gallica HTC is one of the main biodegraders of poplar. In our previous study, we showed the strong capacity of C. gallica HTC to degrade lignocellulose. In this study, equal amounts of total RNA from C. gallica HTC cultures grown in different conditions were pooled together. Illumina paired-end RNA sequencing was performed, and 13.2 million 90-bp paired-end reads were generated. We chose the Merged Assembly of Oases (MAO) data-set for the following blast searches and gene ontology analyses. The reads were assembled de novo into 28,034 transcripts (≥100 bp) using the combined assembly strategy MAO. The transcripts were annotated using Blast2GO. In all, 18,810 transcripts (≥100 bp) achieved BLASTX hits, of which 7048 transcripts had GO terms and 2074 had EC numbers. The expression levels of 11 lignocellulolytic enzyme genes from the assembled C. gallica HTC transcriptome were measured by real-time quantitative polymerase chain reaction. The results showed that expression levels of these genes were affected by carbon source and nitrogen source at the level of transcription. The current abundant transcriptome data allowed the identification of many new transcripts in C. gallica HTC. Data provided here represent the most comprehensive and integrated genomic resources for cloning and identifying genes of interest from C. gallica HTC. Characterization of the C. gallica HTC transcriptome provides an effective tool to understand mechanisms underlying cellular and molecular functions of C. gallica HTC.
He, Lin; Jiang, Hui; Cao, Dandan; Liu, Lihua; Hu, Songnian; Wang, Qun
2013-01-01
The accessory sex gland (ASG) is an important component of the male reproductive system, which functions to enhance the fertility of spermatozoa during male reproduction. Certain proteins secreted by the ASG are known to bind to the spermatozoa membrane and affect its function. The ASG gene expression profile in Chinese mitten crab (Eriocheir sinensis) has not been extensively studied, and limited genetic research has been conducted on this species. The advent of high-throughput sequencing technologies enables the generation of genomic resources within a short period of time and at minimal cost. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for the ASG of E. sinensis using Illumina sequencing technology. This analysis yielded a total of 33,221,284 sequencing reads, including 2.6 Gb of total nucleotides. Reads were assembled into 85,913 contigs (average 218 bp), or 58,567 scaffold sequences (average 292 bp), that identified 37,955 unigenes (average 385 bp). We assembled all unigenes and compared them with the published testis transcriptome from E. sinensis. In order to identify which genes may be involved in ASG function, as it pertains to modification of spermatozoa, we compared the ASG and testis transcriptome of E. sinensis. Our analysis identified specific genes with both higher and lower tissue expression levels in the two tissues, and the functions of these genes were analyzed to elucidate their potential roles during maturation of spermatozoa. Availability of detailed transcriptome data from ASG and testis in E. sinensis can assist our understanding of the molecular mechanisms involved with spermatozoa conservation, transport, maturation and capacitation and potentially acrosome activation. PMID:23342039
Ma, Jun; Kanakala, S; He, Yehua; Zhang, Junli; Zhong, Xiaolan
2015-01-01
Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in its growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies. The Ananas comosus var. bracteatus transcriptome was sequenced by the Illumina paired-end sequencing technology. We obtained a total of 23.5 million high quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. Of the 41,052 unigenes of Ananas comosus var. bracteatus, 23,275 unigenes were annotated in the NCBI non-redundant protein database and 23,134 unigenes were annotated in the Swiss-Prot database. Out of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered to be unique to Ananas comosus var. bracteatus. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis. The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge, this is the first report on the de novo transcriptome sequencing of Ananas comosus var. bracteatus. Unigenes obtained in this study may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus.
Ma, Jun; Kanakala, S.; He, Yehua; Zhang, Junli; Zhong, Xiaolan
2015-01-01
Background Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in its growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies. Results The Ananas comosus var. bracteatus transcriptome was sequenced by the Illumina paired-end sequencing technology. We obtained a total of 23.5 million high quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. Of the 41,052 unigenes of Ananas comosus var. bracteatus, 23,275 unigenes were annotated in the NCBI non-redundant protein database and 23,134 unigenes were annotated in the Swiss-Prot database. Out of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered to be unique to Ananas comosus var. bracteatus. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis. Conclusion The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge, this is the first report on the de novo transcriptome sequencing of Ananas comosus var. bracteatus. Unigenes obtained in this study may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus. PMID:25769053
Xu, Hai-Ming; Kong, Xiang-Dong; Chen, Fei; Huang, Ji-Xiang; Lou, Xiang-Yang; Zhao, Jian-Yi
2015-10-24
Brassica napus is an important oilseed crop. Dissection of the genetic architecture underlying oil-related biological processes will greatly facilitate the genetic improvement of rapeseed. The differential gene expression during pod development offers a snapshot of the genes responsible for oil accumulation in. To identify candidate genes in the linkage peaks reported previously, we used RNA sequencing (RNA-Seq) technology to analyze the pod transcriptomes of German cultivar Sollux and Chinese inbred line Gaoyou. The RNA samples were collected for RNA-Seq at 5-7, 15-17 and 25-27 days after flowering (DAF). Bioinformatics analysis was performed to investigate differentially expressed genes (DEGs). Gene annotation analysis was integrated with QTL mapping and Brassica napus pod transcriptome profiling to detect potential candidate genes in oilseed. Four hundred sixty-five and two thousand one hundred fourteen candidate DEGs were identified, respectively, between the two varieties at the same stages and across different periods of each variety. Then, 33 DEGs between Sollux and Gaoyou were identified as the candidate genes affecting seed oil content by combining those DEGs with the quantitative trait locus (QTL) mapping results, of which one was found to be homologous to Arabidopsis thaliana lipid-related genes. Intervarietal DEGs of lipid pathways in QTL regions represent important candidate genes for oil-related traits. Integrated analysis of transcriptome profiling, QTL mapping and comparative genomics with related species leads to efficient identification of the most plausible functional genes underlying oil-content related characters, offering valuable resources for improving the breeding program of Brassica napus. This study provided a comprehensive overview of the pod transcriptomes of two varieties with different oil contents at the three developmental stages.
Transcriptomic immune response of Tenebrio molitor pupae to parasitization by Scleroderma guani.
Zhu, Jia-Ying; Yang, Pu; Zhang, Zhong; Wu, Guo-Xing; Yang, Bin
2013-01-01
Host and parasitoid interaction is one of the most fascinating relationships among insects and is currently receiving increasing interest. Understanding the mechanisms evolved by parasitoids to evade or suppress the host immune system is important for dissecting this interaction, yet they remain poorly known. In order to gain insight into the immune response of Tenebrio molitor to parasitization by Scleroderma guani, the transcriptome of T. molitor pupae was sequenced with a focus on immune-related genes, and non-parasitized and parasitized T. molitor pupae were analyzed by digital gene expression (DGE) analysis with special emphasis on parasitoid-induced immune-related genes using Illumina sequencing. In a single run, 264,698 raw reads were obtained. De novo assembly generated 71,514 unigenes with a mean length of 424 bp. Of those unigenes, 37,373 (52.26%) showed similarity to known proteins in the NCBI nr database. Through in-depth analysis of the transcriptome data, 430 unigenes related to immunity were identified. DGE analysis revealed that parasitization by S. guani had considerable impacts on the transcriptome profile of T. molitor pupae, as indicated by the significant up- or down-regulation of 3,431 parasitism-responsive transcripts. The expression of a total of 74 unigenes involved in the immune response of T. molitor was significantly altered after parasitization. The obtained T. molitor transcriptome, in addition to establishing a fundamental resource for further research on functional genomics, has allowed the discovery of a large group of immune genes that might provide a meaningful framework to better understand the immune response in this species and other beetles. The DGE profiling data provide comprehensive T. molitor immune gene expression information at the transcriptional level following parasitization, and shed valuable light on the molecular understanding of the host-parasitoid interaction.
Torre, Sara; Tattini, Massimiliano; Brunetti, Cecilia; Guidi, Lucia; Gori, Antonella; Marzano, Cristina; Landi, Marco; Sebastiani, Federico
2016-01-01
Sweet basil (Ocimum basilicum), one of the most popular cultivated herbs worldwide, displays a number of varieties differing in several characteristics, such as the color of the leaves. The development of a reference transcriptome for sweet basil, and the analysis of differentially expressed genes in acyanic and cyanic cultivars exposed to natural sunlight irradiance, is of interest from both horticultural and biological points of view. There is still great uncertainty about the significance of anthocyanins in photoprotection, and how green and red morphs may perform when exposed to photo-inhibitory light, a condition plants face on a daily and seasonal basis. We sequenced the leaf transcriptome of the green-leaved Tigullio (TIG) and the purple-leaved Red Rubin (RR) exposed to full sunlight over a four-week experimental period. We assembled and annotated 111,007 transcripts. A total of 5,468 and 5,969 potential SSRs were identified in TIG and RR, respectively, out of which 66 were polymorphic in silico. Comparative analysis of the two transcriptomes showed 2,372 differentially expressed genes (DEGs) clustered in 222 enriched Gene Ontology terms. Green and red basil mostly differed in the transcript abundance of genes involved in secondary metabolism. While the biosynthesis of waxes was up-regulated in red basil, the biosynthesis of flavonols and carotenoids was up-regulated in green basil. Data from our study provide a comprehensive transcriptome survey, gene sequence resources and microsatellites that can be used for further investigations in sweet basil. The analysis of DEGs and their functional classification also offers new insights on the functional role of anthocyanins in photoprotection. PMID:27483170
Zhao, Ying-Jun; Zeng, Yan; Chen, Lei; Dong, Yang; Wang, Wen
2014-12-01
As an ancient arthropod with a history of 390 million years, spiders evolved numerous morphological forms resulting from adaptation to different environments. The venom and silk of spiders, which have promising commercial applications in agriculture, medicine and engineering fields, are of special interests to researchers. However, little is known about their genomic components, which hinders not only understanding spider biology but also utilizing their valuable genes. Here we report on deep sequenced and de novo assembled transcriptomes of three orb-web spider species, Gasteracantha arcuata, Nasoonaria sinensis and Gasteracantha hasselti which are distributed in tropical forests of south China. With Illumina paired-end RNA-seq technology, 54 871, 101 855 and 75 455 unigenes for the three spider species were obtained, respectively, among which 9 300, 10 001 and 10 494 unique genes are annotated, respectively. From these annotated unigenes, we comprehensively analyzed silk and toxin gene components and structures for the three spider species. Our study provides valuable transcriptome data for three spider species which previously lacked any genetic/genomic data. The results have laid the first fundamental genomic basis for exploiting gene resources from these spiders. © 2013 Institute of Zoology, Chinese Academy of Sciences.
Wang, Yonglin; Xiong, Dianguang; Jiang, Ning; Li, Xuewu; Yang, Qiqing; Tian, Chengming
2016-01-01
Arceuthobium (dwarf mistletoes) are hemiparasites that may cause great damage to infected trees belonging to Pinaceae and Cupressaceae. Currently, dwarf mistletoe control involves the use of the ethylene-producing product ethephon (ETH), which acts by inducing dwarf mistletoe shoot abscission. However, the process by which ETH functions is mostly unknown. Therefore, the transcriptome of the ETH-exposed plants was compared to non-exposed controls to identify genes associated with the response to ethephon. In this study, the reference transcriptome contained 120,316 annotated unigenes, and a total of 21,764 ETH-responsive differentially expressed unigenes were identified. These ETH-associated genes clustered into 20 groups with distinct expression patterns, providing a view of molecular events with good spatial and temporal resolution. As expected, the greatest number of unigenes with changed expression was observed at the onset of abscission, suggesting induction by ethylene. ETH also affected genes associated with shoot abscission processes including hormone biosynthesis and signaling, cell wall hydrolysis and modification, lipid transference, and more. The comprehensive transcriptome data set provides a wealth of genomic resources for dwarf mistletoe communities and contributes to a better understanding of the molecular regulatory mechanism of ethylene-induced shoot abscission. PMID:27941945
A comprehensive analysis of the human placenta transcriptome
USDA-ARS's Scientific Manuscript database
As the conduit for nutrients and growth signals, the placenta is critical to establishing an environment sufficient for fetal growth and development. To better understand the mechanisms regulating placental development and gene expression, we characterized the transcriptome of term placenta from 20 ...
Picking Cell Lines for High-Throughput Transcriptomic Toxicity Screening (SOT)
High throughput, whole genome transcriptomic profiling is a promising approach to comprehensively evaluate chemicals for potential biological effects. To be useful for in vitro toxicity screening, gene expression must be quantified in a set of representative cell types that captu...
Deep Sequencing-Based Analysis of the Cymbidium ensifolium Floral Transcriptome
Li, Xiaobai; Luo, Jie; Yan, Tianlian; Xiang, Lin; Jin, Feng; Qin, Dehui; Sun, Chongbo; Xie, Ming
2013-01-01
Cymbidium ensifolium is a Chinese Cymbidium with an elegant shape, beautiful appearance, and a fragrant aroma. C. ensifolium has a long history of cultivation in China and it has excellent commercial value as a potted plant and cut flower. The development of C. ensifolium genomic resources has been delayed because of its large genome size. Taking advantage of technical and cost improvement of RNA-Seq, we extracted total mRNA from flower buds and mature flowers and obtained a total of 9.52 Gb of filtered nucleotides comprising 98,819,349 filtered reads. The filtered reads were assembled into 101,423 isotigs, representing 51,696 genes. Of the 101,423 isotigs, 41,873 were putative homologs of annotated sequences in the public databases, of which 158 were associated with floral development and 119 were associated with flowering. The isotigs were categorized according to their putative functions. In total, 10,212 of the isotigs were assigned into 25 eukaryotic orthologous groups (KOGs), 41,690 into 58 gene ontology (GO) terms, and 9,830 into 126 Arabidopsis Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and 9,539 isotigs into 123 rice pathways. Comparison of the isotigs with those of the two related orchid species P. equestris and C. sinense showed that 17,906 isotigs are unique to C. ensifolium. In addition, a total of 7,936 SSRs and 16,676 putative SNPs were identified. To our knowledge, this transcriptome database is the first major genomic resource for C. ensifolium and the most comprehensive transcriptomic resource for genus Cymbidium. These sequences provide valuable information for understanding the molecular mechanisms of floral development and flowering. Sequences predicted to be unique to C. ensifolium would provide more insights into C. ensifolium gene diversity. The numerous SNPs and SSRs identified in the present study will contribute to marker development for C. ensifolium. PMID:24392013
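Several of the abstracts in this listing report counts of simple sequence repeats (SSRs) without stating how they were detected. As a rough illustration only, the sketch below scans a nucleotide sequence for perfect tandem repeats with a regular expression. The function name `find_ssrs`, the motif length range (2 to 6 bp) and the minimum of four copies are assumptions chosen for the example, not the settings used in any of the studies.

```python
import re


def find_ssrs(seq, min_repeats=4, motif_range=(2, 6)):
    """Return (start, motif, copies) for perfect tandem repeats in seq.

    Hypothetical helper: thresholds are illustrative assumptions.
    """
    seq = seq.upper()
    shortest, longest = motif_range
    # Lazy motif match so the shortest repeating unit is preferred,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (shortest, longest, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip homopolymer runs that would be reported as e.g. "AA" repeats.
        if len(set(motif)) == 1:
            continue
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    print(find_ssrs("TTTAGAGAGAGAGCCC"))
```

Dedicated SSR tools typically also handle mononucleotide and compound repeats and apply motif-specific thresholds; this sketch covers only perfect repeats.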
Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.
Jayakodi, Murukarthick; Choi, Beom-Soon; Lee, Sang-Choon; Kim, Nam-Hoon; Park, Jee Young; Jang, Woojong; Lakshmanan, Meiyappan; Mohan, Shobhana V G; Lee, Dong-Yup; Yang, Tae-Jin
2018-04-12
The ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Having realized the importance of this plant to humans, an integrated omics resource becomes indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar "Chunpoong" were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database http://ginsengdb.snu.ac.kr/, the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (of approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. Ginseng genome database can be accessed at http://ginsengdb.snu.ac.kr/.
De novo transcriptome assembly in polyploid species
USDA-ARS's Scientific Manuscript database
In the absence of a reference genome, the ultimate goal of a de novo transcriptome assembly is to accurately and comprehensively reconstruct the set of messenger RNA transcripts represented in the sample. Non-reference assembly of the transcriptome of polyploid species poses a particular challenge b...
Chen, Xin; Zhang, Jin; Liu, Qingzhong; Guo, Wei; Zhao, Tiantian; Ma, Qinghua; Wang, Guixi
2014-01-01
The genus Corylus is an important woody species in Northeast China. Its products, hazelnuts, constitute one of the most important raw materials for the pastry and chocolate industry. However, limited genetic research has focused on Corylus because of the lack of genomic resources. The advent of high-throughput sequencing technologies provides a turning point for Corylus research. In the present study, we performed de novo transcriptome sequencing for the first time to produce a comprehensive database for the Corylus heterophylla Fisch floral buds. The C. heterophylla Fisch floral buds transcriptome was sequenced using the Illumina paired-end sequencing technology. We produced 28,930,890 raw reads and assembled them into 82,684 contigs. A total of 40,941 unigenes were identified, among which 30,549 were annotated in the NCBI Non-redundant (Nr) protein database and 18,581 were annotated in the Swiss-Prot database. Of these annotated unigenes, 25,311 and 10,514 unigenes were assigned to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. We could map 17,207 unigenes onto 128 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway (KEGG) database. Additionally, based on the transcriptome, we constructed a candidate cold tolerance gene set of C. heterophylla Fisch floral buds. The expression patterns of selected genes during four stages of cold acclimation suggested that these genes might be involved in different cold responsive stages in C. heterophylla Fisch floral buds. The transcriptome of C. heterophylla Fisch floral buds was deep sequenced, de novo assembled, and annotated, providing abundant data to better understand the C. heterophylla Fisch floral buds transcriptome. Candidate genes potentially involved in cold tolerance were identified, providing a material basis for future molecular mechanism analysis of C. heterophylla Fisch floral buds tolerant to cold stress.
Sputnik: a database platform for comparative plant genomics.
Rudd, Stephen; Mewes, Hans-Werner; Mayer, Klaus F X
2003-01-01
Two million plant ESTs, from 20 different plant species, and totalling more than 1000 Mbp of DNA sequence, represent a formidable transcriptomic resource. Sputnik uses the potential of this sequence resource to fill some of the information gap in the un-sequenced plant genomes and to serve as the foundation for in silico comparative plant genomics. The complexity of the individual EST collections has been reduced using optimised EST clustering techniques. Annotation of cluster sequences is performed by exploiting and transferring information from the comprehensive knowledgebase already produced for the completed model plant genome (Arabidopsis thaliana) and by performing additional state-of-the-art sequence analyses relevant to today's plant biologist. Functional predictions, comparative analyses and associative annotations for 500 000 plant EST derived peptides make Sputnik (http://mips.gsf.de/proj/sputnik/) a valid platform for contemporary plant genomics.
Sputnik: a database platform for comparative plant genomics
Rudd, Stephen; Mewes, Hans-Werner; Mayer, Klaus F.X.
2003-01-01
Two million plant ESTs, from 20 different plant species, and totalling more than 1000 Mbp of DNA sequence, represent a formidable transcriptomic resource. Sputnik uses the potential of this sequence resource to fill some of the information gap in the un-sequenced plant genomes and to serve as the foundation for in silico comparative plant genomics. The complexity of the individual EST collections has been reduced using optimised EST clustering techniques. Annotation of cluster sequences is performed by exploiting and transferring information from the comprehensive knowledgebase already produced for the completed model plant genome (Arabidopsis thaliana) and by performing additional state-of-the-art sequence analyses relevant to today's plant biologist. Functional predictions, comparative analyses and associative annotations for 500 000 plant EST derived peptides make Sputnik (http://mips.gsf.de/proj/sputnik/) a valid platform for contemporary plant genomics. PMID:12519965
The Whole-Genome and Transcriptome of the Manila Clam (Ruditapes philippinarum).
Mun, Seyoung; Kim, Yun-Ji; Markkandan, Kesavan; Shin, Wonseok; Oh, Sumin; Woo, Jiyoung; Yoo, Jongsu; An, Hyesuck; Han, Kyudong
2017-06-01
The manila clam, Ruditapes philippinarum, is an important bivalve species in worldwide aquaculture, including in Korea. The aquaculture production of R. philippinarum is under threat from diverse environmental factors including viruses, microorganisms, parasites, and water conditions, with subsequently declining production. In spite of its importance as a marine resource, the reference genome of R. philippinarum for comprehensive genetic studies is largely unexplored. Here, we report the de novo whole-genome and transcriptome assembly of R. philippinarum across three different tissues (foot, gill, and adductor muscle), and provide the basic data for advanced studies in selective breeding and disease control in order to obtain successful aquaculture systems. An approximately 2.56 Gb high quality whole-genome was assembled with various library construction methods. A total of 108,034 protein coding gene models were predicted, and repetitive elements including simple sequence repeats and noncoding RNAs were identified to further the understanding of the genetic background of R. philippinarum for genomics-assisted breeding. Comparative analysis with the bivalve marine invertebrates uncovered that the gene family related to complement C1q was enriched. Furthermore, we performed transcriptome analysis with three different tissues in order to support genome annotation and then identified 41,275 transcripts which were annotated. The R. philippinarum genome resource will markedly advance a wide range of potential genetic studies, serve as a reference genome for comparative analysis of bivalve species and help unravel mechanisms of biological processes in molluscs. We believe that the R. philippinarum genome will serve as an initial platform for breeding better-quality clams using a genomic approach. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Mykles, Donald L; Burnett, Karen G; Durica, David S; Joyce, Blake L; McCarthy, Fiona M; Schmidt, Carl J; Stillman, Jonathon H
2016-12-01
High-throughput RNA sequencing (RNA-seq) technology has become an important tool for studying physiological responses of organisms to changes in their environment. De novo assembly of RNA-seq data has allowed researchers to create a comprehensive catalog of genes expressed in a tissue and to quantify their expression without a complete genome sequence. The contributions from the "Tapping the Power of Crustacean Transcriptomics to Address Grand Challenges in Comparative Biology" symposium in this issue show the successes and limitations of using RNA-seq in the study of crustaceans. In conjunction with the symposium, the Animal Genome to Phenome Research Coordination Network collated comments from participants at the meeting regarding the challenges encountered when using transcriptomics in their research. Input came from novices and experts ranging from graduate students to principal investigators. Many were unaware of the bioinformatics analysis resources currently available on the CyVerse platform. Our analysis of community responses led to three recommendations for advancing the field: (1) integration of genomic and RNA-seq sequence assemblies for crustacean gene annotation and comparative expression; (2) development of methodologies for the functional analysis of genes; and (3) information and training exchange among laboratories for transmission of best practices. The field lacks the methods for manipulating tissue-specific gene expression. The decapod crustacean research community should consider the cherry shrimp, Neocaridina denticulata, as a decapod model for the application of transgenic tools for functional genomics. This would require a multi-investigator effort. © The Author 2016. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Kim, Chan-Hee; Go, Hye-Jin; Oh, Hye Young; Jo, Yong Hun; Elphick, Maurice R; Park, Nam Gyu
2018-02-01
Starfish (Phylum Echinodermata) are of interest from an evolutionary perspective because as deuterostomian invertebrates they occupy an "intermediate" phylogenetic position with respect to chordates (e.g. vertebrates) and protostomian invertebrates (e.g. Drosophila). Furthermore, starfish are model organisms for research on fertilization, embryonic development, innate immunity and tissue regeneration. However, large-scale molecular data for starfish tissues/organs are limited. To provide a comprehensive genetic resource for the starfish Patiria pectinifera, we report de novo transcriptome assemblies and global gene expression analysis for six P. pectinifera tissues/organs - body wall (BW), coelomic epithelium (CE), tube feet (TF), stomach (SM), pyloric caeca (PC) and gonad (GN). A total of 408 million high-quality reads obtained from six cDNA libraries were assembled de novo using Trinity, resulting in a total of 549,598 contigs with a mean length of 835 nucleotides (nt), an N50 of 1473nt, and GC ratio of 42.5%. A total of 126,136 contigs (22.9%) were obtained as predicted open reading frames (ORFs) by TransDecoder, of which 102,187 were annotated with NCBI non-redundant (NR) hits, and 51,075 and 10,963 were annotated with Gene Ontology (GO) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) using the Blast2GO program, respectively. Gene expression analysis revealed that tissues/organs are grouped into three clusters: BW/CE/TF, SM/PC, and GN, which likely reflect functional relationships. 2408, 8560, 2687, 1727, 3321, and 2667 specifically expressed genes were identified for BW, GN, PC, CE, SM and TF, respectively, using the ROKU method. This study provides a valuable transcriptome resource and novel molecular insights into the functional biology of different tissues/organs in starfish as a model organism. Copyright © 2017 Elsevier B.V. All rights reserved.
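The N50 reported above (1473 nt) is a standard summary of assembly contiguity: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled bases. The following is a minimal sketch of that calculation, assuming contig lengths are already available as a list of integers; the function name `n50` is illustrative and is not taken from Trinity or any other tool mentioned in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in nt)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk from the longest contig downwards, accumulating bases
    # until at least half of the assembly is covered.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy example, not data from the study.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 weights contigs by length, it is usually larger than the mean contig length, which is consistent with the mean of 835 nt and N50 of 1473 nt reported for this assembly.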
Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W.; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G.
2013-01-01
Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus. 
PMID:23671567
Developmental Transcriptome for a Facultatively Eusocial Bee, Megalopta genalis
Jones, Beryl M.; Wcislo, William T.; Robinson, Gene E.
2015-01-01
Transcriptomes provide excellent foundational resources for mechanistic and evolutionary analyses of complex traits. We present a developmental transcriptome for the facultatively eusocial bee Megalopta genalis, which represents a potential transition point in the evolution of eusociality. A de novo transcriptome assembly of Megalopta genalis was generated using paired-end Illumina sequencing and the Trinity assembler. Males and females of all life stages were aligned to this transcriptome for analysis of gene expression profiles throughout development. Gene Ontology analysis indicates that stage-specific genes are involved in ion transport, cell–cell signaling, and metabolism. A number of distinct biological processes are upregulated in each life stage, and transitions between life stages involve shifts in dominant functional processes, including shifts from transcriptional regulation in embryos to metabolism in larvae, and increased lipid metabolism in adults. We expect that this transcriptome will provide a useful resource for future analyses to better understand the molecular basis of the evolution of eusociality and, more generally, phenotypic plasticity. PMID:26276382
Wang, Xiao-Wei; Zhao, Qiong-Yi; Luan, Jun-Bo; Wang, Yu-Jun; Yan, Gen-Hong; Liu, Shu-Sheng
2012-10-04
Genomic divergence between invasive and native species may provide insight into the molecular basis underlying specific characteristics that drive the invasion and displacement of closely related species. In this study, we sequenced the transcriptome of an indigenous species, Asia II 3, of the Bemisia tabaci complex and compared its genetic divergence with the transcriptomes of two invasive whitefly species, Middle East Asia Minor 1 (MEAM1) and Mediterranean (MED). More than 16 million reads of 74 base pairs in length were obtained for the Asia II 3 species using the Illumina sequencing platform. These reads were assembled into 52,535 distinct sequences (mean size: 466 bp) and 16,596 sequences were annotated with an E-value cut-off of 10^-5. Protein family comparisons revealed obvious diversification among the transcriptomes of these species, suggesting species-specific adaptations during whitefly evolution. Conversely, substantial conservation of the whitefly transcriptomes was also evident, despite their differences. The overall divergence of coding sequences between the orthologous gene pairs of Asia II 3 and MEAM1 is 1.73%, which is comparable to the average divergence of Asia II 3 and MED transcriptomes (1.84%) and much higher than that of MEAM1 and MED (0.83%). This is consistent with the previous phylogenetic analyses and crossing experiments suggesting these are distinct species. We also identified hundreds of highly diverged genes, compiled sequence identity data into gene functional groups, and found that the most divergent gene classes are Cytochrome P450, Glutathione metabolism and Oxidative phosphorylation. These results strongly suggest that the divergence of genes related to metabolism might be the driving force of the MEAM1 and Asia II 3 differentiation. We also analyzed single nucleotide polymorphisms within the orthologous gene pairs of indigenous and invasive whiteflies, which are helpful for investigating associations between alleles and phenotypes. Our data present the most comprehensive sequences for the indigenous whitefly species Asia II 3. The extensive comparisons of Asia II 3, MEAM1 and MED transcriptomes will serve as an invaluable resource for revealing the genetic basis of whitefly invasion and the molecular mechanisms underlying their biological differences.
2012-01-01
Background Genomic divergence between invasive and native species may provide insight into the molecular basis underlying specific characteristics that drive the invasion and displacement of closely related species. In this study, we sequenced the transcriptome of an indigenous species, Asia II 3, of the Bemisia tabaci complex and compared its genetic divergence with the transcriptomes of two invasive whitefly species, Middle East Asia Minor 1 (MEAM1) and Mediterranean (MED). Results More than 16 million reads of 74 base pairs in length were obtained for the Asia II 3 species using the Illumina sequencing platform. These reads were assembled into 52,535 distinct sequences (mean size: 466 bp) and 16,596 sequences were annotated with an E-value cut-off of 10^-5. Protein family comparisons revealed obvious diversification among the transcriptomes of these species, suggesting species-specific adaptations during whitefly evolution. Conversely, substantial conservation of the whitefly transcriptomes was also evident, despite their differences. The overall divergence of coding sequences between the orthologous gene pairs of Asia II 3 and MEAM1 is 1.73%, which is comparable to the average divergence of Asia II 3 and MED transcriptomes (1.84%) and much higher than that of MEAM1 and MED (0.83%). This is consistent with the previous phylogenetic analyses and crossing experiments suggesting these are distinct species. We also identified hundreds of highly diverged genes, compiled sequence identity data into gene functional groups, and found that the most divergent gene classes are Cytochrome P450, Glutathione metabolism and Oxidative phosphorylation. These results strongly suggest that the divergence of genes related to metabolism might be the driving force of the MEAM1 and Asia II 3 differentiation. We also analyzed single nucleotide polymorphisms within the orthologous gene pairs of indigenous and invasive whiteflies, which are helpful for investigating associations between alleles and phenotypes. Conclusions Our data present the most comprehensive sequences for the indigenous whitefly species Asia II 3. The extensive comparisons of Asia II 3, MEAM1 and MED transcriptomes will serve as an invaluable resource for revealing the genetic basis of whitefly invasion and the molecular mechanisms underlying their biological differences. PMID:23036081
2013-01-01
Background Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration do not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes expressed during limb regeneration, using an E-value cut-off of 10^-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
Qu, Cheng; Fu, Ningning; Xu, Yihua
2016-01-01
The sycamore lace bug, Corythucha ciliata (Hemiptera: Tingidae), is an invasive forestry pest rapidly expanding in many countries. This pest poses a considerable threat to the urban forestry ecosystem, especially to Platanus spp. However, its molecular biology and biochemistry are poorly understood. This study reports the first C. ciliata transcriptome, encompassing three different life stages (nymphs, adult females (AF) and adult males (AM)). In total, 26.53 GB of clean data and 60,879 unigenes were obtained from three RNA-seq libraries. These unigenes were annotated and classified by Nr (NCBI non-redundant protein sequences), Nt (NCBI non-redundant nucleotide sequences), Pfam (Protein family), KOG/COG (Clusters of Orthologous Groups of proteins), Swiss-Prot (a manually annotated and reviewed protein sequence database), and KO (KEGG Ortholog database). All pairwise comparisons between these three samples revealed a large number of differentially expressed genes. Dramatic differences in global gene expression profiles were found between distinct life stages (nymphs and AF, nymphs and AM) and between sexes (AF and AM), with some of the significantly differentially expressed genes (DEGs) being related to metamorphosis, digestion, immunity and sex differences. The differential expression of unigenes was validated through quantitative Real-Time PCR (qRT-PCR) for 16 randomly selected unigenes. In addition, 17,462 potential simple sequence repeat molecular markers were identified in these transcriptome resources. This comprehensive C. ciliata transcriptomic information can be utilized to promote the development of environmentally friendly methodologies to disrupt the processes of metamorphosis, digestion, immunity and sex differences. PMID:27494615
Gao, Yuan; He, Xiaoli; Wu, Bin; Long, Qiliang; Shao, Tianwei; Wang, Zi; Wei, Jianhe; Li, Yong; Ding, Wanlong
2016-01-01
Panax ginseng C. A. Meyer is a highly valued medicinal plant. Cylindrocarpon destructans is a destructive pathogen that causes root rot and significantly reduces the quality and yield of P. ginseng. However, an efficient method to control root rot remains unavailable because of insufficient understanding of the molecular mechanism underlying the C. destructans-P. ginseng interaction. In this study, C. destructans-induced transcriptomes at different time points were investigated using RNA sequencing (RNA-Seq). De novo assembly produced 73,335 unigenes for the P. ginseng transcriptome after C. destructans infection, of which 3,839 unigenes were up-regulated. Notably, the abundance of the up-regulated unigenes sharply increased at 0.5 d postinoculation to provide effector-triggered immunity. In total, 24 of 26 randomly selected unigenes were validated using quantitative reverse transcription (qRT)-PCR. Gene ontology enrichment analysis of these unigenes showed that "defense response to fungus", "defense response" and "response to stress" were enriched. In addition, differentially expressed transcription factors involved in the hormone signaling pathways after C. destructans infection were identified. Finally, differentially expressed unigenes involved in reactive oxygen species and the ginsenoside biosynthetic pathway during C. destructans infection were identified. To our knowledge, this study is the first to report on the dynamic transcriptome triggered by C. destructans. These results improve our understanding of disease resistance in P. ginseng and provide a useful resource for quick detection of induced markers in P. ginseng before the comprehensive outbreak of the disease caused by C. destructans.
Gallardo-Escárate, Cristian; Valenzuela-Muñoz, Valentina; Nuñez-Acuña, Gustavo
2014-01-01
Despite the economic and environmental impacts that sea lice infestations have on salmon farming worldwide, genomic data generated by high-throughput transcriptome sequencing for different developmental stages, sexes, and strains of sea lice are still limited or unknown. In this study, RNA-seq analysis was performed using de novo transcriptome assembly as a reference to evidence transcriptional changes across six developmental stages of the salmon louse Caligus rogercresseyi. EST-datasets were generated from the nauplius I, nauplius II, copepodid and chalimus stages and from female and male adults using MiSeq Illumina sequencing. A total of 151,788,682 transcripts were obtained, which were assembled into 83,444 high quality contigs and subsequently annotated into roughly 24,000 genes based on known proteins. To identify differential transcription patterns among salmon louse stages, cluster analyses were performed using normalized gene expression values. Herein, four clusters were differentially expressed between nauplius I–II and copepodid stages (604 transcripts), five clusters between copepodid and chalimus stages (2,426 transcripts), and six clusters between female and male adults (2,478 transcripts). Gene ontology analysis revealed that the nauplius I–II, copepodid and chalimus stages are mainly annotated to amino acid transfer/repair/breakdown, metabolism, molting cycle, and nervous system development. Additionally, genes showing differential transcription in female and male adults were highly related to cytoskeletal and contractile elements, reproduction, cell development, morphogenesis, and transcription-translation processes. The data presented in this study provide the most comprehensive transcriptome resource available for C. rogercresseyi, which should be used for future genomic studies linked to host-parasite interactions. PMID:24691066
Yu, Ying; Zhao, Chen; Su, Zhenqiang; Wang, Charles; Fuscoe, James C; Tong, Weida; Shi, Leming
2014-01-01
The rat is used extensively by the pharmaceutical, regulatory, and academic communities for safety assessment of drugs and chemicals and for studying human diseases; however, its transcriptome has not been well studied. As part of the SEQC (i.e., MAQC-III) consortium efforts, a comprehensive RNA-Seq data set was constructed using 320 RNA samples isolated from 10 organs (adrenal gland, brain, heart, kidney, liver, lung, muscle, spleen, thymus, and testes or uterus) from both sexes of Fischer 344 rats across four ages (2-, 6-, 21-, and 104-week-old) with four biological replicates for each of the 80 sample groups (organ-sex-age). With the Ribo-Zero rRNA removal and Illumina RNA-Seq protocols, 41 million 50 bp single-end reads were generated per sample, yielding a total of 13.4 billion reads. This data set could be used to identify and validate new rat genes and transcripts, develop a more comprehensive rat transcriptome annotation system, identify novel gene regulatory networks related to tissue specific gene expression and development, and discover genes responsible for disease and drug toxicity and efficacy.
2013-01-01
Background Cymbidium sinense belongs to the Orchidaceae, which is one of the most abundant angiosperm families. C. sinense, a high-grade traditional potted flower, is most prevalent in China and some Southeast Asian countries. The control of flowering time is a major bottleneck in the industrialized development of C. sinense. Little is known about the mechanisms responsible for floral development in this orchid. Moreover, genome references for entire transcriptome sequences do not currently exist for C. sinense. Thus, transcriptome and expression profiling data for this species are needed as an important resource to identify genes and to better understand the biological mechanisms of floral development in C. sinense. Results In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptome analysis assembled gene-related information on vegetative and reproductive growth of C. sinense. Illumina sequencing generated 54,248,006 high quality reads that were assembled into 83,580 unigenes with an average sequence length of 612 base pairs, including 13,315 clusters and 70,265 singletons. A total of 41,687 (49.88%) unique sequences were annotated, 23,092 of which were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Furthermore, 120 flowering-associated unigenes, 73 MADS-box unigenes and 28 CONSTANS-LIKE (COL) unigenes were identified from our collection. In addition, three digital gene expression (DGE) libraries were constructed for the vegetative phase (VP), floral differentiation phase (FDP) and reproductive phase (RP). The specific expression of many genes in the three developmental phases was also identified. Thirty-two genes with high differential expression among the three sub-libraries were selected as candidates connected with flower development. Conclusion RNA-seq and DGE profiling data provided comprehensive gene expression information at the transcriptional level that could facilitate our understanding of the molecular mechanisms of floral development at three developmental phases of C. sinense. These data could be used as an important resource for investigating the genetics of the flowering pathway and various biological mechanisms in this orchid. PMID:23617896
Mareco, Edson A; Garcia de la Serrana, Daniel; Johnston, Ian A; Dal-Pai-Silva, Maeli
2015-03-14
The Pacu (Piaractus mesopotamicus) is a member of the Characiform family native to the Prata Basin (South America) and a target for the aquaculture industry. A limitation for the development of a selective breeding program for this species is a lack of available genetic information. The primary objectives of the present study were 1) to increase the genetic resources available for the species, 2) to exploit the anatomical separation of myotomal fibre types to compare the transcriptomes of slow and fast muscle phenotypes and 3) to systematically investigate the expression of Ubiquitin Specific Protease (USP) family members in fast and slow muscle in response to fasting and refeeding. We generated 0.6 Tb of paired-end reads from slow and fast skeletal muscle libraries. Over 665 million reads were assembled into 504,065 contigs with an average length of 1,334 bp and N50 = 2,772 bp. We successfully annotated nearly 47% of the transcriptome and identified around 15,000 unique genes and over 8,000 complete coding sequences. 319 KEGG metabolic pathways were also annotated and 380 putative microsatellites were identified. 956 and 604 genes were differentially expressed in slow and fast skeletal muscle, respectively. 442 paralogue pairs arising from the teleost-specific whole genome duplication were identified, with the majority showing different expression patterns between fibre types (301 in slow and 245 in fast skeletal muscle). 45 members of the USP family were identified in the transcriptome. Transcript levels were quantified by qPCR in a separate fasting and refeeding experiment. USP genes in fast muscle showed a similar transient increase in expression with fasting as the better characterized E3 ubiquitin ligases. We have generated a 53-fold coverage transcriptome for fast and slow myotomal muscle in the pacu (Piaractus mesopotamicus), significantly increasing the genetic resources available for this important aquaculture species. We describe significant differences in gene expression between muscle fibre types for fundamental components of general metabolism, the Pi3k/Akt/mTor network and myogenesis, including detailed analysis of paralogue expression. We also provide a comprehensive description of USP family member expression between muscle fibre types and with changing nutritional status.
Zhang, Jianxia; Wu, Kunlin; Zeng, Songjun; Teixeira da Silva, Jaime A; Zhao, Xiaolan; Tian, Chang-En; Xia, Haoqiang; Duan, Jun
2013-04-24
Cymbidium sinense belongs to the Orchidaceae, which is one of the most abundant angiosperm families. C. sinense, a high-grade traditional potted flower, is most prevalent in China and some Southeast Asian countries. The control of flowering time is a major bottleneck in the industrialized development of C. sinense. Little is known about the mechanisms responsible for floral development in this orchid. Moreover, genome references for entire transcriptome sequences do not currently exist for C. sinense. Thus, transcriptome and expression profiling data for this species are needed as an important resource to identify genes and to better understand the biological mechanisms of floral development in C. sinense. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptome analysis assembled gene-related information on vegetative and reproductive growth of C. sinense. Illumina sequencing generated 54,248,006 high quality reads that were assembled into 83,580 unigenes with an average sequence length of 612 base pairs, including 13,315 clusters and 70,265 singletons. A total of 41,687 (49.88%) unique sequences were annotated, 23,092 of which were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Furthermore, 120 flowering-associated unigenes, 73 MADS-box unigenes and 28 CONSTANS-LIKE (COL) unigenes were identified from our collection. In addition, three digital gene expression (DGE) libraries were constructed for the vegetative phase (VP), floral differentiation phase (FDP) and reproductive phase (RP). The specific expression of many genes in the three developmental phases was also identified. Thirty-two genes with high differential expression among the three sub-libraries were selected as candidates connected with flower development. RNA-seq and DGE profiling data provided comprehensive gene expression information at the transcriptional level that could facilitate our understanding of the molecular mechanisms of floral development at three developmental phases of C. sinense. These data could be used as an important resource for investigating the genetics of the flowering pathway and various biological mechanisms in this orchid.
PaintOmics 3: a web resource for the pathway analysis and visualization of multi-omics data.
Hernández-de-Diego, Rafael; Tarazona, Sonia; Martínez-Mira, Carlos; Balzano-Nogueira, Leandro; Furió-Tarí, Pedro; Pappas, Georgios J; Conesa, Ana
2018-05-25
The increasing availability of multi-omic platforms poses new challenges to data analysis. Joint visualization of multi-omics data is instrumental in better understanding interconnections across molecular layers and in fully utilizing the multi-omic resources available to make biological discoveries. We present here PaintOmics 3, a web-based resource for the integrated visualization of multiple omic data types onto KEGG pathway diagrams. PaintOmics 3 combines server-end capabilities for data analysis with the potential of modern web resources for data visualization, providing researchers with a powerful framework for interactive exploration of their multi-omics information. Unlike other visualization tools, PaintOmics 3 covers a comprehensive pathway analysis workflow, including automatic feature name/identifier conversion, multi-layered feature matching, pathway enrichment, network analysis, interactive heatmaps, trend charts, and more. It accepts a wide variety of omic types, including transcriptomics, proteomics and metabolomics, as well as region-based approaches such as ATAC-seq or ChIP-seq data. The tool is freely available at www.paintomics.org.
Ohyanagi, Hajime; Takano, Tomoyuki; Terashima, Shin; Kobayashi, Masaaki; Kanno, Maasa; Morimoto, Kyoko; Kanegae, Hiromi; Sasaki, Yohei; Saito, Misa; Asano, Satomi; Ozaki, Soichi; Kudo, Toru; Yokoyama, Koji; Aya, Koichiro; Suwabe, Keita; Suzuki, Go; Aoki, Koh; Kubo, Yasutaka; Watanabe, Masao; Matsuoka, Makoto; Yano, Kentaro
2015-01-01
Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center; http://bioinf.mind.meiji.ac.jp/podc/), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources. PMID:25505034
Murray, John Isaac
2018-05-01
The convergence of developmental biology and modern genomics tools brings the potential for a comprehensive understanding of developmental systems. This is especially true for the Caenorhabditis elegans embryo because its small size, invariant developmental lineage, and powerful genetic and genomic tools provide the prospect of a cellular resolution understanding of messenger RNA (mRNA) expression and regulation across the organism. We describe here how a systems biology framework might allow large-scale determination of the embryonic regulatory relationships encoded in the C. elegans genome. This framework consists of two broad steps: (a) defining the "parts list" (all genes expressed in all cells at each time during development) and (b) iterative steps of computational modeling and refinement of these models by experimental perturbation. Substantial progress has been made towards defining the parts list through imaging methods such as large-scale green fluorescent protein (GFP) reporter analysis. Imaging results are now being augmented by high-resolution transcriptome methods such as single-cell RNA sequencing, and it is likely the complete expression patterns of all genes across the embryo will be known within the next few years. In contrast, the modeling and perturbation experiments performed so far have focused largely on individual cell types or genes, and improved methods will be needed to expand them to the full genome and organism. This emerging comprehensive map of embryonic expression and regulatory function will provide a powerful resource for developmental biologists, and would also allow scientists to ask questions not accessible without a comprehensive picture. © 2018 Wiley Periodicals, Inc.
USDA-ARS's Scientific Manuscript database
The whitefly (Bemisia tabaci) causes tremendous damage to cotton production worldwide. However, very limited information is available about how plants perceive and defend themselves from this destructive pest. In this study, the transcriptomics differences between two cotton cultivars that exhibit e...
USDA-ARS's Scientific Manuscript database
Genomic and transcriptomic data on kiwifruit (Actinidia chinensis) in public databases are very limited despite its nutritional and economic value. Previously, we have constructed and sequenced nine fruit RNA-Seq libraries of A. chinensis cv. 'Hongyang' at immature, mature, and postharvest ripening...
USDA-ARS's Scientific Manuscript database
An essential step to understanding the genomic biology of any organism is to comprehensively survey its transcriptome. We present the Bovine Gene Atlas (BGA) a compendium of over 7.2 million unique 20 base Illumina DGE tags representing 100 tissue transcriptomes collected primarily from L1 Dominette...
Transcriptomic Dose-Response Analysis for Mode of Action ...
Microarray and RNA-seq technologies can play an important role in assessing the health risks associated with environmental exposures. The utility of gene expression data to predict hazard has been well documented. Early toxicogenomics studies used relatively high, single doses with minimal replication. Thus, they were not useful in understanding health risks at environmentally-relevant doses. Until the past decade, application of toxicogenomics in dose response assessment and determination of chemical mode of action has been limited. New transcriptomic biomarkers have evolved to detect chemical hazards in multiple tissues together with pathway methods to study biological effects across the full dose response range and critical time course. Comprehensive low dose datasets are now available and with the use of transcriptomic benchmark dose estimation techniques within a mode of action framework, the ability to incorporate informative genomic data into human health risk assessment has substantially improved. The key advantage to applying transcriptomic technology to risk assessment is both the sensitivity and comprehensive examination of direct and indirect molecular changes that lead to adverse outcomes. Book Chapter with topic on future application of toxicogenomics technologies for MoA and risk assessment
Zhang, Guoyun; Zhang, Tong; Liu, Juanjuan; Zhang, Jianguo; He, Caiyun
2018-06-20
Atmospheric carbon dioxide (CO2) concentration increases every year. It is critical to understand the molecular mechanisms of plant responses to elevated CO2 using genomic techniques. Hippophae rhamnoides L. is a highly stress-resistant plant species widely distributed in Europe and Asia. However, the molecular mechanism of the elevated CO2 response in H. rhamnoides remains poorly understood. In this study, transcriptomic analysis of two sea buckthorn cultivars under different CO2 concentrations was performed, based on the next-generation Illumina sequencing platform and de novo assembly. We identified 4740 differentially expressed genes in the sea buckthorn response to elevated CO2 concentrations. According to the gene ontology (GO) results, photosystem I, photosynthesis and chloroplast thylakoid membrane were the main enriched terms in 'xiangyang' sea buckthorn. In 'zhongguo' sea buckthorn, photosynthesis was also the main significantly enriched term. However, the number of photosynthesis-related differentially expressed genes differed between the two sea buckthorn cultivars. Our GO and pathway analyses indicated that the expression levels of the transcription factors WRKY, MYB and NAC were significantly different between the two sea buckthorn cultivars. This study provides a reliable transcriptome sequence resource that will be valuable for future genetic and genomic research on plants under high CO2 concentration. Copyright © 2018 Elsevier B.V. All rights reserved.
Qiao, Qin; Xue, Li; Wang, Qia; Sun, Hang; Zhong, Yang; Huang, Jinling; Lei, Jiajun; Zhang, Ticao
2016-01-01
Multiple closely related species with genomic sequences provide an ideal system for studies on comparative and evolutionary genomics, as well as the mechanism of speciation. The whole genome sequences of six strawberry species (Fragaria spp.) have been released, which provide one of the richest genomic resources of any plant genus. In this study, we first generated seven transcriptome sequences of Fragaria species de novo, with a total of 48,557-82,537 unigenes per species. Combined with 13 other species genomes in Rosales, we reconstructed a phylogenetic tree at the genomic level. The phylogenetic tree shows that Fragaria grouped closely with Rubus and that the Fragaria clade is divided into three subclades. East Asian species appeared in every subclade, suggesting that the genus originated in this area at ∼7.99 Mya. Four species found in the mountains of Southwest China originated at ∼3.98 Mya, suggesting that rapid speciation occurred to adapt to changing environments following the uplift of the Qinghai-Tibet Plateau. Moreover, we identified 510 very significantly positively selected genes in the genome of the cultivated species F. × ananassa. This set of genes was enriched in functions related to specific agronomic traits, such as carbon metabolism and plant hormone signal transduction processes, which are directly related to fruit quality and flavor. These findings illustrate comprehensive evolutionary patterns in Fragaria and the genetic basis of fruit domestication of cultivated strawberry at the genomic/transcriptomic level.
Transcript expression profiling for adventitious roots of Panax ginseng Meyer.
Subramaniyam, Sathiyamoorthy; Mathiyalagan, Ramya; Natarajan, Sathishkumar; Kim, Yu-Jin; Jang, Moon-Gi; Park, Jun-Hyung; Yang, Deok Chun
2014-08-01
Panax ginseng Meyer, which belongs to the Araliaceae family, is one of the major medicinal plants in oriental countries and is the primary source of ginsenosides. However, very few genes of the ginsenoside pathway have been characterized, owing to the limited genome information. In this study, we obtained a comprehensive transcriptome from adventitious roots that were treated with methyl jasmonic acid for different time points (control, 2h, 6h, 12h, and 24h) and sequenced using 454 pyrosequencing technology. A reference transcriptome of 39,304,529 bases (0.04GB) was obtained by de novo assembly from 5,724,987,880 bases (5.7GB) across 22 libraries, and 35,266 (58.5%) transcripts were annotated with biological schemas (GO and KEGG). Digital gene expression patterns were obtained from in vitro grown adventitious root sequences mapped to the reference; of these, 3813 (6.3%) unique transcripts showed ≥2-fold up- or downregulation. Finally, candidates for ginsenoside pathway genes were predicted from the observed expression patterns. Among them, 30 transcription factors, 20 cytochromes, and 11 glycosyl transferases were predicted as ginsenoside candidates. These data can remarkably expand the existing transcriptome resources of Panax, especially for predicting the existence of gene networks in P. ginseng. The dataset as a whole provides a valuable platform for revealing more about secondary metabolism and abiotic stresses in in vitro grown adventitious roots of P. ginseng. Copyright © 2014 Elsevier B.V. All rights reserved.
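The ≥2-fold criterion mentioned in the abstract above can be illustrated with a minimal Python sketch. The abstract does not describe how expression values were normalized or tested, so the function name, the pseudocount and the toy counts below are all hypothetical, and no statistical test is applied.

```python
from math import log2

def fold_change_calls(control, treated, min_fold=2.0, pseudocount=1.0):
    """Label transcripts whose expression changes by at least min_fold.

    control and treated map transcript IDs to normalized expression values.
    The pseudocount (an assumption, not from the abstract) avoids division
    by zero for transcripts that are absent in one condition.
    """
    threshold = log2(min_fold)
    calls = {}
    for transcript, ctrl_value in control.items():
        ratio = (treated.get(transcript, 0.0) + pseudocount) / (ctrl_value + pseudocount)
        lfc = log2(ratio)
        if lfc >= threshold:
            calls[transcript] = "up"
        elif lfc <= -threshold:
            calls[transcript] = "down"
    return calls

# Hypothetical toy values, not data from the study.
toy_control = {"t1": 9.0, "t2": 19.0, "t3": 10.0}
toy_treated = {"t1": 39.0, "t2": 4.0, "t3": 12.0}
calls = fold_change_calls(toy_control, toy_treated)
print(calls)
```

In practice a fold-change filter like this is usually combined with a significance test; the sketch only shows the thresholding step.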
Li, Qun; Ge, Fanglan; Tan, Yunya; Zhang, Guangxiang; Li, Wei
2016-01-01
Mycobacterium smegmatis strain MC2 155 is an attractive model organism for the study of M. tuberculosis and other mycobacterial pathogens, as it can grow well using cholesterol as a carbon source. However, its global transcriptomic response remains largely unrevealed. In this study, M. smegmatis MC2 155 cultures grown in androstenedione-, cholesterol- and glycerol-supplemented media were collected separately for an RNA-sequencing study. The results showed that 6004, 6681 and 6348 genes were expressed in androstenedione-, cholesterol- and glycerol-supplemented media, respectively, and 5891 genes were expressed in all three conditions, with 237 specifically expressed in cholesterol-supplemented medium. A total of 1852 and 454 genes were significantly up-regulated by cholesterol compared with the other two supplements. Only occasional changes were observed in basic carbon and nitrogen metabolism, while almost all of the genes involved in cholesterol catabolism and mammalian cell entry (MCE) were up-regulated by cholesterol, but not by androstenedione. Eleven and 16 gene clusters were induced by cholesterol when compared with glycerol or androstenedione, respectively. This study provides a comprehensive analysis of the cholesterol-responsive transcriptome of M. smegmatis. Our results indicated that cholesterol induced many more genes and increased the expression of the majority of genes involved in cholesterol degradation and MCE in M. smegmatis, while androstenedione did not have the same effect. PMID:27164097
An Atlas of annotations of Hydra vulgaris transcriptome.
Evangelista, Daniela; Tripathi, Kumar Parijat; Guarracino, Mario Rosario
2016-09-22
RNA sequencing takes advantage of Next Generation Sequencing (NGS) technologies to analyze RNA transcript counts with excellent accuracy. Interpreting this huge amount of data as biological information is still a key issue, which is why the creation of web resources useful for its analysis is highly desirable. Building on a previous work, Transcriptator, we present the Atlas of Hydra vulgaris, an extensible web tool in which the complete transcriptome is annotated. To provide users with a resource that includes the whole functionally annotated transcriptome of the Hydra vulgaris water polyp, we implemented the Atlas web tool, which contains 31,988 accessible and downloadable transcripts of this non-reference model organism. Atlas, as a freely available resource, can be considered a valuable tool for rapidly retrieving functional annotation for transcripts differentially expressed in Hydra vulgaris exposed to distinct experimental treatments. WEB RESOURCE URL: http://www-labgtp.na.icar.cnr.it/Atlas .
Baird, Fiona J; Su, Xiaopei; Aibinu, Ibukun; Nolan, Matthew J; Sugiyama, Hiromu; Otranto, Domenico; Lopata, Andreas L; Cantacessi, Cinzia
2016-07-01
Food-borne nematodes of the genus Anisakis are responsible for a wide range of illnesses (= anisakiasis), from self-limiting gastrointestinal forms to severe systemic allergic reactions, which are often misdiagnosed and under-reported. In order to enhance and refine current diagnostic tools for anisakiasis, knowledge of the whole spectrum of parasite molecules transcribed and expressed by this parasite, including those acting as potential allergens, is necessary. In this study, we employ high-throughput (Illumina) sequencing and bioinformatics to characterise the transcriptomes of two Anisakis species, A. simplex and A. pegreffii, and utilize this resource to compile lists of potential allergens from these parasites. A total of ~65,000,000 reads were generated from cDNA libraries for each species, and assembled into ~34,000 transcripts (= Unigenes); ~18,000 peptides were predicted from each cDNA library and classified based on homology searches, protein motifs and gene ontology and biological pathway mapping. Using comparative analyses with sequence data available in public databases, 36 (A. simplex) and 29 (A. pegreffii) putative allergens were identified, including sequences encoding 'novel' Anisakis allergenic proteins (i.e. cyclophilins and ABA-1 domain containing proteins). This study represents a first step towards providing the research community with a curated dataset to use as a molecular resource for future investigations of the biology of Anisakis, including molecules putatively acting as allergens, using functional genomics, proteomics and immunological tools. Ultimately, an improved knowledge of the biological functions of these molecules in the parasite, as well as of their immunogenic properties, will assist the development of comprehensive, reliable and robust diagnostic tools.
USDA-ARS's Scientific Manuscript database
In a collaboration with National Center for Genome Resources and University of Texas at El Paso researchers, we sequenced and assembled the transcriptome of the synganglion of the Texas strain (Deutsch) of the cattle tick Rhipicephalus microplus. This transcriptome contains 43,468 sequences and wa...
Transcriptome and proteomic analysis of mango (Mangifera indica Linn) fruits.
Wu, Hong-xia; Jia, Hui-min; Ma, Xiao-wei; Wang, Song-biao; Yao, Quan-sheng; Xu, Wen-tian; Zhou, Yi-gang; Gao, Zhong-shan; Zhan, Ru-lin
2014-06-13
Here we used Illumina RNA-seq technology for transcriptome sequencing of a mixed fruit sample from 'Zill' mango (Mangifera indica Linn) fruit pericarp and pulp during the development and ripening stages. RNA-seq generated 68,419,722 sequence reads that were assembled into 54,207 transcripts with a mean length of 858 bp, including 26,413 clusters and 27,794 singletons. A total of 42,515 (78.43%) transcripts were annotated using public protein databases, with an E-value cut-off of 10^-5, of which 35,198 and 14,619 transcripts were assigned to gene ontology terms and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes database identified 23,741 (43.79%) transcripts, which were mapped to 128 pathways. These pathways revealed many previously unknown transcripts. We also used the transcriptome data in a mass spectrometry-based characterization of the proteome of ripe fruit. LC-MS/MS analysis of the mango fruit proteome was performed using tandem mass spectrometry (MS/MS) in an LTQ Orbitrap Velos (Thermo) coupled online to the HPLC. This approach enabled the identification of 7536 peptides that matched 2754 proteins. Our study provides a comprehensive sequence for a systemic view of the transcriptome during mango fruit development and the most comprehensive fruit proteome to date, which are useful for further genomics research and proteomic studies. Our study provides a comprehensive sequence for a systemic view of both the transcriptome and proteome of mango fruit, and a valuable reference for further research on gene expression and protein identification. This article is part of a Special Issue entitled: Proteomics of non-model organisms. Copyright © 2014 Elsevier B.V. All rights reserved.
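An E-value cut-off such as the 10^-5 reported above keeps only homology hits at or below the threshold. The following Python sketch illustrates that filtering step under the assumption that hits are available in the 12-column tab-separated layout produced by BLAST+ with -outfmt 6 (query ID in column 1, subject ID in column 2, E-value in column 11). The abstract does not name the search tool, so the input format, the function name and the toy hits are hypothetical.

```python
def annotated_queries(tabular_lines, max_evalue=1e-5):
    """Return the best hit per query among hits with E-value <= max_evalue.

    Each line is assumed to follow the 12-column BLAST tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue  # hit is weaker than the cut-off
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best

# Hypothetical toy hits, not data from the study.
toy_hits = [
    "t1\tP1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t200",
    "t1\tP2\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-10\t120",
    "t2\tP3\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t30",
]
best_hits = annotated_queries(toy_hits)
total_transcripts = 2  # t1 and t2 in this toy set
annotated_fraction = len(best_hits) / total_transcripts
print(best_hits, annotated_fraction)
```

Dividing the number of queries with a retained hit by the total number of transcripts gives an annotation rate of the kind quoted in the abstract (42,515 of 54,207, or 78.43%).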
Croucher, Peter J P; Brewer, Michael S; Winchell, Christopher J; Oxford, Geoff S; Gillespie, Rosemary G
2013-12-08
A number of spider species within the family Theridiidae exhibit a dramatic abdominal (opisthosomal) color polymorphism. The polymorphism is inherited in a broadly Mendelian fashion and in some species consists of dozens of discrete morphs that are convergent across taxa and populations. Few genomic resources exist for spiders. Here, as a first necessary step towards identifying the genetic basis for this trait we present the near complete transcriptomes of two species: the Hawaiian happy-face spider Theridion grallator and Theridion californicum. We mined the gene complement for pigment-pathway genes and examined differential expression (DE) between morphs that are unpatterned (plain yellow) and patterned (yellow with superimposed patches of red, white or very dark brown). By deep sequencing both RNA-seq and normalized cDNA libraries from pooled specimens of each species we were able to assemble a comprehensive gene set for both species that we estimate to be 98-99% complete. It is likely that these species express more than 20,000 protein-coding genes, perhaps 4.5% (ca. 870) of which might be unique to spiders. Mining for pigment-associated Drosophila melanogaster genes indicated the presence of all ommochrome pathway genes and most pteridine pathway genes and DE analyses further indicate a possible role for the pteridine pathway in theridiid color patterning. Based upon our estimates, T. grallator and T. californicum express a large inventory of protein-coding genes. Our comprehensive assembly illustrates the continuing value of sequencing normalized cDNA libraries in addition to RNA-seq in order to generate a reference transcriptome for non-model species. The identification of pteridine-related genes and their possible involvement in color patterning is a novel finding in spiders and one that suggests a biochemical link between guanine deposits and the pigments exhibited by these species.
Rupwate, Sunny D.; Rajasekharan, Ram; Srinivasan, Malathi
2015-01-01
Chia (Salvia hispanica L.), a member of the mint family (Lamiaceae), is a rediscovered crop with great importance in health and nutrition and is also the highest known terrestrial plant source of heart-healthy omega-3 fatty acid, alpha linolenic acid (ALA). At present, there is no public genomic information or database available for this crop, hindering research on its genetic improvement through genomics-assisted breeding programs. The first comprehensive analysis of the global transcriptome profile of developing Salvia hispanica L. seeds, with special reference to lipid biosynthesis is presented in this study. RNA from five different stages of seed development was extracted and sequenced separately using the Illumina GAIIx platform. De novo assembly of processed reads in the pooled transcriptome using Trinity yielded 76,014 transcripts. The total transcript length was 66,944,462 bases (66.9 Mb), with an average length of approximately 880 bases. In the molecular functions category of Gene Ontology (GO) terms, ATP binding and nucleotide binding were found to be the most abundant and in the biological processes category, the metabolic process and the regulation of transcription-DNA-dependent and oxidation-reduction process were abundant. From the EuKaryotic Orthologous Groups of proteins (KOG) classification, the major category was “Metabolism” (31.97%), of which the most prominent class was ‘carbohydrate metabolism and transport’ (5.81% of total KOG classifications) followed by ‘secondary metabolite biosynthesis transport and catabolism’ (5.34%) and ‘lipid metabolism’ (4.57%). A majority of the candidate genes involved in lipid biosynthesis and oil accumulation were identified. Furthermore, 5596 simple sequence repeats (SSRs) were identified. The transcriptome data was further validated through confirmative PCR and qRT-PCR for select lipid genes. 
Our study provides insight into the complex transcriptome and will contribute to further genome-wide research and understanding of chia. The identified novel UniGenes will facilitate gene discovery and creation of genomic resource for this crop. PMID:25875809
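The abstract above reports 5596 simple sequence repeats (SSRs) but does not say how they were detected. As a rough illustration of what SSR detection involves, the Python sketch below uses a regular expression with a backreference to find motifs of 2 to 6 bases repeated in tandem. The motif length range, the minimum repeat count and the function name are assumptions made for this example, not the study's settings.

```python
import re

# A motif of 2-6 bases followed by at least 4 more copies of itself,
# i.e. 5 or more tandem repeats in total (thresholds are hypothetical).
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")

def find_ssrs(sequence):
    """Return (motif, repeat_count, start) for each tandem repeat found.

    Simplification: homopolymer runs are reported as repeats of a
    2-base motif such as "AA"; dedicated SSR tools handle these separately.
    """
    results = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        results.append((motif, repeat_count, match.start()))
    return results

# Hypothetical toy sequence containing (AT)6 starting at position 2.
toy_ssrs = find_ssrs("GGATATATATATATCC")
print(toy_ssrs)
```

The lazy quantifier makes the search prefer the shortest motif, so (AT)6 is reported as six copies of AT rather than three copies of ATAT.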
Sedeek, Khalid E M; Qi, Weihong; Schauer, Monica A; Gupta, Alok K; Poveda, Lucy; Xu, Shuqing; Liu, Zhong-Jian; Grossniklaus, Ueli; Schiestl, Florian P; Schlüter, Philipp M
2013-01-01
Sexually deceptive orchids of the genus Ophrys mimic the mating signals of their pollinator females to attract males as pollinators. This mode of pollination is highly specific and leads to strong reproductive isolation between species. This study aims to identify candidate genes responsible for pollinator attraction and reproductive isolation between three closely related species, O. exaltata, O. sphegodes and O. garganica. Floral traits such as odour, colour and morphology are necessary for successful pollinator attraction. In particular, different odour hydrocarbon profiles have been linked to differences in specific pollinator attraction among these species. Therefore, the identification of genes involved in these traits is important for understanding the molecular basis of pollinator attraction by sexually deceptive orchids. We have created floral reference transcriptomes and proteomes for these three Ophrys species using a combination of next-generation sequencing (454 and Solexa), Sanger sequencing, and shotgun proteomics (tandem mass spectrometry). In total, 121 917 unique transcripts and 3531 proteins were identified. This represents the first orchid proteome and transcriptome from the orchid subfamily Orchidoideae. Proteome data revealed proteins corresponding to 2644 transcripts and 887 proteins not observed in the transcriptome. Candidate genes for hydrocarbon and anthocyanin biosynthesis were represented by 156 and 61 unique transcripts in 20 and 7 gene classes, respectively. Moreover, transcription factors putatively involved in the regulation of flower odour, colour and morphology were annotated, including Myb, MADS and TCP factors. Our comprehensive data set generated by combining transcriptome and proteome technologies allowed identification of candidate genes for pollinator attraction and reproductive isolation among sexually deceptive orchids. 
This includes genes for hydrocarbon and anthocyanin biosynthesis and regulation, and the development of floral morphology. These data will serve as an invaluable resource for research in orchid floral biology, enabling studies into the molecular mechanisms of pollinator attraction and speciation.
Sedeek, Khalid E. M.; Qi, Weihong; Schauer, Monica A.; Gupta, Alok K.; Poveda, Lucy; Xu, Shuqing; Liu, Zhong-Jian; Grossniklaus, Ueli; Schiestl, Florian P.; Schlüter, Philipp M.
2013-01-01
Background: Sexually deceptive orchids of the genus Ophrys mimic the mating signals of their pollinator females to attract males as pollinators. This mode of pollination is highly specific and leads to strong reproductive isolation between species. This study aims to identify candidate genes responsible for pollinator attraction and reproductive isolation between three closely related species, O. exaltata, O. sphegodes and O. garganica. Floral traits such as odour, colour and morphology are necessary for successful pollinator attraction. In particular, different odour hydrocarbon profiles have been linked to differences in specific pollinator attraction among these species. Therefore, the identification of genes involved in these traits is important for understanding the molecular basis of pollinator attraction by sexually deceptive orchids. Results: We have created floral reference transcriptomes and proteomes for these three Ophrys species using a combination of next-generation sequencing (454 and Solexa), Sanger sequencing, and shotgun proteomics (tandem mass spectrometry). In total, 121 917 unique transcripts and 3531 proteins were identified. This represents the first orchid proteome and transcriptome from the orchid subfamily Orchidoideae. Proteome data revealed proteins corresponding to 2644 transcripts and 887 proteins not observed in the transcriptome. Candidate genes for hydrocarbon and anthocyanin biosynthesis were represented by 156 and 61 unique transcripts in 20 and 7 gene classes, respectively. Moreover, transcription factors putatively involved in the regulation of flower odour, colour and morphology were annotated, including Myb, MADS and TCP factors. Conclusion: Our comprehensive data set generated by combining transcriptome and proteome technologies allowed identification of candidate genes for pollinator attraction and reproductive isolation among sexually deceptive orchids. 
This includes genes for hydrocarbon and anthocyanin biosynthesis and regulation, and the development of floral morphology. These data will serve as an invaluable resource for research in orchid floral biology, enabling studies into the molecular mechanisms of pollinator attraction and speciation. PMID:23734209
Transcriptomic Immune Response of Tenebrio molitor Pupae to Parasitization by Scleroderma guani
Zhu, Jia-Ying; Yang, Pu; Zhang, Zhong; Wu, Guo-Xing; Yang, Bin
2013-01-01
Background: Host and parasitoid interaction is one of the most fascinating relationships among insects and is currently receiving increasing interest. Understanding the mechanisms evolved by parasitoids to evade or suppress the host immune system is important for dissecting this interaction, but these mechanisms are still poorly known. In order to gain insight into the immune response of Tenebrio molitor to parasitization by Scleroderma guani, the transcriptome of T. molitor pupae was sequenced with a focus on immune-related genes, and non-parasitized and parasitized T. molitor pupae were analyzed by digital gene expression (DGE) analysis, with special emphasis on parasitoid-induced immune-related genes, using Illumina sequencing. Methodology/Principal Findings: In a single run, 264,698 raw reads were obtained. De novo assembly generated 71,514 unigenes with a mean length of 424 bp. Of those unigenes, 37,373 (52.26%) showed similarity to known proteins in the NCBI nr database. Through in-depth analysis of the transcriptome data, 430 unigenes related to immunity were identified. DGE analysis revealed that parasitization by S. guani had considerable impacts on the transcriptome profile of T. molitor pupae, as indicated by the significant up- or down-regulation of 3,431 parasitism-responsive transcripts. The expression of a total of 74 unigenes involved in the immune response of T. molitor was significantly altered after parasitization. Conclusions/Significance: The obtained T. molitor transcriptome, in addition to establishing a fundamental resource for further research on functional genomics, has allowed the discovery of a large group of immune genes that might provide a meaningful framework to better understand the immune response in this species and other beetles. The DGE profiling data provide comprehensive T. molitor immune gene expression information at the transcriptional level following parasitization, and shed valuable light on the molecular understanding of the host-parasitoid interaction. PMID:23342153
A transcriptome resource for the Antarctic pteropod Limacina helicina antarctica.
Johnson, Kevin M; Hofmann, Gretchen E
2016-08-01
The pteropod Limacina helicina antarctica is a dominant member of the zooplankton assemblage in the Antarctic marine ecosystem, and is part of a relatively simple food web in nearshore marine Antarctic waters. As a shelled pteropod, Limacina has been suggested as a candidate sentinel organism for the impacts of ocean acidification, due to the potential for shell dissolution in undersaturated waters. In this study, our goal was to develop a transcriptomic resource for Limacina that would support mechanistic studies to explore the physiological response of Limacina to abiotic stressors such as ocean acidification and ocean warming. To this end, RNA sequencing libraries were prepared from Limacina that had been exposed to a range of pH levels and an elevated temperature to maximize the diversity of expressed genes. RNA sequencing (RNA-seq) was conducted on an Illumina NextSeq500 which produced 339,000,000 150bp paired-end reads. The de novo transcriptome was produced using Trinity and annotation of the assembled transcriptome resulted in the identification of 81,229 transcripts in 137 KEGG pathways. This RNA-seq effort resulted in a transcriptome for the Antarctic pteropod, Limacina helicina antarctica, that is a major resource for an international marine science research community studying these pelagic molluscs in a global change context. Copyright © 2016 Elsevier B.V. All rights reserved.
Chen, L; Luo, J; Li, J X; Li, J J; Wang, D Q; Tian, Y; Lu, L Z
2015-06-01
Excessive adiposity is a major problem in the duck industry, but its molecular mechanisms remain unknown. Genetic comparisons between domestic and wild animals have contributed to the exploration of genetic mechanisms responsible for many phenotypic traits. Significant differences in body fat mass have been detected between domestic and wild ducks. In this study, we used the Peking duck and Anas platyrhynchos as the domestic breed and wild counterpart, respectively, and performed a transcriptomic comparison of abdominal fat between the two breeds to comprehensively analyze the transcriptomic basis of adiposity in ducks. We obtained approximately 350 million clean reads; assembled 61 250 transcripts, including 23 699 novel ones; and identified alternative 5' splice sites, alternative 3' splice sites, skipped exons and retained introns as the main alternative splicing events. A differential expression analysis between the two breeds showed that 753 genes exhibited differential expression. In Peking ducks, some lipid metabolism-related genes (IGF2, FABP5, BMP7, etc.) and oncogenes (RRM2, AURKA, CYR61, etc.) were upregulated, whereas genes related to tumor suppression and immunity (TNFRSF19, TNFAIP6, IGSF21, NCF1, etc.) were downregulated, suggesting that adiposity might be closely associated with tumorigenesis in ducks. Furthermore, 280 576 single-nucleotide variations were found to differ between the two breeds, including 8641 non-synonymous ones, and some of the non-synonymous variations were enriched in genes involved in lipid-associated and immune-associated pathways, suggesting that the abdominal fat of the duck performs both metabolic and immune-related functions. These datasets expand the genetic information available for ducks and provide valuable resources for analyzing the mechanisms underlying adiposity in ducks. © 2015 Stichting International Foundation for Animal Genetics.
Comparative transcriptome analysis of microsclerotia development in Nomuraea rileyi.
Song, Zhangyong; Yin, Youping; Jiang, Shasha; Liu, Juanjuan; Chen, Huan; Wang, Zhongkang
2013-06-19
Nomuraea rileyi is used as an environmentally friendly biopesticide. However, mass production and commercialization of this organism are limited due to its fastidious growth and sporulation requirements. When cultured in amended medium, we found that N. rileyi could produce microsclerotia bodies, replacing conidiophores as the infectious agent. However, little is known about the genes involved in microsclerotia development. In the present study, the transcriptomes were analyzed using next-generation sequencing technology to find the genes involved in microsclerotia development. A total of 4.69 Gb of clean nucleotides comprising 32,061 sequences was obtained, and 20,919 sequences were annotated (about 65%). Among the annotated sequences, only 5928 were annotated with 34 gene ontology (GO) functional categories, and 12,778 sequences were mapped to 165 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Furthermore, we assessed the transcriptomic differences between cultures grown in minimal and amended medium. In total, 4808 sequences were found to be differentially expressed; 719 differentially expressed unigenes were assigned to 25 GO classes and 1888 differentially expressed unigenes were assigned to 161 KEGG pathways, including 25 enriched pathways. Subsequently, we examined the genes that were up-regulated or uniquely expressed following amended medium treatment and that also belonged to the enriched pathways, and found that most of them participated in mediating oxidative stress homeostasis. To elucidate the role of oxidative stress in microsclerotia development, we analyzed the diversification of unigenes using quantitative reverse transcription-PCR (RT-qPCR). Our findings suggest that oxidative stress occurs during microsclerotia development, along with a broad metabolic activity change. Our data provide the most comprehensive sequence resource available for the study of N. rileyi. 
We believe that the transcriptome datasets will serve as an important public information platform to accelerate studies on N. rileyi microsclerotia.
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2011-01-01
Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
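To make the idea of a correlation-based co-expression network concrete, here is a minimal Python sketch. The study above used a gene-to-gene weighted correlation coefficient; this example swaps in a plain Pearson correlation with a fixed threshold, and the gene names, expression values and threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Plain (unweighted) Pearson correlation of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)

def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with r >= threshold."""
    edges = []
    for gene1, gene2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene1], profiles[gene2])
        if r >= threshold:
            edges.append((gene1, gene2, round(r, 3)))
    return edges

# Hypothetical expression profiles across four samples.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 1.0, 3.5, 1.2],
}
edges = coexpression_edges(toy_profiles)
print(edges)
```

The resulting edge list can then be clustered into modules; the all-pairs loop is quadratic in the number of genes, so a genome-scale data set would normally use a vectorized implementation instead.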
Transcriptome profiling of the dynamic life cycle of the scyphozoan jellyfish Aurelia aurita.
Brekhman, Vera; Malik, Assaf; Haas, Brian; Sher, Noa; Lotan, Tamar
2015-02-14
The moon jellyfish Aurelia aurita is a widespread scyphozoan species that forms large seasonal blooms. Here we provide the first comprehensive view of the entire complex life of the Aurelia Red Sea strain by employing transcriptomic profiling of each stage from planula to mature medusa. A de novo transcriptome was assembled from Illumina RNA-Seq data generated from six stages throughout the Aurelia life cycle. Transcript expression profiling yielded clusters of annotated transcripts with functions related to each specific life-cycle stage. Free-swimming planulae were found highly enriched for functions related to cilia and microtubules, and the drastic morphogenetic process undergone by the planula while establishing the future body of the polyp may be mediated by specifically expressed Wnt ligands. Specific transcripts related to sensory functions were found in the strobila and the ephyra, whereas extracellular matrix functions were enriched in the medusa due to high expression of transcripts such as collagen, fibrillin and laminin, presumably involved in mesoglea development. The CL390-like gene, suggested to act as a strobilation hormone, was also highly expressed in the advanced strobila of the Red Sea species, and in the medusa stage we identified betaine-homocysteine methyltransferase, an enzyme that may play an important part in maintaining equilibrium of the medusa's bell. Finally, we identified the transcription factors participating in the Aurelia life-cycle and found that 70% of these 487 identified transcription factors were expressed in a developmental-stage-specific manner. This study provides the first scyphozoan transcriptome covering the entire developmental trajectory of the life cycle of Aurelia. It highlights the importance of numerous stage-specific transcription factors in driving morphological and functional changes throughout this complex metamorphosis, and is expected to be a valuable resource to the community.
De Novo Assembly and Characterization of Four Anthozoan (Phylum Cnidaria) Transcriptomes.
Kitchen, Sheila A; Crowder, Camerron M; Poole, Angela Z; Weis, Virginia M; Meyer, Eli
2015-09-17
Many nonmodel species exemplify important biological questions but lack the sequence resources required to study the genes and genomic regions underlying traits of interest. Reef-building corals are famously sensitive to rising seawater temperatures, motivating ongoing research into their stress responses and long-term prospects in a changing climate. A comprehensive understanding of these processes will require extending beyond the sequenced coral genome (Acropora digitifera) to encompass diverse coral species and related anthozoans. Toward that end, we have assembled and annotated reference transcriptomes to develop catalogs of gene sequences for three scleractinian corals (Fungia scutaria, Montastraea cavernosa, Seriatopora hystrix) and a temperate anemone (Anthopleura elegantissima). High-throughput sequencing of cDNA libraries produced ~20-30 million reads per sample, and de novo assembly of these reads produced ~75,000-110,000 transcripts from each sample with size distributions (mean ~1.4 kb, N50 ~2 kb), comparable to the distribution of gene models from the coral genome (mean ~1.7 kb, N50 ~2.2 kb). Each assembly includes matches for more than half the gene models from A. digitifera (54-67%) and many reasonably complete transcripts (~5300-6700) spanning nearly the entire gene (ortholog hit ratios ≥0.75). The catalogs of gene sequences developed in this study made it possible to identify hundreds to thousands of orthologs across diverse scleractinian species and related taxa. We used these sequences for phylogenetic inference, recovering known relationships and demonstrating superior performance over phylogenetic trees constructed using single mitochondrial loci. The resources developed in this study provide gene sequences and genetic markers for several anthozoan species. To enhance the utility of these resources for the research community, we developed searchable databases enabling researchers to rapidly recover sequences for genes of interest. 
Our analysis of de novo assembly quality highlights metrics that we expect will be useful for evaluating the relative quality of other de novo transcriptome assemblies. The identification of orthologous sequences and phylogenetic reconstruction demonstrates the feasibility of these methods for clarifying the substantial uncertainties in the existing scleractinian phylogeny. Copyright © 2015 Kitchen et al.
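As a rough illustration of one of the assembly metrics quoted in this abstract, N50 is the contig length at which contigs of that length or longer together contain at least half of all assembled bases. The sketch below is a minimal example and is not taken from the study. The function name `n50` and the sample lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/transcript lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must contain at least one value")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical transcript lengths in bp (illustrative values only).
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

Because N50 is weighted by length, it is typically larger than the mean when a few long sequences dominate the assembly, which is consistent with the mean and N50 values reported above.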
Riviere, Guillaume; Klopp, Christophe; Ibouniyamine, Nabihoudine; Huvet, Arnaud; Boudry, Pierre; Favrel, Pascal
2015-12-02
The Pacific oyster, Crassostrea gigas, is one of the most important aquaculture shellfish resources worldwide. Substantial efforts have been made to improve knowledge of its genome and transcriptome, making C. gigas a model organism among lophotrochozoans, the under-described sister clade of ecdysozoans within protostomes. These massive sequencing efforts offer the opportunity to assemble gene expression data and make such a resource accessible and exploitable for the scientific community. We therefore compiled these data into an up-to-date, publicly available transcriptome database: the GigaTON (Gigas TranscriptOme pipeliNe) database. We assembled 2204 million sequences obtained from 114 publicly available RNA-seq libraries covering all embryo-larval developmental stages, adult organs, and different environmental stressors including heavy metals, temperature, salinity and exposure to air; most of these libraries were produced as part of the Crassostrea gigas genome project. These data were analyzed in silico, yielding 56621 newly assembled contigs that were deposited in the GigaTON database. The database also provides powerful and user-friendly query tools to browse and retrieve information about annotation, expression level, UTRs, splicing and polymorphism, and gene ontology for all contigs, both within each library and between all libraries. The GigaTON database provides a convenient, potent and versatile interface to browse, retrieve and compare massive transcriptomic information across an extensive range of conditions, tissues and developmental stages in Crassostrea gigas. To our knowledge, the GigaTON database constitutes the most extensive transcriptomic database to date for marine invertebrates, and thereby a new reference transcriptome for the oyster and a highly valuable resource for physiologists and evolutionary biologists.
Enabling large-scale next-generation sequence assembly with Blacklight
Couger, M. Brian; Pipes, Lenore; Squina, Fabio; Prade, Rolf; Siepel, Adam; Palermo, Robert; Katze, Michael G.; Mason, Christopher E.; Blood, Philip D.
2014-01-01
Summary A variety of extremely challenging biological sequence analyses were conducted on the XSEDE large shared memory resource Blacklight, using current bioinformatics tools and encompassing a wide range of scientific applications. These include genomic sequence assembly, very large metagenomic sequence assembly, transcriptome assembly, and sequencing error correction. The data sets used in these analyses included uncategorized fungal species, reference microbial data, very large soil and human gut microbiome sequence data, and primate transcriptomes, composed of both short-read and long-read sequence data. A new parallel command execution program was developed on the Blacklight resource to handle some of these analyses. These results, initially reported previously at XSEDE13 and expanded here, represent significant advances for their respective scientific communities. The breadth and depth of the results achieved demonstrate the ease of use, versatility, and unique capabilities of the Blacklight XSEDE resource for scientific analysis of genomic and transcriptomic sequence data, and the power of these resources, together with XSEDE support, in meeting the most challenging scientific problems. PMID:25294974
Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics
Tzika, Athanasia C.; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C.
2015-01-01
Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages, which are substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the “Reptilian Transcriptomes Database 2.0,” which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be queried using a wwwblast server installed at the University of Geneva. PMID:26133641
Elucidating and mining the Tulipa and Lilium transcriptomes.
Moreno-Pachon, Natalia M; Leeggangers, Hendrika A C F; Nijveen, Harm; Severing, Edouard; Hilhorst, Henk; Immink, Richard G H
2016-10-01
Genome sequencing remains a challenge for species with large and complex genomes containing extensive repetitive sequences, of which the bulbous and monocotyledonous plants tulip and lily are examples. In such cases, sequencing only the active part of the genome, represented by the transcriptome, is a good alternative for obtaining information about gene content. In this study we aimed to generate high-quality transcriptomes of tulip and lily and to make these data available as an open-access resource via a user-friendly web-based interface. Transcribed RNA from a collection of different lily and tulip tissues was sequenced on the Illumina HiSeq 2000 platform. In order to obtain good transcriptome coverage and to facilitate effective data mining, assembly was done using different filtering parameters to remove contamination and noise from the RNA-seq datasets. This analysis revealed limitations of commonly applied methods and parameter settings used in de novo transcriptome assembly. The final transcriptomes are publicly available via a user-friendly transcriptome browser (http://www.bioinformatics.nl/bulbs/db/species/index). The usefulness of this resource has been exemplified by a search for all potential transcription factors in lily and tulip, with special focus on the TCP transcription factor family. This analysis and other quality parameters indicate the quality of the transcriptomes, which can serve as a basis for further genomics studies in lily, tulip, and bulbous plants in general.
Ohyanagi, Hajime; Takano, Tomoyuki; Terashima, Shin; Kobayashi, Masaaki; Kanno, Maasa; Morimoto, Kyoko; Kanegae, Hiromi; Sasaki, Yohei; Saito, Misa; Asano, Satomi; Ozaki, Soichi; Kudo, Toru; Yokoyama, Koji; Aya, Koichiro; Suwabe, Keita; Suzuki, Go; Aoki, Koh; Kubo, Yasutaka; Watanabe, Masao; Matsuoka, Makoto; Yano, Kentaro
2015-01-01
Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center; http://bioinf.mind.meiji.ac.jp/podc/), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Zhang, Yanqiong; Yang, Chunyuan; Wang, Shaochuang; Chen, Tao; Li, Mansheng; Wang, Xue; Li, Dongsheng; Wang, Kang; Ma, Jie; Wu, Songfeng; Zhang, Xueli; Zhu, Yunping; Wu, Jinsheng; He, Fuchu
2013-09-01
A large amount of liver-related physiological and pathological data exist in publicly available biological and bibliographic databases, which are usually far from comprehensive or integrated. Data collection, integration and mining processes pose a great challenge to scientific researchers and clinicians interested in the liver. To address these problems, we constructed LiverAtlas (http://liveratlas.hupo.org.cn), a comprehensive resource of biomedical knowledge related to the liver and various hepatic diseases by incorporating 53 databases. In the present version, LiverAtlas covers data on liver-related genomics, transcriptomics, proteomics, metabolomics and hepatic diseases. Additionally, LiverAtlas provides a wealth of manually curated information, relevant literature citations and cross-references to other databases. Importantly, an expert-confirmed Human Liver Disease Ontology, including relevant information for 227 types of hepatic disease, has been constructed and is used to annotate LiverAtlas data. Furthermore, we have demonstrated two examples of applying LiverAtlas data to identify candidate markers for hepatocellular carcinoma (HCC) at the systems level and to develop a systems biology-based classifier by combining the differential gene expression with topological features of human protein interaction networks to enhance the ability of HCC differential diagnosis. LiverAtlas is the most comprehensive liver and hepatic disease resource, which helps biologists and clinicians to analyse their data at the systems level and will contribute much to the biomarker discovery and diagnostic performance enhancement for liver diseases. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Aya, Koichiro; Kobayashi, Masaaki; Tanaka, Junmu; Ohyanagi, Hajime; Suzuki, Takayuki; Yano, Kenji; Takano, Tomoyuki; Yano, Kentaro; Matsuoka, Makoto
2015-01-01
During plant evolution, ferns originally evolved as a major vascular plant with a distinctive life cycle in which the haploid and diploid generations are completely separated. However, the low level of genetic resources has limited studies of their physiological events, as well as hindering research on the evolutionary history of land plants. In this study, to identify a comprehensive catalog of transcripts and characterize their expression traits in the fern Lygodium japonicum, nine different RNA samples isolated from prothalli, trophophylls, rhizomes and sporophylls were sequenced using Roche 454 GS-FLX and Illumina HiSeq sequencers. The hybrid assembly of the high-quality 454 GS-FLX and Illumina HiSeq reads generated a set of 37,830 isoforms with an average length of 1,444 bp. Using four open reading frame (ORF) predictors, 38,142 representative ORFs were identified from a total of 37,830 transcript isoforms and 95 contigs, which were annotated by searching against several public databases. Furthermore, an orthoMCL analysis using the protein sequences of L. japonicum and five model plants revealed various sets of lineage-specific genes, including those detected among land plant lineages and those detected in only L. japonicum. We have also examined the expression patterns of all contigs/isoforms, along with the life cycle of L. japonicum, and identified the tissue-specific transcripts using statistical expression analyses. Finally, we developed a public web resource, the L. japonicum transcriptome database at http://bioinf.mind.meiji.ac.jp/kanikusa/, which provides important opportunities to accelerate molecular research in ferns. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Baird, Fiona J.; Su, Xiaopei; Aibinu, Ibukun; Nolan, Matthew J.; Sugiyama, Hiromu; Otranto, Domenico
2016-01-01
Background Food-borne nematodes of the genus Anisakis are responsible for a wide range of illnesses (= anisakiasis), from self-limiting gastrointestinal forms to severe systemic allergic reactions, which are often misdiagnosed and under-reported. In order to enhance and refine current diagnostic tools for anisakiasis, knowledge of the whole spectrum of parasite molecules transcribed and expressed by this parasite, including those acting as potential allergens, is necessary. Methodology/Principal Findings In this study, we employ high-throughput (Illumina) sequencing and bioinformatics to characterise the transcriptomes of two Anisakis species, A. simplex and A. pegreffii, and utilize this resource to compile lists of potential allergens from these parasites. A total of ~65,000,000 reads were generated from cDNA libraries for each species, and assembled into ~34,000 transcripts (= Unigenes); ~18,000 peptides were predicted from each cDNA library and classified based on homology searches, protein motifs and gene ontology and biological pathway mapping. Using comparative analyses with sequence data available in public databases, 36 (A. simplex) and 29 (A. pegreffii) putative allergens were identified, including sequences encoding ‘novel’ Anisakis allergenic proteins (i.e. cyclophilins and ABA-1 domain containing proteins). Conclusions/Significance This study represents a first step towards providing the research community with a curated dataset to use as a molecular resource for future investigations of the biology of Anisakis, including molecules putatively acting as allergens, using functional genomics, proteomics and immunological tools. Ultimately, an improved knowledge of the biological functions of these molecules in the parasite, as well as of their immunogenic properties, will assist the development of comprehensive, reliable and robust diagnostic tools. PMID:27472517
Deep RNA-Seq to Unlock the Gene Bank of Floral Development in Sinapis arvensis
Liu, Jia; Mei, Desheng; Li, Yunchang; Huang, Shunmou; Hu, Qiong
2014-01-01
Sinapis arvensis is a weed with strong biological activity. Despite being a problematic annual weed that contaminates agricultural crop yield, it is a valuable alien germplasm resource. It can be utilized for broadening the genetic background of Brassica crops with desirable agricultural traits like resistance to blackleg (Leptosphaeria maculans), stem rot (Sclerotinia sclerotiorum) and pod shatter (caused by the FRUITFULL gene). However, few genetic studies of S. arvensis have been reported because of the lack of genomic resources. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive dataset for S. arvensis for the first time. We used Illumina paired-end sequencing technology to sequence the S. arvensis flower transcriptome and generated 40,981,443 reads that were assembled into 131,278 transcripts. We de novo assembled 96,562 high-quality unigenes with an average length of 832 bp. A total of 33,662 complete full-length ORF sequences were identified, and 41,415 unigenes were mapped onto 128 pathways using the KEGG Pathway database. The annotated unigenes were compared against Brassica rapa, B. oleracea, B. napus and Arabidopsis thaliana. Among these unigenes, 76,324 were identified as putative homologs of annotated sequences in the public protein databases, of which 1194 were associated with plant hormone signal transduction and 113 were related to gibberellin homeostasis/signaling. Unigenes that did not match any of those sequence datasets were considered to be unique to S. arvensis. Furthermore, 21,321 simple sequence repeats were found. Our study will enhance the currently available resources for Brassicaceae and will provide a platform for future genomic studies for genetic improvement of Brassica crops. PMID:25192023
Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree
Rangiah, Kannan; Mahesh, HB; Rajamani, Anantharamanan; Shirke, Meghana D.; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, BN
2015-01-01
Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC). The neem tree is a rich source of limonoids, which have a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in the neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of the estimated size of the neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. The neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. The neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. An ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in the azadirachtin biosynthesis pathway. This study provides genomic and transcriptomic resources, together with quantification of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways. PMID:26290780
Genome-wide transcriptome profiling reveals novel insights into Luffa cylindrica browning.
Chen, Xia; Tan, Taiming; Xu, Changcheng; Huang, Shuping; Tan, Jie; Zhang, Min; Wang, Chunli; Xie, Conghua
2015-08-07
Luffa cylindrica (sponge gourd) is one of the most popular vegetables in China. Production and consumption of L. cylindrica are limited due to postharvest browning; however, little is known about the genetic regulation of the browning process. In the present study, transcriptome profiles of L. cylindrica cultivars, YLB05 (browning resistant) and XTR05 (browning sensitive), were analyzed using next-generation sequencing to clarify the genes and mechanisms associated with browning. A total of 9.1 Gb of valid data including 116,703 unigenes (>200 bp) were obtained and 39,473 sequences were annotated by alignment against five public databases. Of these, there were 27,407 genes assigned to 747 Gene Ontology functional categories; and 12,350 genes were annotated with 25 Eukaryotic Orthologous Groups (KOG) categories with 343 KOG functional terms. Additionally, by searching against the Kyoto Encyclopedia of Genes and Genomes database, 8689 unigenes were mapped to 189 pathways. Furthermore, there were 24,556 sequences found to be differentially regulated, including 4344 annotated unigenes. Several genes potentially associated with phenolic oxidation, carbohydrate and hormone metabolism were found differentially regulated between the cultivars of different browning sensitivities. Our results suggest that elements involved in enzymatic processes and other pathways might be responsible for L. cylindrica browning. The present study provides a comprehensive transcriptome sequence resource, which will facilitate further studies on gene discovery and exploiting the fruit browning mechanism of L. cylindrica. Copyright © 2015 Elsevier Inc. All rights reserved.
Qiao, Qin; Xue, Li; Wang, Qia; Sun, Hang; Zhong, Yang; Huang, Jinling; Lei, Jiajun; Zhang, Ticao
2016-01-01
Multiple closely related species with genomic sequences provide an ideal system for studies on comparative and evolutionary genomics, as well as the mechanism of speciation. The whole genome sequences of six strawberry species (Fragaria spp.) have been released, which provide one of the richest genomic resources of any plant genus. In this study, we first generated seven transcriptome sequences of Fragaria species de novo, with a total of 48,557–82,537 unigenes per species. Combined with 13 other species genomes in Rosales, we reconstructed a phylogenetic tree at the genomic level. The phylogenetic tree shows that Fragaria grouped closely with Rubus and that the Fragaria clade is divided into three subclades. East Asian species appeared in every subclade, suggesting that the genus originated in this area at ∼7.99 Mya. Four species found in mountains of Southwest China originated at ∼3.98 Mya, suggesting that rapid speciation occurred to adapt to changing environments following the uplift of the Qinghai–Tibet Plateau. Moreover, we identified 510 very significantly positively selected genes in the cultivated species F. × ananassa genome. This set of genes was enriched in functions related to specific agronomic traits, such as carbon metabolism and plant hormone signal transduction processes, which are directly related to fruit quality and flavor. These findings illustrate comprehensive evolutionary patterns in Fragaria and the genetic basis of fruit domestication of cultivated strawberry at the genomic/transcriptomic level. PMID:28018379
Liu, Yulin; Huang, Zhedong; Ao, Yan; Li, Wei; Zhang, Zhixiang
2013-01-01
Background Yellow horn (Xanthoceras sorbifolia Bunge) is an oil-rich seed shrub that grows well in cold, barren environments and has great potential for biodiesel production in China. However, the limited genetic data means that little information about the key genes involved in oil biosynthesis is available, which limits further improvement of this species. In this study, we describe sequencing and de novo transcriptome assembly to produce the first comprehensive and integrated genomic resource for yellow horn and identify the pathways and key genes related to oil accumulation. In addition, potential molecular markers were identified and compiled. Methodology/Principal Findings Total RNA was isolated from 30 plants from two regions, including buds, leaves, flowers and seeds. Equal quantities of RNA from these tissues were pooled to construct a cDNA library for 454 pyrosequencing. A total of 1,147,624 high-quality reads with total and average lengths of 530.6 Mb and 462 bp, respectively, were generated. These reads were assembled into 51,867 unigenes, corresponding to a total of 36.1 Mb with a mean length, N50 and median of 696, 928 and 570 bp, respectively. Of the unigenes, 17,541 (33.82%) were unmatched in any public protein databases. We identified 281 unigenes that may be involved in de novo fatty acid (FA) and triacylglycerol (TAG) biosynthesis and metabolism. Furthermore, 6,707 SSRs, 16,925 SNPs and 6,201 InDels with high-confidence were also identified in this study. Conclusions This transcriptome represents a new functional genomics resource and a foundation for further studies on the metabolic engineering of yellow horn to increase oil content and modify oil composition. The potential molecular markers identified in this study provide a basis for polymorphism analysis of Xanthoceras, and even Sapindaceae; they will also accelerate the process of breeding new varieties with better agronomic characteristics. PMID:24040247
Brain transcriptome atlases: a computational perspective.
Mahfouz, Ahmed; Huisman, Sjoerd M H; Lelieveldt, Boudewijn P F; Reinders, Marcel J T
2017-05-01
The immense complexity of the mammalian brain is largely reflected in the underlying molecular signatures of its billions of cells. Brain transcriptome atlases provide valuable insights into gene expression patterns across different brain areas throughout the course of development. Such atlases allow researchers to probe the molecular mechanisms which define neuronal identities, neuroanatomy, and patterns of connectivity. Despite the immense effort put into generating such atlases, to answer fundamental questions in neuroscience, an even greater effort is needed to develop methods to probe the resulting high-dimensional multivariate data. We provide a comprehensive overview of the various computational methods used to analyze brain transcriptome atlases.
Gardin, Jeanne Aude Christiane; Gouzy, Jérôme; Carrère, Sébastien; Délye, Christophe
2015-08-12
Herbicide resistance in agrestal weeds is a global problem threatening food security. Non-target-site resistance (NTSR) endowed by mechanisms neutralising the herbicide or compensating for its action is considered the most agronomically noxious type of resistance. Contrary to target-site resistance, NTSR mechanisms are far from being fully elucidated. A part of weed response to herbicide stress, NTSR is considered to be largely driven by gene regulation. Our purpose was to establish a transcriptome resource allowing investigation of the transcriptomic bases of NTSR in the major grass weed Alopecurus myosuroides L. (Poaceae) for which almost no genomic or transcriptomic data was available. RNA-Seq was performed from plants in one F2 population that were sensitive or expressing NTSR to herbicides inhibiting acetolactate-synthase. Cloned plants were sampled over seven time-points ranging from before until 73 h after herbicide application. Assembly of over 159M high-quality Illumina reads generated a transcriptomic resource (ALOMYbase) containing 65,558 potentially active contigs (N50 = 1240 nucleotides) predicted to encode 32,138 peptides with 74% GO annotation, of which 2017 were assigned to protein families presumably involved in NTSR. Comparison with the fully sequenced grass genomes indicated good coverage and correct representation of A. myosuroides transcriptome in ALOMYbase. The part of the herbicide transcriptomic response common to the resistant and the sensitive plants was consistent with the expected effects of acetolactate-synthase inhibition, with striking similarities observed with published Arabidopsis thaliana data. A. myosuroides plants with NTSR were first affected by herbicide action like sensitive plants, but ultimately overcame it. Analysis of differences in transcriptomic herbicide response between resistant and sensitive plants did not allow identification of processes directly explaining NTSR. 
Five contigs associated with NTSR in the F2 population studied were tentatively identified. They were predicted to encode three cytochromes P450 (CYP71A, CYP71B and CYP81D), one peroxidase and one disease resistance protein. Our data confirmed that gene regulation is at the root of herbicide response and of NTSR. ALOMYbase proved to be a relevant resource to support NTSR transcriptomic studies, and constitutes a valuable tool for future research aiming at elucidating the gene regulation involved in NTSR in A. myosuroides.
Comprehensive RNA-Seq profiling to evaluate lactating sheep mammary gland transcriptome
Suárez-Vega, Aroa; Gutiérrez-Gil, Beatriz; Klopp, Christophe; Tosser-Klopp, Gwenola; Arranz, Juan-José
2016-01-01
RNA-Seq enables the generation of extensive transcriptome information providing the capability to characterize transcripts (including alternative isoforms and polymorphism), to quantify expression and to identify differential regulation in a single experiment. Our aim in this study was to take advantage of using RNA-Seq high-throughput technology to provide a comprehensive transcriptome profiling of the sheep lactating mammary gland. Eight ewes of two dairy sheep breeds with differences in milk production traits were used in this experiment (four Churra and four Assaf ewes). Milk samples from these animals were collected on days 10, 50, 120 and 150 after lambing to cover the various physiological stages of the mammary gland across the complete lactation. RNA samples were extracted from milk somatic cells. The RNA-Seq dataset was generated using an Illumina HiSeq 2000 sequencer. The information reported here will be useful to understand the biology of lactation in sheep, providing also an opportunity to characterize their different patterns on milk production aptitude. PMID:27377755
Differential immune responses of Monochamus alternatus against symbiotic and entomopathogenic fungi.
Zhang, Wei; Meng, Jie; Ning, Jing; Qin, Peijun; Zhou, Jiao; Zou, Zhen; Wang, Yanhong; Jiang, Hong; Ahmad, Faheem; Zhao, Lilin; Sun, Jianghua
2017-08-01
Monochamus alternatus, the main vector beetle of the invasive pinewood nematode, has established a symbiotic relationship with a native ectotrophic fungal symbiont, Sporothrix sp. 1, in China. The immune response of M. alternatus to S. sp. 1 in the coexistence of beetles and fungi is, however, unknown. Here, we report that immune responses of M. alternatus pupae to infection caused by ectotrophic symbiotic fungus S. sp. 1 and entomopathogenic fungus Beauveria bassiana differ significantly. S. sp. 1 did not kill the beetles, while B. bassiana killed all of them upon injection. The transcriptome results showed that the number of differentially expressed genes in M. alternatus infected with S. sp. 1 was 2-fold lower than in those infected with B. bassiana at 48 hours post infection. The Toll and IMD pathways played a leading role in the beetle's immune system when infected by the symbiotic fungus, but upon infection by the entomopathogenic fungus, only the Toll pathway was actively triggered. Furthermore, the beetles could tolerate the infection of symbiotic fungi by retracing their Toll and IMD pathways at 48 h. This study provided a comprehensive sequence resource of M. alternatus transcriptome for further study of the immune interactions between host and associated fungi.
Melicher, Dacotah; Torson, Alex S; Dworkin, Ian; Bowsher, Julia H
2014-03-12
The Sepsidae family of flies is a model for investigating how sexual selection shapes courtship and sexual dimorphism in a comparative framework. However, like many non-model systems, there are few molecular resources available. Large-scale sequencing and assembly have not been performed in any sepsid, and the lack of a closely related genome makes investigation of gene expression challenging. Our goal was to develop an automated pipeline for de novo transcriptome assembly, and to use that pipeline to assemble and analyze the transcriptome of the sepsid Themira biloba. Our bioinformatics pipeline uses cloud computing services to assemble and analyze the transcriptome with off-site data management, processing, and backup. It uses a multiple k-mer length approach combined with a second meta-assembly to extend transcripts and recover more bases of transcript sequences than standard single k-mer assembly. We used 454 sequencing to generate 1.48 million reads from cDNA generated from embryo, larva, and pupae of T. biloba and assembled a transcriptome consisting of 24,495 contigs. Annotation identified 16,705 transcripts, including those involved in embryogenesis and limb patterning. We assembled transcriptomes from an additional three non-model organisms to demonstrate that our pipeline assembled a higher-quality transcriptome than single k-mer approaches across multiple species. The pipeline we have developed for assembly and analysis increases contig length, recovers unique transcripts, and assembles more base pairs than other methods through the use of a meta-assembly. The T. biloba transcriptome is a critical resource for performing large-scale RNA-Seq investigations of gene expression patterns, and is the first transcriptome sequenced in this Dipteran family.
Transcriptome Assembly, Gene Annotation and Tissue Gene Expression Atlas of the Rainbow Trout
Salem, Mohamed; Paneru, Bam; Al-Tobasei, Rafet; Abdouni, Fatima; Thorgaard, Gary H.; Rexroad, Caird E.; Yao, Jianbo
2015-01-01
Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although the previous work added a great wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not completely addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler yielding 474,524 contigs > 500 base-pairs. Of them, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000–32,000 genes (35–71% of the identified genes) accounted for basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome. PMID:25793877
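The step of keeping the longest contig of each cluster as a representative can be illustrated with a minimal Python sketch. The input layout (tuples of cluster ID, contig ID and sequence), the function name `longest_per_cluster` and the toy records are assumptions made for illustration; they are not taken from the study.

```python
def longest_per_cluster(contigs):
    """Return {cluster_id: (contig_id, sequence)} keeping the longest contig.

    `contigs` is an iterable of (cluster_id, contig_id, sequence) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    best = {}
    for cluster_id, contig_id, seq in contigs:
        current = best.get(cluster_id)
        # Keep the first contig seen, then replace it only by a strictly longer one.
        if current is None or len(seq) > len(current[1]):
            best[cluster_id] = (contig_id, seq)
    return best


# Toy records (invented): two isoforms in cluster c1, one contig in cluster c2.
example = [
    ("c1", "c1_i1", "ATGC"),
    ("c1", "c1_i2", "ATGCATGC"),
    ("c2", "c2_i1", "GGG"),
]
representatives = longest_per_cluster(example)
```

Ties are resolved in favour of the contig encountered first; a real pipeline might break ties differently, for example by read support.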
Identifier mapping performance for integrating transcriptomics and proteomics experimental results
2011-01-01
Background Studies integrating transcriptomic data with proteomic data can illuminate the proteome more clearly than either separately. Integromic studies can deepen understanding of the dynamic complex regulatory relationship between the transcriptome and the proteome. Integrating these data dictates a reliable mapping between the identifier nomenclature resultant from the two high-throughput platforms. However, this kind of analysis is well known to be hampered by lack of standardization of identifier nomenclature among proteins, genes, and microarray probe sets. Therefore data integration may also play a role in critiquing the fallible gene identifications that both platforms emit. Results We compared three freely available internet-based identifier mapping resources for mapping UniProt accessions (ACCs) to Affymetrix probesets identifications (IDs): DAVID, EnVision, and NetAffx. Liquid chromatography-tandem mass spectrometry analyses of 91 endometrial cancer and 7 noncancer samples generated 11,879 distinct ACCs. For each ACC, we compared the retrieval sets of probeset IDs from each mapping resource. We confirmed a high level of discrepancy among the mapping resources. On the same samples, mRNA expression was available. Therefore, to evaluate the quality of each ACC-to-probeset match, we calculated proteome-transcriptome correlations, and compared the resources presuming that better mapping of identifiers should generate a higher proportion of mapped pairs with strong inter-platform correlations. A mixture model for the correlations fitted well and supported regression analysis, providing a window into the performance of the mapping resources. The resources have added and dropped matches over two years, but their overall performance has not changed. Conclusions The methods presented here serve to achieve concrete context-specific insight, to support well-informed decisions in choosing an ID mapping strategy for "omic" data merging. PMID:21619611
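The idea of scoring a mapping resource by how many of its ACC-to-probeset pairs show a strong proteome-transcriptome correlation can be sketched roughly as follows. This is a simplified illustration in Python, not the mixture-model and regression analysis described in the abstract; the function names, the 0.5 threshold and all data values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (NaN if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def fraction_strong(pairs, protein, mrna, threshold=0.5):
    """Fraction of mapped (ACC, probeset) pairs whose correlation is >= threshold.

    The threshold is an arbitrary value chosen for this sketch.
    """
    rs = [pearson(protein[acc], mrna[probe]) for acc, probe in pairs]
    valid = [r for r in rs if not math.isnan(r)]
    if not valid:
        return 0.0
    return sum(r >= threshold for r in valid) / len(valid)


# Invented abundance profiles across four samples.
protein = {"P1": [1, 2, 3, 4]}
mrna = {"ps_a": [2, 4, 6, 8], "ps_b": [4, 3, 2, 1]}

# Two hypothetical mapping resources returning different probeset sets for P1.
resource_x = [("P1", "ps_a")]
resource_y = [("P1", "ps_a"), ("P1", "ps_b")]

score_x = fraction_strong(resource_x, protein, mrna)
score_y = fraction_strong(resource_y, protein, mrna)
```

In this toy case the second resource adds an anti-correlated probeset, so its proportion of strongly correlated pairs drops from 1.0 to 0.5.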
Tian, Xin-Jie; Long, Yan; Wang, Jiao; Zhang, Jing-Wen; Wang, Yan-Yan; Li, Wei-Min; Peng, Yu-Fa; Yuan, Qian-Hua; Pei, Xin-Wu
2015-01-01
The perennial O. rufipogon (common wild rice), which is considered to be the ancestor of Asian cultivated rice species, contains many useful genetic resources, including drought resistance genes. However, few studies have identified the drought resistance and tissue-specific genes in common wild rice. In this study, transcriptome sequencing libraries were constructed, including drought-treated roots (DR) and control leaves (CL) and roots (CR). Using Illumina sequencing technology, we generated 16.75 million bases of high-quality sequence data for common wild rice and conducted de novo assembly and annotation of genes without prior genome information. These reads were assembled into 119,332 unigenes with an average length of 715 bp. A total of 88,813 distinct sequences (74.42% of unigenes) significantly matched known genes in the NCBI NT database. Differentially expressed gene (DEG) analysis showed that 3617 genes were up-regulated and 4171 genes were down-regulated in the CR library compared with the CL library. Among the DEGs, 535 genes were expressed in roots but not in shoots. A similar comparison between the DR and CR libraries showed that 1393 genes were up-regulated and 315 genes were down-regulated in the DR library compared with the CR library. Finally, 37 genes that were specifically expressed in roots were screened after comparing the DEGs identified in the above-described analyses. This study provides a transcriptome sequence resource for common wild rice plants and establishes a digital gene expression profile of wild rice plants under drought conditions using the assembled transcriptome data as a reference. Several tissue-specific and drought-stress-related candidate genes were identified, representing a fully characterized transcriptome and providing a valuable resource for genetic and genomic studies in plants.
Zhang, Qu; Hill, Geoffrey E; Edwards, Scott V; Backström, Niclas
2014-04-24
With its plumage color dimorphism and unique history in North America, including a recent population expansion and an epizootic of Mycoplasma gallisepticum (MG), the house finch (Haemorhous mexicanus) is a model species for studying sexual selection, plumage coloration and host-parasite interactions. As part of our ongoing efforts to make available genomic resources for this species, here we report a transcriptome assembly derived from genes expressed in spleen. We characterize transcriptomes from two populations with different histories of demography and disease exposure: a recently founded population in the eastern US that has been exposed to MG for over a decade and a native population from the western range that has never been exposed to MG. We utilize this resource to quantify conservation in gene expression in passerine birds over approximately 50 MY by comparing splenic expression profiles for 9,646 house finch transcripts and those from zebra finch and find that less than half of all genes expressed in spleen in either species are expressed in both species. Comparative gene annotations from several vertebrate species suggest that the house finch transcriptomes contain ~15 genes not yet found in previously sequenced vertebrate genomes. The house finch transcriptomes harbour ~85,000 SNPs, ~20,000 of which are non-synonymous. Although not yet validated by biological or technical replication, we identify a set of genes exhibiting differences between populations in gene expression (n = 182; 2% of all transcripts), allele frequencies (76 FST outliers) and alternative splicing as well as genes with several fixed non-synonymous substitutions; this set includes genes with functions related to double-strand break repair and immune response. The two house finch spleen transcriptome profiles will add to the increasing data on genome and transcriptome sequence information from natural populations. 
Differences in splenic expression between house finch and zebra finch imply either significant evolutionary turnover of splenic expression patterns or different physiological states of the individuals examined. The transcriptome resource will enhance the potential to annotate an eventual house finch genome, and the set of gene-based high-quality SNPs will help clarify the genetic underpinnings of host-pathogen interactions and sexual selection.
Oil biosynthesis in a basal angiosperm: transcriptome analysis of Persea Americana mesocarp.
Kilaru, Aruna; Cao, Xia; Dabbs, Parker B; Sung, Ha-Jung; Rahman, Md Mahbubur; Thrower, Nicholas; Zynda, Greg; Podicheti, Ram; Ibarra-Laclette, Enrique; Herrera-Estrella, Luis; Mockaitis, Keithanne; Ohlrogge, John B
2015-08-16
The mechanism by which plants synthesize and store high amounts of triacylglycerols (TAG) in tissues other than seeds is not well understood. The comprehension of controls for carbon partitioning and oil accumulation in nonseed tissues is essential to generate oil-rich biomass in perennial bioenergy crops. Persea americana (avocado), a basal angiosperm with unique features that are ancestral to most flowering plants, stores ~ 70 % TAG per dry weight in its mesocarp, a nonseed tissue. Transcriptome analyses of select pathways, from generation of pyruvate and leading up to TAG accumulation, in mesocarp tissues of avocado was conducted and compared with that of oil-rich monocot (oil palm) and dicot (rapeseed and castor) tissues to identify tissue- and species-specific regulation and biosynthesis of TAG in plants. RNA-Seq analyses of select lipid metabolic pathways of avocado mesocarp revealed patterns similar to that of other oil-rich species. However, only some predominant orthologs of the fatty acid biosynthetic pathway genes in this basal angiosperm were similar to those of monocots and dicots. The accumulation of TAG, rich in oleic acid, was associated with higher transcript levels for a putative stearoyl-ACP desaturase and endoplasmic reticulum (ER)-associated acyl-CoA synthetases, during fruit development. Gene expression levels for enzymes involved in terminal steps to TAG biosynthesis in the ER further indicated that both acyl-CoA-dependent and -independent mechanisms might play a role in TAG assembly, depending on the developmental stage of the fruit. Furthermore, in addition to the expression of an ortholog of WRINKLED1 (WRI1), a regulator of fatty acid biosynthesis, high transcript levels for WRI2-like and WRI3-like suggest a role for additional transcription factors in nonseed oil accumulation. Plastid pyruvate necessary for fatty acid synthesis is likely driven by the upregulation of genes involved in glycolysis and transport of its intermediates. 
Together, comparative transcriptome analyses of storage oil biosynthesis in diverse plants and tissues suggested that several distinct and conserved features in this basal angiosperm species might contribute towards its rich TAG content. Our work represents a comprehensive transcriptome resource for a basal angiosperm species and provides insight into its lipid metabolism in mesocarp tissues. Furthermore, comparison of the transcriptome of oil-rich mesocarp of avocado, with oil-rich seed and nonseed tissues of monocot and dicot species, revealed lipid gene orthologs that are highly conserved during evolution. The orthologs that are distinctively expressed in oil-rich mesocarp tissues of this basal angiosperm, such as WRI2, ER-associated acyl-CoA synthetases, and lipid-droplet associated proteins were also identified. This study provides a foundation for future investigations to increase oil content and has implications for metabolic engineering to enhance storage oil content in nonseed tissues of diverse species.
Transcriptome of the Caribbean stony coral Porites astreoides from three developmental stages.
Mansour, Tamer A; Rosenthal, Joshua J C; Brown, C Titus; Roberson, Loretta M
2016-08-02
Porites astreoides is a ubiquitous species of coral on modern Caribbean reefs that is resistant to increasing temperatures, overfishing, and other anthropogenic impacts that have threatened most other coral species. We assembled and annotated a transcriptome from this coral using Illumina sequences from three different developmental stages collected over several years: free-swimming larvae, newly settled larvae, and adults (>10 cm in diameter). This resource will aid understanding of coral calcification, larval settlement, and host-symbiont interactions. A de novo transcriptome for the P. astreoides holobiont (coral plus algal symbiont) was assembled using 594 Mbp of raw Illumina sequencing data generated from five age-specific cDNA libraries. The new transcriptome consists of 867 255 transcript elements with an average length of 685 bases. The isolated P. astreoides assembly consists of 129 718 transcript elements with an average length of 811 bases, and the isolated Symbiodinium sp. assembly had 186 177 transcript elements with an average length of 1105 bases. This contribution to coral transcriptome data provides a valuable resource for researchers studying the ontogeny of gene expression patterns within both the coral and its dinoflagellate symbiont.
RNA-Seq Atlas of Glycine max: a guide to the soybean transcriptome
USDA-ARS's Scientific Manuscript database
A first analysis of the Glycine max (L.) Merr. (soybean) transcriptome using next generation sequencing technology and RNA-Sequencing (RNA-Seq) is presented. This analysis will provide an important resource for understanding transcription and gene co-regulatory networks in soybean, the most economic...
Data mining in newt-omics, the repository for omics data from the newt.
Looso, Mario; Braun, Thomas
2015-01-01
Salamanders are an excellent model organism to study regenerative processes due to their unique ability to regenerate lost appendages or organs. Straightforward bioinformatics tools to analyze and take advantage of the growing number of "omics" studies performed in salamanders have so far been lacking. To overcome this limitation, we have generated a comprehensive data repository for the red-spotted newt Notophthalmus viridescens, named newt-omics, merging omics style datasets on the transcriptome and proteome level including expression values and annotations. The resource is freely available via a user-friendly Web-based graphical user interface (http://newt-omics.mpi-bn.mpg.de) that allows access and queries to the database without prior bioinformatical expertise. The repository is updated regularly, incorporating new published datasets from omics technologies.
2012-01-01
Background Chinese fir (Cunninghamia lanceolata) is an important timber species that accounts for 20–30% of the total commercial timber production in China. However, the available genomic information of Chinese fir is limited, and this severely encumbers functional genomic analysis and molecular breeding in Chinese fir. Recently, major advances in transcriptome sequencing have provided fast and cost-effective approaches to generate large expression datasets that have proven to be powerful tools to profile the transcriptomes of non-model organisms with undetermined genomes. Results In this study, the transcriptomes of nine tissues from Chinese fir were analyzed using the Illumina HiSeq™ 2000 sequencing platform. Approximately 40 million paired-end reads were obtained, generating 3.62 gigabase pairs of sequencing data. These reads were assembled into 83,248 unique sequences (i.e. Unigenes) with an average length of 449 bp, amounting to 37.40 Mb. A total of 73,779 Unigenes were supported by more than 5 reads, 42,663 (57.83%) had homologs in the NCBI non-redundant and Swiss-Prot protein databases, corresponding to 27,224 unique protein entries. Of these Unigenes, 16,750 were assigned to Gene Ontology classes, and 14,877 were clustered into orthologous groups. A total of 21,689 (29.40%) were mapped to 119 pathways by BLAST comparison against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The majority of the genes encoding the enzymes in the biosynthetic pathways of cellulose and lignin were identified in the Unigene dataset by targeted searches of their annotations. In addition, a number of candidate Chinese fir genes in these two metabolic pathways were discovered for the first time. Eighteen genes related to cellulose and lignin biosynthesis were cloned for experimental validation of the transcriptome data. Overall, 49 Unigenes, covering different regions of these selected genes, were found by alignment. 
Their expression patterns in different tissues were analyzed by qRT-PCR to explore their putative functions. Conclusions A substantial fraction of transcript sequences was obtained from the deep sequencing of Chinese fir. The assembled Unigene dataset was used to discover candidate genes of cellulose and lignin biosynthesis. This transcriptome dataset will provide a comprehensive sequence resource for molecular genetics research of C. lanceolata. PMID:23171398
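The percentages quoted in this abstract appear to be computed relative to the 73,779 Unigenes supported by more than 5 reads rather than to all 83,248 assembled Unigenes. The short Python check below makes that arithmetic explicit; the variable names are illustrative only, and the counts are those reported in the abstract.

```python
# Counts as reported in the abstract.
total_unigenes = 83_248       # all assembled Unigenes
supported_unigenes = 73_779   # Unigenes supported by more than 5 reads
with_homologs = 42_663        # homologs in NCBI nr / Swiss-Prot
kegg_mapped = 21_689          # mapped to KEGG pathways

# Percentages relative to the read-supported subset.
pct_homologs = 100 * with_homologs / supported_unigenes
pct_kegg = 100 * kegg_mapped / supported_unigenes

# The same count relative to all Unigenes, for comparison.
pct_homologs_of_total = 100 * with_homologs / total_unigenes

print(f"homologs: {pct_homologs:.2f}% of supported Unigenes")
print(f"KEGG:     {pct_kegg:.2f}% of supported Unigenes")
print(f"homologs: {pct_homologs_of_total:.2f}% of all Unigenes")
```

The first two values reproduce the reported 57.83% and 29.40%, whereas the share of all Unigenes with homologs is closer to 51%.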
Bhardwaj, Ankur R; Joshi, Gopal; Kukreja, Bharti; Malik, Vidhi; Arora, Priyanka; Pandey, Ritu; Shukla, Rohit N; Bankar, Kiran G; Katiyar-Agarwal, Surekha; Goel, Shailendra; Jagannath, Arun; Kumar, Amar; Agarwal, Manu
2015-01-21
Brassica juncea var. Varuna is an economically important oilseed crop of family Brassicaceae which is vulnerable to abiotic stresses at specific stages in its life cycle. To date, no attempts have been made to elucidate genome-wide changes in its transcriptome against high temperature or drought stress. To gain global insights into genes, transcription factors and kinases regulated by these stresses and to explore information on coding transcripts that are associated with traits of agronomic importance, we utilized a combinatorial approach of next generation sequencing and de-novo assembly to discover B. juncea transcriptome associated with high temperature and drought stresses. We constructed and sequenced three transcriptome libraries namely Brassica control (BC), Brassica high temperature stress (BHS) and Brassica drought stress (BDS). More than 180 million purity filtered reads were generated which were processed through quality parameters and high quality reads were assembled de-novo using SOAPdenovo assembler. A total of 77750 unique transcripts were identified out of which 69,245 (89%) were annotated with high confidence. We established a subset of 19110 transcripts, which were differentially regulated by either high temperature and/or drought stress. Furthermore, 886 and 2834 transcripts that code for transcription factors and kinases, respectively, were also identified. Many of these were responsive to high temperature, drought or both stresses. The largest numbers of up-regulated transcription factors under high temperature and drought stress belonged to the heat shock factor (HSF) and dehydration responsive element-binding (DREB) families, respectively. We also identified 239 metabolic pathways, which were perturbed during high temperature and drought treatments. Analysis of gene ontologies associated with differentially regulated genes forecasted their involvement in diverse biological processes. 
Our study provides the first comprehensive discovery of the B. juncea transcriptome under high temperature and drought stress conditions. The transcriptome resource generated in this study will enhance our understanding of the molecular mechanisms involved in defining the response of B. juncea against two important abiotic stresses. Furthermore, this information would benefit the design of efficient crop improvement strategies for tolerance against conditions of high temperature regimes and water scarcity.
Benton, Matthew A; Kenny, Nathan J; Conrads, Kai H; Roth, Siegfried; Lynch, Jeremy A
2016-01-01
Despite recent efforts to sample broadly across metazoan and insect diversity, current sequence resources in the Coleoptera do not adequately describe the diversity of the clade. Here we present deep, staged transcriptomic data for two coleopteran species, Atrachya menetriesi (Faldermann 1835) and Callosobruchus maculatus (Fabricius 1775). Our sampling covered key stages in ovary and early embryonic development in each species. We utilized this data to build combined assemblies for each species which were then analysed in detail. The combined A. menetriesi assembly consists of 228,096 contigs with an N50 of 1,598 bp, while the combined C. maculatus assembly consists of 128,837 contigs with an N50 of 2,263 bp. For these assemblies, 34.6% and 32.4% of contigs were identified using Blast2GO, and 97% and 98.3% of the BUSCO set of metazoan orthologs were present, respectively. We also carried out manual annotation of developmental signalling pathways and found that nearly all expected genes were present in each transcriptome. Our analyses show that both transcriptomes are of high quality. Lastly, we performed read mapping utilising our timed, stage specific RNA samples to identify differentially expressed contigs. The resources presented here will provide a firm basis for a variety of experimentation, both in developmental biology and in comparative genomic studies.
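Several of the abstracts in this listing report an N50 value for their assemblies. As a reminder of how that statistic is commonly defined (the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length), here is a minimal Python sketch. The function name and the toy lengths are assumptions for illustration and do not come from any of the studies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy assembly (invented): total length 54, so half is 27.
# 10 + 9 + 8 = 27 reaches the halfway point at the contig of length 8.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Because N50 depends on which contigs are retained, values from assemblies filtered at different minimum lengths are not directly comparable.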
Gupta, Parul; Goel, Ridhi; Pathak, Sumya; Srivastava, Apeksha; Singh, Surya Pratap; Sangwan, Rajender Singh; Asif, Mehar Hasan; Trivedi, Prabodh Kumar
2013-01-01
Withania somnifera is one of the most valuable medicinal plants used in Ayurvedic and other indigenous medicine systems due to bioactive molecules known as withanolides. As genomic information regarding this plant is very limited, little information is available about biosynthesis of withanolides. To facilitate the basic understanding about the withanolide biosynthesis pathways, we performed transcriptome sequencing for Withania leaf (101L) and root (101R) which specifically synthesize withaferin A and withanolide A, respectively. Pyrosequencing yielded 8,34,068 and 7,21,755 reads which got assembled into 89,548 and 1,14,814 unique sequences from 101L and 101R, respectively. A total of 47,885 (101L) and 54,123 (101R) could be annotated using TAIR10, NR, tomato and potato databases. Gene Ontology and KEGG analyses provided a detailed view of all the enzymes involved in withanolide backbone synthesis. Our analysis identified members of cytochrome P450, glycosyltransferase and methyltransferase gene families with unique presence or differential expression in leaf and root and might be involved in synthesis of tissue-specific withanolides. We also detected simple sequence repeats (SSRs) in transcriptome data for use in future genetic studies. Comprehensive sequence resource developed for Withania, in this study, will help to elucidate biosynthetic pathway for tissue-specific synthesis of secondary plant products in non-model plant organisms as well as will be helpful in developing strategies for enhanced biosynthesis of withanolides through biotechnological approaches. PMID:23667511
Zhu, Haisheng; Liu, Jianting; Wen, Qingfang; Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng
2017-01-01
Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar 'Fusi-3'. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1-6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism.
2013-01-01
Background Isatis indigotica is a widely used herb for the clinical treatment of colds, fever, and influenza in Traditional Chinese Medicine (TCM). Various structural classes of compounds have been identified as effective ingredients. However, little is known at the genetic level about these active metabolites. In the present study, we performed de novo transcriptome sequencing for the first time to produce a comprehensive dataset of I. indigotica. Results A database of 36,367 unigenes (average length = 1,115.67 bases) was generated by performing transcriptome sequencing. Based on the gene annotation of the transcriptome, 104 unigenes were identified covering most of the catalytic steps in the general biosynthetic pathways of indole, terpenoid, and phenylpropanoid. Subsequently, the organ-specific expression patterns of the genes involved in these pathways, and their responses to methyl jasmonate (MeJA) induction, were investigated. Metabolite profiling of effective phenylpropanoids showed that the accumulation patterns of secondary metabolites were mostly correlated with the transcription of their biosynthetic genes. According to the analysis of the UDP-dependent glycosyltransferase (UGT) family, several flavonoids were indicated to exist in I. indigotica and further identified by metabolic profiling using UPLC/Q-TOF. Moreover, applying transcriptome co-expression analysis, nine new, putative UGTs were suggested as flavonol glycosyltransferases and lignan glycosyltransferases. Conclusions This database provides a pool of candidate genes involved in biosynthesis of effective metabolites in I. indigotica. Furthermore, the comprehensive analysis and characterization of the significant pathways are expected to give a better insight regarding the diversity of chemical composition, synthetic characteristics, and the regulatory mechanisms which operate in this medicinal herb. PMID:24308360
2012-01-01
Background In rubber tree, bark is one of the important agricultural and biological organs. However, the molecular mechanism involved in bark formation and development in rubber tree remains largely unknown, which is at least partially due to the lack of bark transcriptomic and genomic information. Therefore, it is necessary to carry out high-throughput transcriptome sequencing of rubber tree bark to generate large numbers of transcript sequences for functional characterization and molecular marker development. Results In this study, more than 30 million sequencing reads were generated using Illumina paired-end sequencing technology. In total, 22,756 unigenes with an average length of 485 bp were obtained with de novo assembly. The similarity search indicated that 16,520 and 12,558 unigenes showed significant similarities to known proteins from the NCBI non-redundant and Swiss-Prot protein databases, respectively. Among these annotated unigenes, 6,867 and 5,559 unigenes were assigned to Gene Ontology (GO) and Clusters of Orthologous Groups (COG), respectively. When the 22,756 unigenes were searched against the Kyoto Encyclopedia of Genes and Genomes Pathway (KEGG) database, 12,097 unigenes were assigned to 5 main categories including 123 KEGG pathways. Among the main KEGG categories, metabolism was the biggest category (9,043, 74.75%), suggesting active metabolic processes in rubber tree bark. In addition, a total of 39,257 EST-SSRs were identified from the 22,756 unigenes, and the characteristics of the EST-SSRs were further analyzed in rubber tree. 110 potential marker sites were randomly selected to validate the assembly quality and develop EST-SSR markers. Among 13 Hevea germplasms, the PCR success rate and polymorphism rate of the 110 markers were 96.36% and 55.45%, respectively. Conclusion By assembling and analyzing de novo transcriptome sequencing data, we report a comprehensive functional characterization of rubber tree bark. 
This research generated a substantial fraction of rubber tree transcriptome sequences, which are very useful resources for gene annotation and discovery, molecular marker development, genome assembly and annotation, and microarray development in rubber tree. The EST-SSR markers identified and developed in this study will facilitate marker-assisted selection breeding in rubber tree. Moreover, this study also supports the view that transcriptome analysis based on Illumina paired-end sequencing is a powerful tool for transcriptome characterization and molecular marker development in non-model species, especially those with large and complex genomes. PMID:22607098
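The EST-SSR identification step described in the abstract above can be illustrated with a minimal sketch. The abstract does not name the tool or the repeat thresholds used, so the function name `find_ssrs` and the minimum repeat counts below are assumptions chosen only for illustration; the sketch scans a unigene sequence for perfect di- to hexanucleotide repeats with a regular expression.

```python
import re

# Hypothetical minimum repeat counts per motif length (not taken from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "ATAT" is already covered by the "AT" scan).
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence, not real Hevea data.
    toy = "GG" + "AT" * 7 + "CC" + "GCA" * 5 + "T"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

In practice, candidate loci found this way would still need flanking primers and wet-lab validation, as the abstract reports for its 110 selected marker sites.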
The testes transcriptome derived from the New World Screwworm, Cochliomyia hominivorax SRA
USDA-ARS's Scientific Manuscript database
In a collaboration with National Center for Genome Resources researchers, we sequenced and assembled the testes transcriptome derived from the Pacora, Panama, production plant strain J06 of the New World Screwworm, Cochliomyia hominivorax. This sequencing project produced 72,750,822 raw reads and th...
Spriggs, Andrew; Henderson, Steven T.; Hand, Melanie L.; Johnson, Susan D.; Taylor, Jennifer M.; Koltunow, Anna
2018-01-01
Cowpea (Vigna unguiculata (L.) Walp) is an important legume crop for food security in areas of low-input and smallholder farming throughout Africa and Asia. Genetic improvements are required to increase yield and resilience to biotic and abiotic stress and to enhance cowpea crop performance. An integrated cowpea genomic and gene expression data resource has the potential to greatly accelerate breeding and the delivery of novel genetic traits for cowpea. Extensive genomic resources for cowpea have been absent from the public domain; however, a recent early release reference genome for IT97K-499-35 (Vigna unguiculata v1.0, NSF, UCR, USAID, DOE-JGI, http://phytozome.jgi.doe.gov/) has now been established in a collaboration between the Joint Genome Institute (JGI) and the University of California (UC) Riverside. Here we release supporting genomic and transcriptomic data for IT97K-499-35 and a second transformable cowpea variety, IT86D-1010. The transcriptome resource includes six tissue-specific datasets for each variety, with particular emphasis on reproductive tissues that extend and support the V. unguiculata v1.0 reference. Annotations have been included in our resource to allow direct mapping to the v1.0 cowpea reference. Access to the resource is supported by raw and assembled data downloads. PMID:29528046
Comparative transcriptome analysis of microsclerotia development in Nomuraea rileyi
2013-01-01
Background Nomuraea rileyi is used as an environmentally friendly biopesticide. However, mass production and commercialization of this organism are limited due to its fastidious growth and sporulation requirements. When cultured in amended medium, we found that N. rileyi could produce microsclerotia bodies, replacing conidiophores as the infectious agent. However, little is known about the genes involved in microsclerotia development. In the present study, the transcriptomes were analyzed using next-generation sequencing technology to find the genes involved in microsclerotia development. Results A total of 4.69 Gb of clean nucleotides comprising 32,061 sequences was obtained, and 20,919 sequences were annotated (about 65%). Among the annotated sequences, only 5928 were annotated with 34 gene ontology (GO) functional categories, and 12,778 sequences were mapped to 165 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) database. Furthermore, we assessed the transcriptomic differences between cultures grown in minimal and amended medium. In total, 4808 sequences were found to be differentially expressed; 719 differentially expressed unigenes were assigned to 25 GO classes and 1888 differentially expressed unigenes were assigned to 161 KEGG pathways, including 25 enriched pathways. Subsequently, we examined the genes that were up-regulated or uniquely expressed following amended medium treatment and that also belonged to the enriched pathways, and found that most of them participated in mediating oxidative stress homeostasis. To elucidate the role of oxidative stress in microsclerotia development, we analyzed the diversification of unigenes using quantitative reverse transcription-PCR (RT-qPCR). Conclusion Our findings suggest that oxidative stress occurs during microsclerotia development, along with a broad metabolic activity change. Our data provide the most comprehensive sequence resource available for the study of N. rileyi. 
We believe that the transcriptome datasets will serve as an important public information platform to accelerate studies on N. rileyi microsclerotia. PMID:23777366
Lv, Jianjian; Liu, Ping; Gao, Baoquan; Wang, Yu; Wang, Zheng; Chen, Ping; Li, Jian
2014-01-01
Background The swimming crab, Portunus trituberculatus, is an important farmed species in China and has attracted extensive study, which requires increasing genomic background knowledge. To date, no whole-genome sequence is available and transcriptomic information is also scarce for this species. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for major tissues of Portunus trituberculatus using Illumina paired-end sequencing technology. Results Total RNA was isolated from eyestalk, gill, heart, hepatopancreas and muscle. Equal quantities of RNA from each tissue were pooled to construct a cDNA library. Using the Illumina paired-end sequencing technology, we generated a total of 120,137 transcripts with an average length of 1037 bp. Further assembly analysis showed that all contigs contributed to 87,100 unigenes; of these, 16,029 unigenes (18.40% of the total) could be matched in the GenBank non-redundant database. Potential genes and their functions were predicted by GO, KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes with fundamental roles in growth and muscle development, including actin, myosin, tropomyosin, troponin and other potentially important candidate genes, were identified for the first time in this species. Furthermore, 22,673 SSRs and 66,191 high-confidence SNPs were identified in this EST dataset. Conclusion The transcriptome provides invaluable new data as a functional genomics resource for future biological research in Portunus trituberculatus. The data will also inform future functional studies to manipulate or select for genes influencing growth that should find practical applications in aquaculture breeding programs. 
The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will be essential for accelerating aquaculture breeding programs with this species. PMID:24722690
Widana Gamage, Shirani M K; McGrath, Desmond J; Persley, Denis M; Dietzgen, Ralf G
2016-01-01
Capsicum chlorosis virus (CaCV) is an emerging pathogen of capsicum, tomato and peanut crops in Australia and South-East Asia. Commercial capsicum cultivars with CaCV resistance are not yet available, but CaCV resistance identified in Capsicum chinense is being introgressed into commercial Bell capsicum. However, our knowledge of the molecular mechanisms leading to the resistance response to CaCV infection is limited. Therefore, transcriptome and expression profiling data provide an important resource to better understand CaCV resistance mechanisms. We assembled capsicum transcriptomes and analysed gene expression using Illumina HiSeq platform combined with a tag-based digital gene expression system. Total RNA extracted from CaCV/mock inoculated CaCV resistant (R) and susceptible (S) capsicum at the time point when R line showed a strong hypersensitive response to CaCV infection was used in transcriptome assembly. Gene expression profiles of R and S capsicum in CaCV- and buffer-inoculated conditions were compared. None of the genes were differentially expressed (DE) between R and S cultivars when mock-inoculated, while 2484 genes were DE when inoculated with CaCV. Functional classification revealed that the most highly up-regulated DE genes in R capsicum included pathogenesis-related genes, cell death-associated genes, genes associated with hormone-mediated signalling pathways and genes encoding enzymes involved in synthesis of defense-related secondary metabolites. We selected 15 genes to confirm DE expression levels by real-time quantitative PCR. DE transcript profiling data provided comprehensive gene expression information to gain an understanding of the underlying CaCV resistance mechanisms. Further, we identified candidate CaCV resistance genes in the CaCV-resistant C. annuum x C. chinense breeding line. 
This knowledge will be useful in future for fine mapping of the CaCV resistance locus and potential genetic engineering of resistance into CaCV-susceptible crops.
Widana Gamage, Shirani M. K.; McGrath, Desmond J.; Persley, Denis M.
2016-01-01
Background Capsicum chlorosis virus (CaCV) is an emerging pathogen of capsicum, tomato and peanut crops in Australia and South-East Asia. Commercial capsicum cultivars with CaCV resistance are not yet available, but CaCV resistance identified in Capsicum chinense is being introgressed into commercial Bell capsicum. However, our knowledge of the molecular mechanisms leading to the resistance response to CaCV infection is limited. Therefore, transcriptome and expression profiling data provide an important resource to better understand CaCV resistance mechanisms. Methodology/Principal Findings We assembled capsicum transcriptomes and analysed gene expression using Illumina HiSeq platform combined with a tag-based digital gene expression system. Total RNA extracted from CaCV/mock inoculated CaCV resistant (R) and susceptible (S) capsicum at the time point when R line showed a strong hypersensitive response to CaCV infection was used in transcriptome assembly. Gene expression profiles of R and S capsicum in CaCV- and buffer-inoculated conditions were compared. None of the genes were differentially expressed (DE) between R and S cultivars when mock-inoculated, while 2484 genes were DE when inoculated with CaCV. Functional classification revealed that the most highly up-regulated DE genes in R capsicum included pathogenesis-related genes, cell death-associated genes, genes associated with hormone-mediated signalling pathways and genes encoding enzymes involved in synthesis of defense-related secondary metabolites. We selected 15 genes to confirm DE expression levels by real-time quantitative PCR. Conclusion/Significance DE transcript profiling data provided comprehensive gene expression information to gain an understanding of the underlying CaCV resistance mechanisms. Further, we identified candidate CaCV resistance genes in the CaCV-resistant C. annuum x C. chinense breeding line. 
This knowledge will be useful in future for fine mapping of the CaCV resistance locus and potential genetic engineering of resistance into CaCV-susceptible crops. PMID:27398596
Duan, Jun; Ladd, Tim; Doucet, Daniel; Cusson, Michel; vanFrankenhuyzen, Kees; Mittapalli, Omprakash; Krell, Peter J.; Quan, Guoxing
2015-01-01
Background The Emerald ash borer (EAB), Agrilus planipennis, is an invasive phloem-feeding insect pest of ash trees. Since its initial discovery near the Detroit, US- Windsor, Canada area in 2002, the spread of EAB has had strong negative economic, social and environmental impacts in both countries. Several transcriptomes from specific tissues including midgut, fat body and antenna have recently been generated. However, the relatively low sequence depth, gene coverage and completeness limited the usefulness of these EAB databases. Methodology and Principal Findings High-throughput deep RNA-Sequencing (RNA-Seq) was used to obtain 473.9 million pairs of 100 bp length paired-end reads from various life stages and tissues. These reads were assembled into 88,907 contigs using the Trinity strategy and integrated into 38,160 unigenes after redundant sequences were removed. We annotated 11,229 unigenes by searching against the public nr, Swiss-Prot and COG. The EAB transcriptome assembly was compared with 13 other sequenced insect species, resulting in the prediction of 536 unigenes that are Coleoptera-specific. Differential gene expression revealed that 290 unigenes are expressed during larval molting and 3,911 unigenes during metamorphosis from larvae to pupae, respectively (FDR< 0.01 and log2 FC>2). In addition, 1,167 differentially expressed unigenes were identified from larval and adult midguts, 435 unigenes were up-regulated in larval midgut and 732 unigenes were up-regulated in adult midgut. Most of the genes involved in RNA interference (RNAi) pathways were identified, which implies the existence of a system RNAi in EAB. Conclusions and Significance This study provides one of the most fundamental and comprehensive transcriptome resources available for EAB to date. Identification of the tissue- stage- or species- specific unigenes will benefit the further study of gene functions during growth and metamorphosis processes in EAB and other pest insects. PMID:26244979
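The differential-expression cutoffs quoted in the abstract above (FDR < 0.01 and log2 FC > 2) can be sketched as a simple filter. This is only an illustration under stated assumptions: the abstract does not say which statistical package or multiple-testing correction was used, so the Benjamini-Hochberg procedure, the use of the absolute log2 fold change, and the names `benjamini_hochberg` and `select_de` are all hypothetical choices, and the records are made-up values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def select_de(records, fdr_cutoff=0.01, lfc_cutoff=2.0):
    """Keep unigene ids passing both the FDR and |log2 fold change| cutoffs."""
    fdrs = benjamini_hochberg([r["pvalue"] for r in records])
    return [
        r["id"]
        for r, q in zip(records, fdrs)
        if q < fdr_cutoff and abs(r["log2fc"]) > lfc_cutoff
    ]


if __name__ == "__main__":
    # Made-up example records, not values from the study.
    records = [
        {"id": "u1", "pvalue": 0.0001, "log2fc": 3.1},
        {"id": "u2", "pvalue": 0.002, "log2fc": -2.5},
        {"id": "u3", "pvalue": 0.03, "log2fc": 4.0},
        {"id": "u4", "pvalue": 0.5, "log2fc": 0.2},
    ]
    print(select_de(records))
```

Filtering on the absolute fold change keeps both up- and down-regulated unigenes; a one-sided filter would be needed if only up-regulation is of interest.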
Transcriptome Dynamics during Maize Endosperm Development
Feng, Jiaojiao; Xu, Shutu; Wang, Lei; Li, Feifei; Li, Yibo; Zhang, Renhe; Zhang, Xinghua; Xue, Jiquan; Guo, Dongwei
2016-01-01
The endosperm is a major organ of the seed that plays vital roles in determining seed weight and quality. However, genome-wide transcriptome patterns throughout maize endosperm development have not been comprehensively investigated to date. Accordingly, we performed a high-throughput RNA sequencing (RNA-seq) analysis of the maize endosperm transcriptome at 5, 10, 15 and 20 days after pollination (DAP). We found that more than 11,000 protein-coding genes underwent alternative splicing (AS) events during the four developmental stages studied. These genes were mainly involved in intracellular protein transport, signal transmission, cellular carbohydrate metabolism, cellular lipid metabolism, lipid biosynthesis, protein modification, histone modification, cellular amino acid metabolism, and DNA repair. Additionally, 7,633 genes, including 473 transcription factors (TFs), were differentially expressed among the four developmental stages. The differentially expressed TFs were from 50 families, including the bZIP, WRKY, GeBP and ARF families. Further analysis of the stage-specific TFs showed that binding, nucleus and ligand-dependent nuclear receptor activities might be important at 5 DAP, that immune responses, signalling, binding and lumen development are involved at 10 DAP, that protein metabolic processes and the cytoplasm might be important at 15 DAP, and that the responses to various stimuli are different at 20 DAP compared with the other developmental stages. This RNA-seq analysis provides novel, comprehensive insights into the transcriptome dynamics during early endosperm development in maize. PMID:27695101
Rai, Amit; Nakaya, Taiki; Shimizu, Yohei; Rai, Megha; Nakamura, Michimi; Suzuki, Hideyuki; Saito, Kazuki; Yamazaki, Mami
2018-05-29
Lithospermum officinale is a valuable source of bioactive metabolites with medicinal and industrial value. However, little is known about the genes involved in the biosynthesis of these metabolites, primarily due to the lack of genome or transcriptome resources. This study presents the first effort to establish and characterize a de novo transcriptome assembly resource for L. officinale and an expression analysis for three of its tissues, namely leaf, stem, and root. Using over 4 Gbp of RNA-sequencing data, we obtained a de novo transcriptome assembly of L. officinale, consisting of 77,047 unigenes with an assembly N50 value of 1524 bp. Based on transcriptome annotation and functional classification, 52,766 unigenes were assigned putative gene functions, gene ontology terms, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. KEGG pathway and gene ontology enrichment analysis using highly expressed unigenes across three tissues and targeted metabolome analysis showed active secondary metabolic processes enriched specifically in the root of L. officinale. Using co-expression analysis, we also identified 20 and 48 unigenes representing different enzymes of the lithospermic/chlorogenic acid and shikonin biosynthesis pathways, respectively. We further identified 15 candidate unigenes annotated as cytochrome P450 with the highest expression in the root of L. officinale as novel genes with a role in key biochemical reactions toward shikonin biosynthesis. Thus, through this study, we not only generated a high-quality genomic resource for L. officinale but also proposed candidate genes potentially involved in shikonin biosynthesis pathways for further functional characterization. Georg Thieme Verlag KG Stuttgart · New York.
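The N50 statistic quoted in the abstract above is the contig length at which contigs of that length or longer account for at least half of the total assembly length. A minimal sketch of that standard definition follows; the function name `n50` and the toy lengths are illustrative and not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or unigene) lengths."""
    if not lengths:
        raise ValueError("n50 requires at least one length")
    total = sum(lengths)
    running = 0
    # Accumulate from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Toy lengths in bp, not real assembly data.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because N50 depends only on the length distribution, it says nothing about assembly correctness, which is why abstracts typically report annotation rates alongside it.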
Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P
2012-01-01
The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development. PMID:22508961
2010-01-01
Background De novo assembly of transcript sequences produced by short-read DNA sequencing technologies offers a rapid approach to obtain expressed gene catalogs for non-model organisms. A draft genome sequence will be produced in 2010 for a Eucalyptus tree species (E. grandis) representing the most important hardwood fibre crop in the world. Genome annotation of this valuable woody plant and genetic dissection of its superior growth and productivity will be greatly facilitated by the availability of a comprehensive collection of expressed gene sequences from multiple tissues and organs. Results We present an extensive expressed gene catalog for a commercially grown E. grandis × E. urophylla hybrid clone constructed using only Illumina mRNA-Seq technology and de novo assembly. A total of 18,894 transcript-derived contigs, a large proportion of which represent full-length protein coding genes, were assembled and annotated. Analysis of assembly quality, length and diversity shows that this dataset represents the most comprehensive expressed gene catalog for any Eucalyptus tree. mRNA-Seq analysis furthermore allowed digital expression profiling of all of the assembled transcripts across diverse xylogenic and non-xylogenic tissues, which is invaluable for ascribing putative gene functions. Conclusions De novo assembly of Illumina mRNA-Seq reads is an efficient approach for transcriptome sequencing and profiling in Eucalyptus and other non-model organisms. The transcriptome resource (Eucspresso, http://eucspresso.bi.up.ac.za/) generated by this study will be of value for genomic analysis of woody biomass production in Eucalyptus and for comparative genomic analysis of growth and development in woody and herbaceous plants. PMID:21122097
Vathipadiekal, Vinod; Wang, Victoria; Wei, Wei; Waldron, Levi; Drapkin, Ronny; Gillette, Michael; Skates, Steven; Birrer, Michael
2015-11-01
To generate a comprehensive "Secretome" of proteins potentially found in the blood and derive a virtual Affymetrix array. To validate the utility of this database for the discovery of novel serum-based biomarkers using ovarian cancer transcriptomic data. The secretome was constructed by aggregating the data from databases of known secreted proteins, transmembrane or membrane proteins, signal peptides, G-protein coupled receptors, or proteins existing in the extracellular region, and the virtual array was generated by mapping them to Affymetrix probeset identifiers. Whole-genome microarray data from ovarian cancer, normal ovarian surface epithelium, and fallopian tube epithelium were used to identify transcripts upregulated in ovarian cancer. We established the secretome from eight public databases and a virtual array consisting of 16,521 Affymetrix U133 Plus 2.0 probesets. Using ovarian cancer transcriptomic data, we identified candidate blood-based biomarkers for ovarian cancer and performed bioinformatic validation by demonstrating rediscovery of known biomarkers including CA125 and HE4. Two novel top biomarkers (FGF18 and GPR172A) were validated in serum samples from an independent patient cohort. We present the secretome, comprising the most comprehensive resource available for protein products that are potentially found in the blood. The associated virtual array can be used to translate gene-expression data into cancer biomarker discovery. A list of blood-based biomarkers for ovarian cancer detection is reported and includes CA125 and HE4. FGF18 and GPR172A were identified and validated by ELISA as being differentially expressed in the serum of ovarian cancer patients compared with controls. ©2015 American Association for Cancer Research.
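The construction described in the abstract above, aggregating genes from several source databases and mapping them to array probesets, amounts to a set union followed by an identifier lookup. The sketch below is only an assumption about how such a step could look: the function name `build_virtual_array`, the source names and the probeset identifiers are hypothetical placeholders, and only the gene symbols FGF18 and GPR172A come from the abstract.

```python
def build_virtual_array(sources, gene_to_probesets):
    """Union gene sets from all sources and collect their mapped probesets."""
    # Union of every source database's gene set (empty if there are no sources).
    secretome = set().union(*sources.values())
    probesets = set()
    for gene in secretome:
        # Genes without a known mapping simply contribute no probesets.
        probesets.update(gene_to_probesets.get(gene, ()))
    return secretome, probesets


if __name__ == "__main__":
    # Placeholder source databases and probeset ids for illustration only.
    sources = {
        "secreted_db": {"FGF18", "GENE_A"},
        "membrane_db": {"GPR172A", "GENE_A"},
    }
    gene_to_probesets = {
        "FGF18": ["probe_1"],
        "GPR172A": ["probe_2", "probe_3"],
    }
    secretome, probesets = build_virtual_array(sources, gene_to_probesets)
    print(sorted(secretome))
    print(sorted(probesets))
```

Using sets makes overlap between source databases harmless, since a gene listed in several sources is counted once.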
Høgslund, Niels; Radutoiu, Simona; Krusell, Lene; Voroshilova, Vera; Hannah, Matthew A.; Goffard, Nicolas; Sanchez, Diego H.; Lippold, Felix; Ott, Thomas; Sato, Shusei; Tabata, Satoshi; Liboriussen, Poul; Lohmann, Gitte V.; Schauser, Leif; Weiller, Georg F.; Udvardi, Michael K.; Stougaard, Jens
2009-01-01
Genetic analyses of plant symbiotic mutants have led to the identification of key genes involved in Rhizobium-legume communication as well as in development and function of nitrogen fixing root nodules. However, the impact of these genes in coordinating the transcriptional programs of nodule development has only been studied in limited and isolated studies. Here, we present an integrated genome-wide analysis of transcriptome landscapes in Lotus japonicus wild-type and symbiotic mutant plants. Encompassing five different organs, five stages of the sequentially developed determinate Lotus root nodules, and eight mutants impaired at different stages of the symbiotic interaction, our data set integrates an unprecedented combination of organ- or tissue-specific profiles with mutant transcript profiles. In total, 38 different conditions sampled under the same well-defined growth regimes were included. This comprehensive analysis unravelled new and unexpected patterns of transcriptional regulation during symbiosis and organ development. Contrary to expectations, none of the previously characterized nodulins were among the 37 genes specifically expressed in nodules. Another surprise was the extensive transcriptional response in whole root compared to the susceptible root zone where the cellular response is most pronounced. A large number of transcripts predicted to encode transcriptional regulators, receptors and proteins involved in signal transduction, as well as many genes with unknown function, were found to be regulated during nodule organogenesis and rhizobial infection. Combining wild type and mutant profiles of these transcripts demonstrates the activation of a complex genetic program that delineates symbiotic nitrogen fixation. 
The complete data set was organized into an indexed expression directory that is accessible from a resource database, and here we present selected examples of biological questions that can be addressed with this comprehensive and powerful gene expression data set. PMID:19662091
Jayaswall, Kuldip; Mahajan, Pallavi; Singh, Gagandeep; Parmar, Rajni; Seth, Romit; Raina, Aparnashree; Swarnkar, Mohit Kumar; Singh, Anil Kumar; Shankar, Ravi; Sharma, Ram Kumar
2016-01-01
To unravel the molecular mechanism of defense against blister blight (BB) disease caused by an obligate biotrophic fungus, Exobasidium vexans, the transcriptome of the BB interaction with resistant and susceptible tea genotypes was analysed through RNA-seq using Illumina GAIIx at four different stages during the ~20-day disease cycle. Approximately 69 million high quality reads were assembled de novo, yielding 37,790 unique transcripts with more than 55% being functionally annotated. A total of 149 differentially expressed defense-related transcripts/genes, namely defense-related enzymes, resistance genes, multidrug resistant transporters, transcription factors, retrotransposons, metacaspases and chaperones, were observed in the resistant genotype (RG), suggesting their role in defending against BB. Putative master regulators among these candidates, located in the major hub, were identified from a predetermined protein-protein interaction network of Arabidopsis thaliana. Further, confirmation by quantitative real-time PCR of abundant expression of the well-known RPM1, RPS2 and RPP13 genes indicates that salicylic acid and jasmonic acid possibly induce synthesis of antimicrobial compounds required to overcome the virulence of E. vexans. In summary, the current study provides a comprehensive gene expression profile and insights into the molecular mechanism of tea defense against BB to serve as a resource for unravelling the possible regulatory mechanism of immunity against various biotic stresses in tea and other crops. PMID:27465480
Kang, Chunying; Darwish, Omar; Geretz, Aviva; Shahan, Rachel; Alkharouf, Nadim; Liu, Zhongchi
2013-01-01
Fragaria vesca, a diploid woodland strawberry with a small and sequenced genome, is an excellent model for studying fruit development. The strawberry fruit is unique in that the edible flesh is actually enlarged receptacle tissue. The true fruit are the numerous dry achenes dotting the receptacle’s surface. Auxin produced from the achene is essential for the receptacle fruit set, a paradigm for studying crosstalk between hormone signaling and development. To investigate the molecular mechanism underlying strawberry fruit set, next-generation sequencing was employed to profile early-stage fruit development with five fruit tissue types and five developmental stages from floral anthesis to enlarged fruits. This two-dimensional data set provides a systems-level view of molecular events with precise spatial and temporal resolution. The data suggest that the endosperm and seed coat may play a more prominent role than the embryo in auxin and gibberellin biosynthesis for fruit set. A model is proposed to illustrate how hormonal signals produced in the endosperm and seed coat coordinate seed, ovary wall, and receptacle fruit development. The comprehensive fruit transcriptome data set provides a wealth of genomic resources for the strawberry and Rosaceae communities as well as unprecedented molecular insight into fruit set and early stage fruit development. PMID:23898027
Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants.
Li, Xinguo; Wu, Harry X; Southerton, Simon G
2010-06-21
Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. The xylem transcriptome is highly conserved in conifers, but considerably divergent in angiosperms. The functional domains of genes in the xylem transcriptome are moderately to highly conserved in vascular plants, suggesting the existence of a common ancestral xylem transcriptome. Compared to the total transcriptome derived from a range of tissues, the xylem transcriptome is relatively conserved in vascular plants. Of the xylem transcriptome, cell wall genes, ancestral xylem genes, known proteins and transcription factors are relatively more conserved in vascular plants. A total of 527 putative xylem orthologs were identified, which are unevenly distributed across the Arabidopsis chromosomes with eight hot spots observed. Phylogenetic analysis revealed that evolution of the xylem transcriptome has paralleled plant evolution. We also identified 274 conifer-specific xylem unigenes, all of which are of unknown function. These xylem orthologs and conifer-specific unigenes are likely to have played a crucial role in xylem evolution. Conifers have highly conserved xylem transcriptomes, while angiosperm xylem transcriptomes are relatively diversified. Vascular plants share a common ancestral xylem transcriptome. The xylem transcriptomes of vascular plants are more conserved than the total transcriptomes. Evolution of the xylem transcriptome has largely followed the trend of plant evolution.
Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants
2010-01-01
Background Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. Results The xylem transcriptome is highly conserved in conifers, but considerably divergent in angiosperms. The functional domains of genes in the xylem transcriptome are moderately to highly conserved in vascular plants, suggesting the existence of a common ancestral xylem transcriptome. Compared to the total transcriptome derived from a range of tissues, the xylem transcriptome is relatively conserved in vascular plants. Of the xylem transcriptome, cell wall genes, ancestral xylem genes, known proteins and transcription factors are relatively more conserved in vascular plants. A total of 527 putative xylem orthologs were identified, which are unevenly distributed across the Arabidopsis chromosomes with eight hot spots observed. Phylogenetic analysis revealed that evolution of the xylem transcriptome has paralleled plant evolution. We also identified 274 conifer-specific xylem unigenes, all of which are of unknown function. These xylem orthologs and conifer-specific unigenes are likely to have played a crucial role in xylem evolution. Conclusions Conifers have highly conserved xylem transcriptomes, while angiosperm xylem transcriptomes are relatively diversified. Vascular plants share a common ancestral xylem transcriptome. The xylem transcriptomes of vascular plants are more conserved than the total transcriptomes. Evolution of the xylem transcriptome has largely followed the trend of plant evolution. PMID:20565927
Blood transcriptomics and metabolomics for personalized medicine.
Li, Shuzhao; Todor, Andrei; Luo, Ruiyan
2016-01-01
Molecular analysis of blood samples is pivotal to clinical diagnosis and has been intensively investigated since the rise of systems biology. Recent developments have opened new opportunities to utilize transcriptomics and metabolomics for personalized and precision medicine. Efforts from human immunology have infused into this area exquisite characterizations of subpopulations of blood cells. It is now possible to infer from blood transcriptomics, with fine accuracy, the contribution of immune activation and of cell subpopulations. In parallel, high-resolution mass spectrometry has brought revolutionary analytical capability, detecting > 10,000 metabolites, together with environmental exposure, dietary intake, microbial activity, and pharmaceutical drugs. Thus, the re-examination of blood chemicals by metabolomics is in order. Transcriptomics and metabolomics can be integrated to provide a more comprehensive understanding of the human biological states. We will review these new data and methods and discuss how they can contribute to personalized medicine.
USDA-ARS's Scientific Manuscript database
In a collaboration with National Center for Genome Resources and University of Texas at El Paso researchers, we sequenced and assembled the transcriptome of the Haller's organ of an Australian strain (NRFS) of the cattle tick Rhipicephalus microplus (recently reclassified as Rhipicephalus australis...
Single-cell transcriptomics for microbial eukaryotes.
Kolisko, Martin; Boscaro, Vittorio; Burki, Fabien; Lynn, Denis H; Keeling, Patrick J
2014-11-17
One of the greatest hindrances to a comprehensive understanding of microbial genomics, cell biology, ecology, and evolution is that most microbial life is not in culture. Solutions to this problem have mainly focused on whole-community surveys like metagenomics, but these analyses inevitably lose information and present particular challenges for eukaryotes, which are relatively rare and possess large, gene-sparse genomes. Single-cell analyses present an alternative solution that allows for specific species to be targeted, while retaining information on cellular identity, morphology, and partitioning of activities within microbial communities. Single-cell transcriptomics, pioneered in medical research, offers particular potential advantages for uncultivated eukaryotes, but the efficiency and biases have not been tested. Here we describe a simple and reproducible method for single-cell transcriptomics using manually isolated cells from five model ciliate species; we examine impacts of amplification bias and contamination, and compare the efficacy of gene discovery to traditional culture-based transcriptomics. Gene discovery using single-cell transcriptomes was found to be comparable to mass-culture methods, suggesting single-cell transcriptomics is an efficient entry point into genomic data from the vast majority of eukaryotic biodiversity.
Sathyanarayana, N; Pittala, Ranjith Kumar; Tripathi, Pankaj Kumar; Chopra, Ratan; Singh, Heikham Russiachand; Belamkar, Vikas; Bhardwaj, Pardeep Kumar; Doyle, Jeff J; Egan, Ashley N
2017-05-25
The medicinal legume Mucuna pruriens (L.) DC. has attracted attention worldwide as a source of the anti-Parkinson's drug L-Dopa. It is also a popular green manure cover crop that offers many agronomic benefits including high protein content, nitrogen fixation and soil nutrients. The plant currently lacks genomic resources and there is limited knowledge on gene expression, metabolic pathways, and genetics of secondary metabolite production. Here, we present transcriptomic resources for M. pruriens, including a de novo transcriptome assembly and annotation, as well as differential transcript expression analyses between root, leaf, and pod tissues. We also develop microsatellite markers and analyze genetic diversity and population structure within a set of Indian germplasm accessions. A total of 191,233,242 bp cleaned reads were assembled into 67,561 transcripts with a mean length of 626 bp and an N50 of 987 bp. Assembled sequences were annotated using BLASTX against public databases, with over 80% of transcripts annotated. We identified 7,493 simple sequence repeat (SSR) motifs, including 787 polymorphic repeats between the parents of a mapping population. 134 SSRs from expressed sequence tags (ESTs) were screened against 23 M. pruriens accessions from India, with 52 EST-SSRs retained after quality control. Population structure analysis using a Bayesian framework implemented in fastSTRUCTURE showed groupings similar to those from distance-based (neighbor-joining) and principal component analyses, with most of the accessions clustering by geographical origin. Pair-wise comparison of transcript expression in leaves, roots and pods identified 4,387 differentially expressed transcripts, with the highest number occurring between roots and leaves. Differentially expressed transcripts were enriched with transcription factors and transcripts annotated as belonging to secondary metabolite pathways. The M. pruriens transcriptomic resources generated in this study provide foundational resources for gene discovery and development of molecular markers. Polymorphic SSRs identified can be used for genetic diversity, marker-trait analyses, and development of functional markers for crop improvement. The results of differential expression studies can be used to investigate genes involved in L-Dopa synthesis and other key metabolic pathways in M. pruriens.
2014-01-01
Background With its plumage color dimorphism and unique history in North America, including a recent population expansion and an epizootic of Mycoplasma gallisepticum (MG), the house finch (Haemorhous mexicanus) is a model species for studying sexual selection, plumage coloration and host-parasite interactions. As part of our ongoing efforts to make available genomic resources for this species, here we report a transcriptome assembly derived from genes expressed in spleen. Results We characterize transcriptomes from two populations with different histories of demography and disease exposure: a recently founded population in the eastern US that has been exposed to MG for over a decade and a native population from the western range that has never been exposed to MG. We utilize this resource to quantify conservation in gene expression in passerine birds over approximately 50 MY by comparing splenic expression profiles for 9,646 house finch transcripts and those from zebra finch and find that less than half of all genes expressed in spleen in either species are expressed in both species. Comparative gene annotations from several vertebrate species suggest that the house finch transcriptomes contain ~15 genes not yet found in previously sequenced vertebrate genomes. The house finch transcriptomes harbour ~85,000 SNPs, ~20,000 of which are non-synonymous. Although not yet validated by biological or technical replication, we identify a set of genes exhibiting differences between populations in gene expression (n = 182; 2% of all transcripts), allele frequencies (76 FST outliers) and alternative splicing as well as genes with several fixed non-synonymous substitutions; this set includes genes with functions related to double-strand break repair and immune response. Conclusions The two house finch spleen transcriptome profiles will add to the increasing data on genome and transcriptome sequence information from natural populations.
Differences in splenic expression between house finch and zebra finch imply either significant evolutionary turnover of splenic expression patterns or different physiological states of the individuals examined. The transcriptome resource will enhance the potential to annotate an eventual house finch genome, and the set of gene-based high-quality SNPs will help clarify the genetic underpinnings of host-pathogen interactions and sexual selection. PMID:24758272
Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich
2015-12-16
Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, the Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate AmpliSeq as a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variation. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
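For readers unfamiliar with one of the metrics named in the abstract above, the Matthews correlation coefficient can be sketched from a 2x2 confusion matrix as follows. This is only the textbook definition; the abstract does not say how differential-expression calls were thresholded or which platform served as the reference, so the function name `mcc` and the example counts are illustrative assumptions, not the authors' procedure.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are hypothetical counts of genes called differentially
    expressed (or not) by one method and scored against a reference call.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Conventional fallback when any row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Perfect agreement gives 1.0; perfect disagreement gives -1.0.
print(mcc(tp=5, tn=5, fp=0, fn=0))
print(mcc(tp=0, tn=0, fp=5, fn=5))
```

Unlike plain accuracy, this coefficient stays informative when differentially expressed genes are a small minority of all genes, which is presumably why it is reported alongside ROC and RMSD.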
Integrated transcriptomic and proteomic evaluation of gentamicin nephrotoxicity in rats
DOE Office of Scientific and Technical Information (OSTI.GOV)
Com, Emmanuelle, E-mail: emmanuelle.com@univ-rennes1.fr; INSERM U625, Proteomics Core Facility Biogenouest, Rennes; Boitier, Eric
2012-01-01
Gentamicin is an aminoglycoside antibiotic, which induces renal tubular necrosis in rats. In the context of the European InnoMed PredTox project, transcriptomic and proteomic studies were performed to provide new insights into the molecular mechanisms of gentamicin-induced nephrotoxicity. Male Wistar rats were treated with 25 and 75 mg/kg/day subcutaneously for 1, 3 and 14 days. Histopathology observations showed mild tubular degeneration/necrosis and regeneration and moderate mononuclear cell infiltrate after long-term treatment. Transcriptomic data indicated a strong treatment-related gene expression modulation in kidney and blood cells at the high dose after 14 days of treatment, with the regulation of 463 and 3241 genes, respectively. Of note, the induction of the NF-kappa B pathway via the p38 MAPK cascade in the kidney, together with the activation of T-cell receptor signaling in blood cells, were suggestive of inflammatory processes in relation with the recruitment of mononuclear cells in the kidney. Proteomic results showed a regulation of 163 proteins in kidney at the high dose after 14 days of treatment. These protein modulations were suggestive of a mitochondrial dysfunction with impairment of cellular energy production, induction of oxidative stress, an effect on protein biosynthesis and on cellular assembly and organization. Proteomic results also provided clues for potential nephrotoxicity biomarkers such as AGAT and PRBP4 which were strongly modulated in the kidney. Transcriptomic and proteomic data turned out to be complementary and their integration gave a more comprehensive insight into the putative mode of nephrotoxicity of gentamicin which was in accordance with histopathological findings.
Highlights: ► Gentamicin induces renal tubular necrosis in rats. ► The mechanisms of gentamicin nephrotoxicity remain elusive. ► Transcriptomic and proteomic analyses were performed to study this toxicity in rats. ► Transcriptomic and proteomic data turned out to be complementary and are integrated. ► A more comprehensive putative model of nephrotoxicity of gentamicin is presented.
Graupner, Nadine; Bock, Christina; Wodniok, Sabina; Grossmann, Lars; Vos, Matthijs; Sures, Bernd
2017-01-01
Background Chrysophytes are protist model species in ecology and ecophysiology and important grazers of bacteria-sized microorganisms and primary producers. However, they have not yet been investigated in detail at the molecular level, and no genomic and only little transcriptomic information is available. Chrysophytes exhibit different trophic modes: while phototrophic chrysophytes perform only photosynthesis, mixotrophs can gain carbon from bacterial food as well as from photosynthesis, and heterotrophs solely feed on bacteria-sized microorganisms. Recent phylogenies and megasystematics demonstrate an immense complexity of eukaryotic diversity with numerous transitions between phototrophic and heterotrophic organisms. The question we aim to answer is how the diverse nutritional strategies, accompanied or brought about by a reduction of the plasmid and size reduction in heterotrophic strains, affect physiology and molecular processes. Results We sequenced the mRNA of 18 chrysophyte strains on the Illumina HiSeq platform and analysed the transcriptomes to determine relations between the trophic mode (mixotrophic vs. heterotrophic) and gene expression. We observed an enrichment of genes for photosynthesis, porphyrin and chlorophyll metabolism for phototrophic and mixotrophic strains that can perform photosynthesis. Genes involved in nutrient absorption, environmental information processing and various transporters (e.g., monosaccharide, peptide, lipid transporters) were present or highly expressed only in heterotrophic strains that have to sense, digest and absorb bacterial food. We furthermore present a transcriptome-based alignment-free phylogeny construction approach using transcripts assembled from short reads to determine the evolutionary relationships between the strains and the possible influence of nutritional strategies on the reconstructed phylogeny. 
We discuss the resulting phylogenies in comparison to those from established approaches based on ribosomal RNA and orthologous genes. Finally, we make functionally annotated reference transcriptomes of each strain available to the community, significantly enhancing publicly available data on Chrysophyceae. Conclusions Our study is the first comprehensive transcriptomic characterisation of a diverse set of Chrysophyceaen strains. In addition, we showcase the possibility of inferring phylogenies from assembled transcriptomes using an alignment-free approach. The raw and functionally annotated data we provide will prove beneficial for further examination of the diversity within this taxon. Our molecular characterisation of different trophic modes presents a first such example. PMID:28097055
iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.
Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi
2018-01-01
We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14,000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.
Conrads, Kai H.; Roth, Siegfried; Lynch, Jeremy A.
2016-01-01
Despite recent efforts to sample broadly across metazoan and insect diversity, current sequence resources in the Coleoptera do not adequately describe the diversity of the clade. Here we present deep, staged transcriptomic data for two coleopteran species, Atrachya menetriesi (Faldermann 1835) and Callosobruchus maculatus (Fabricius 1775). Our sampling covered key stages in ovary and early embryonic development in each species. We utilized this data to build combined assemblies for each species which were then analysed in detail. The combined A. menetriesi assembly consists of 228,096 contigs with an N50 of 1,598 bp, while the combined C. maculatus assembly consists of 128,837 contigs with an N50 of 2,263 bp. For these assemblies, 34.6% and 32.4% of contigs were identified using Blast2GO, and 97% and 98.3% of the BUSCO set of metazoan orthologs were present, respectively. We also carried out manual annotation of developmental signalling pathways and found that nearly all expected genes were present in each transcriptome. Our analyses show that both transcriptomes are of high quality. Lastly, we performed read mapping utilising our timed, stage specific RNA samples to identify differentially expressed contigs. The resources presented here will provide a firm basis for a variety of experimentation, both in developmental biology and in comparative genomic studies. PMID:27907180
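The N50 values quoted for the two assemblies above can be read with the usual definition in mind: the contig length at which the longest contigs, added up in descending order, first cover at least half of the total assembled bases. A minimal sketch of that definition follows; the function name `n50` and the toy lengths are illustrative assumptions, and assemblers or QC tools may apply their own minimum-length filters before reporting the statistic.

```python
def n50(contig_lengths):
    """Return the N50 of an iterable of contig lengths (in bp).

    Contigs are sorted from longest to shortest and accumulated until the
    running total reaches half of the summed assembly length; the length of
    the contig that crosses that threshold is the N50.
    """
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one contig length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example: total is 54 bp, so the threshold is 27 bp (10 + 9 + 8).
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because the statistic is weighted by length, a few long contigs can raise N50 even when the contig count is large, which is worth remembering when comparing the two assemblies on N50 alone.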
Zhu, Haisun; Casselman, Amy; Reppert, Steven M.
2008-01-01
North American monarch butterflies (Danaus plexippus) undergo a spectacular fall migration. In contrast to summer butterflies, migrants are juvenile hormone (JH) deficient, which leads to reproductive diapause and increased longevity. Migrants also utilize time-compensated sun compass orientation to help them navigate to their overwintering grounds. Here, we describe a brain expressed sequence tag (EST) resource to identify genes involved in migratory behaviors. A brain EST library was constructed from summer and migrating butterflies. Of 9,484 unique sequences, 6,068 had positive hits with the non-redundant protein database; the EST database likely represents ∼52% of the gene-encoding potential of the monarch genome. The brain transcriptome was cataloged using Gene Ontology and compared to Drosophila. Monarch genes were well represented, including those implicated in behavior. Three genes involved in increased JH activity (allatotropin, juvenile hormone acid methyltransferase, and takeout) were upregulated in summer butterflies, compared to migrants. The locomotion-relevant turtle gene was marginally upregulated in migrants, while the foraging and single-minded genes were not differentially regulated. Many of the genes important for the monarch circadian clock mechanism (involved in sun compass orientation) were in the EST resource, including the newly identified cryptochrome 2. The EST database also revealed a novel Na+/K+ ATPase allele predicted to be more resistant to the toxic effects of milkweed than that reported previously. Potential genetic markers were identified from 3,486 EST contigs and included 1,599 double-hit single nucleotide polymorphisms (SNPs) and 98 microsatellite polymorphisms. These data provide a template of the brain transcriptome for the monarch butterfly.
Our “snap-shot” analysis of the differential regulation of candidate genes between summer and migratory butterflies suggests that unbiased, comprehensive transcriptional profiling will inform the molecular basis of migration. The identified SNPs and microsatellite polymorphisms can be used as genetic markers to address questions of population and subspecies structure. PMID:18183285
Gonzalez, Sergio; Clavijo, Bernardo; Rivarola, Máximo; Moreno, Patricio; Fernandez, Paula; Dopazo, Joaquín; Paniego, Norma
2017-02-22
In recent years, applications based on massively parallel RNA sequencing (RNA-seq) have become valuable approaches for studying non-model species, e.g., those without a fully sequenced genome. RNA-seq is a useful tool for detecting novel transcripts and genetic variations and for evaluating differential gene expression by digital measurements. The large and complex datasets resulting from functional genomic experiments represent a challenge in data processing, management, and analysis. This problem is especially significant for small research groups working with non-model species. We developed a web-based application, called ATGC transcriptomics, with a flexible and adaptable interface that allows users to work with next-generation sequencing (NGS) transcriptomic analysis results using an ontology-driven database. This new application simplifies data exploration, visualization, and integration for a better comprehension of the results. ATGC transcriptomics gives non-expert computer users and small research groups access to a scalable storage option and simple data integration, including database administration and management. The software is freely available under the terms of the GNU public license at http://atgcinta.sourceforge.net.
Góngora-Castillo, Elsa; Childs, Kevin L.; Fedewa, Greg; Hamilton, John P.; Liscombe, David K.; Magallanes-Lundback, Maria; Mandadi, Kranthi K.; Nims, Ezekiel; Runguphan, Weerawat; Vaillancourt, Brieanne; Varbanova-Herde, Marina; DellaPenna, Dean; McKnight, Thomas D.; O’Connor, Sarah; Buell, C. Robin
2012-01-01
The natural diversity of plant metabolism has long been a source for human medicines. One group of plant-derived compounds, the monoterpene indole alkaloids (MIAs), includes well-documented therapeutic agents used in the treatment of cancer (vinblastine, vincristine, camptothecin), hypertension (reserpine, ajmalicine), malaria (quinine), and as analgesics (7-hydroxymitragynine). Our understanding of the biochemical pathways that synthesize these commercially relevant compounds is incomplete due in part to a lack of molecular, genetic, and genomic resources for the identification of the genes involved in these specialized metabolic pathways. To address these limitations, we generated large-scale transcriptome sequence and expression profiles for three species of Asterids that produce medicinally important MIAs: Camptotheca acuminata, Catharanthus roseus, and Rauvolfia serpentina. Using next generation sequencing technology, we sampled the transcriptomes of these species across a diverse set of developmental tissues, and in the case of C. roseus, in cultured cells and roots following elicitor treatment. Through an iterative assembly process, we generated robust transcriptome assemblies for all three species with a substantial number of the assembled transcripts being full or near-full length. The majority of transcripts had a related sequence in either UniRef100, the Arabidopsis thaliana predicted proteome, or the Pfam protein domain database; however, we also identified transcripts that lacked similarity with entries in either database and thereby lack a known function. Representation of known genes within the MIA biosynthetic pathway was robust. As a diverse set of tissues and treatments were surveyed, expression abundances of transcripts in the three species could be estimated to reveal transcripts associated with development and response to elicitor treatment. 
Together, these transcriptomes and expression abundance matrices provide a rich resource for understanding plant specialized metabolism, and promote realization of innovative production systems for plant-derived pharmaceuticals. PMID:23300689
Transcriptome and Proteome Exploration to Provide a Resource for the Study of Agrocybe aegerita
Jiang, Shuai; Chen, Yijie; Yin, Yalin; Pan, Yongfu; Yu, Guojun; Li, Yamu; Wong, Barry Hon Cheung; Liang, Yi; Sun, Hui
2013-01-01
Background: Agrocybe aegerita, the black poplar mushroom, has been highly valued as a functional food for its medicinal and nutritional benefits. Several bioactive extracts from A. aegerita have been found to exhibit antitumor and antioxidant activities. However, limited genetic resources for A. aegerita have hindered exploration of this species. Methodology/Principal Findings: To facilitate the research on A. aegerita, we established a deep survey of the transcriptome and proteome of this mushroom. We applied high-throughput sequencing technology (Illumina) to sequence A. aegerita transcriptomes from mycelium and fruiting body. The clean reads were de novo assembled into a total of 36,134 expressed sequence tags (ESTs) with an average length of 663 bp. These ESTs were annotated and classified according to Gene Ontology (GO), Clusters of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways. Gene expression profile analysis showed that 18,474 ESTs were differentially expressed, with 10,131 up-regulated in mycelium and 8,343 up-regulated in fruiting body. Putative genes involved in polysaccharide and steroid biosynthesis were identified from the A. aegerita transcriptome, and these genes were differentially expressed at the two stages of A. aegerita. Based on one-dimensional gel electrophoresis (1-DGE) coupled with electrospray ionization liquid chromatography tandem MS (LC-ESI-MS/MS), we identified a total of 309 non-redundant proteins. Many metabolic enzymes involved in glycolysis were also identified in the protein database. Conclusions/Significance: This is the first study on transcriptome and proteome analyses of A. aegerita. The data in this study serve as a resource of A. aegerita transcripts and proteins, and offer clues to the applications of this mushroom in nutrition, pharmacy and industry. PMID:23418592
Leslie, Trent; Baucom, Regina S.
2014-01-01
Human-mediated selection can lead to rapid evolution in very short time scales, and the evolution of herbicide resistance in agricultural weeds is an excellent example of this phenomenon. The common morning glory, Ipomoea purpurea, is resistant to the herbicide glyphosate, but genetic investigations of this trait have been hampered by the lack of genomic resources for this species. Here, we present the annotated transcriptome of the common morning glory, Ipomoea purpurea, along with an examination of whole genome expression profiling to assess potential gene expression differences between three artificially selected herbicide resistant lines and three susceptible lines. The assembled Ipomoea transcriptome reported in this work contains 65,459 assembled transcripts, ~28,000 of which were functionally annotated by assignment to Gene Ontology categories. Our RNA-seq survey using this reference transcriptome identified 19 differentially expressed genes associated with resistance—one of which, a cytochrome P450, belongs to a large plant family of genes involved in xenobiotic detoxification. The differentially expressed genes also broadly implicated receptor-like kinases, which were down-regulated in the resistant lines, and other growth and defense genes, which were up-regulated in resistant lines. Interestingly, the target of glyphosate—EPSP synthase—was not overexpressed in the resistant Ipomoea lines as in other glyphosate resistant weeds. Overall, this work identifies potential candidate resistance loci for future investigations and dramatically increases genomic resources for this species. The assembled transcriptome presented herein will also provide a valuable resource to the Ipomoea community, as well as to those interested in utilizing the close relationship between the Convolvulaceae and the Solanaceae for phylogenetic and comparative genomics examinations. PMID:25155274
Leslie, Trent; Baucom, Regina S
2014-08-25
Human-mediated selection can lead to rapid evolution in very short time scales, and the evolution of herbicide resistance in agricultural weeds is an excellent example of this phenomenon. The common morning glory, Ipomoea purpurea, is resistant to the herbicide glyphosate, but genetic investigations of this trait have been hampered by the lack of genomic resources for this species. Here, we present the annotated transcriptome of the common morning glory, Ipomoea purpurea, along with an examination of whole genome expression profiling to assess potential gene expression differences between three artificially selected herbicide resistant lines and three susceptible lines. The assembled Ipomoea transcriptome reported in this work contains 65,459 assembled transcripts, ~28,000 of which were functionally annotated by assignment to Gene Ontology categories. Our RNA-seq survey using this reference transcriptome identified 19 differentially expressed genes associated with resistance, one of which, a cytochrome P450, belongs to a large plant family of genes involved in xenobiotic detoxification. The differentially expressed genes also broadly implicated receptor-like kinases, which were down-regulated in the resistant lines, and other growth and defense genes, which were up-regulated in resistant lines. Interestingly, the target of glyphosate, EPSP synthase, was not overexpressed in the resistant Ipomoea lines as in other glyphosate resistant weeds. Overall, this work identifies potential candidate resistance loci for future investigations and dramatically increases genomic resources for this species. The assembled transcriptome presented herein will also provide a valuable resource to the Ipomoea community, as well as to those interested in utilizing the close relationship between the Convolvulaceae and the Solanaceae for phylogenetic and comparative genomics examinations. Copyright © 2014 Leslie and Baucom.
Transcriptome and proteome exploration to provide a resource for the study of Agrocybe aegerita.
Wang, Man; Gu, Bianli; Huang, Jie; Jiang, Shuai; Chen, Yijie; Yin, Yalin; Pan, Yongfu; Yu, Guojun; Li, Yamu; Wong, Barry Hon Cheung; Liang, Yi; Sun, Hui
2013-01-01
Agrocybe aegerita, the black poplar mushroom, has been highly valued as a functional food for its medicinal and nutritional benefits. Several bioactive extracts from A. aegerita have been found to exhibit antitumor and antioxidant activities. However, limited genetic resources for A. aegerita have hindered exploration of this species. To facilitate the research on A. aegerita, we established a deep survey of the transcriptome and proteome of this mushroom. We applied high-throughput sequencing technology (Illumina) to sequence A. aegerita transcriptomes from mycelium and fruiting body. The clean reads were de novo assembled into a total of 36,134 expressed sequence tags (ESTs) with an average length of 663 bp. These ESTs were annotated and classified according to Gene Ontology (GO), Clusters of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways. Gene expression profile analysis showed that 18,474 ESTs were differentially expressed, with 10,131 up-regulated in mycelium and 8,343 up-regulated in fruiting body. Putative genes involved in polysaccharide and steroid biosynthesis were identified from the A. aegerita transcriptome, and these genes were differentially expressed at the two stages of A. aegerita. Based on one-dimensional gel electrophoresis (1-DGE) coupled with electrospray ionization liquid chromatography tandem MS (LC-ESI-MS/MS), we identified a total of 309 non-redundant proteins. Many metabolic enzymes involved in glycolysis were also identified in the protein database. This is the first study on transcriptome and proteome analyses of A. aegerita. The data in this study serve as a resource of A. aegerita transcripts and proteins, and offer clues to the applications of this mushroom in nutrition, pharmacy and industry.
Transcriptome of the Antarctic brooding gastropod mollusc Margarella antarctica.
Clark, Melody S; Thorne, Michael A S
2015-12-01
454 RNA-Seq transcriptome data were generated from foot tissue of the Antarctic brooding gastropod mollusc Margarella antarctica. A total of 6195 contigs were assembled de novo, providing a useful resource for researchers with an interest in Antarctic marine species, phylogenetics and mollusc biology, especially shell production. Copyright © 2015 Elsevier B.V. All rights reserved.
Detecting specific infections in children through host responses: a paradigm shift.
Mejias, Asuncion; Suarez, Nicolas M; Ramilo, Octavio
2014-06-01
There is a need for improved diagnosis and for optimal classification of patients with infectious diseases. An alternative approach to the pathogen-detection strategy is based on a comprehensive analysis of the host response to the infection. This review focuses on the value of transcriptome analyses of blood leukocytes for the diagnosis and management of patients with infectious diseases. Initial studies showed that RNA from blood leukocytes of children with acute viral and bacterial infections carried pathogen-specific transcriptional signatures. Subsequently, transcriptional signatures for several other infections have been described and validated in humans with malaria, dengue, salmonella, melioidosis, respiratory syncytial virus, influenza, tuberculosis, and HIV. In addition, transcriptome analyses represent an invaluable tool to understand disease pathogenesis and to objectively classify patients according to the clinical severity. Microarray studies have been shown to be highly reproducible using different platforms, and in different patient populations, confirming the value of blood transcriptome analyses to study pathogen-specific host immune responses in the clinical setting. Combining the detection of the pathogen with a comprehensive assessment of the host immune response will provide a new understanding of the correlations between specific causative agents, the host response, and the clinical manifestations of the disease.
2012-01-01
Background: Filamentous fungi are confronted with changes and limitations of their carbon source during growth in their natural habitats and during industrial applications. To survive life-threatening starvation conditions, carbon from endogenous resources becomes mobilized to fuel maintenance and self-propagation. Key to understanding the underlying cellular processes is the system-wide analysis of fungal starvation responses at temporal and spatial resolution. The knowledge deduced is important for the development of optimized industrial production processes. Results: This study describes the physiological, morphological and genome-wide transcriptional changes caused by prolonged carbon starvation during submerged batch cultivation of the filamentous fungus Aspergillus niger. Bioreactor cultivation supported highly reproducible growth conditions and monitoring of physiological parameters. Changes in hyphal growth and morphology were analyzed at distinct cultivation phases using automated image analysis. The Affymetrix GeneChip platform was used to establish genome-wide transcriptional profiles for three selected time points during prolonged carbon starvation. Compared to the exponential growth transcriptome, about 50% (7,292) of all genes displayed differential gene expression during at least one of the starvation time points. Enrichment analysis of Gene Ontology, Pfam domain and KEGG pathway annotations uncovered autophagy and asexual reproduction as major global transcriptional trends. Induced transcription of genes encoding hydrolytic enzymes was accompanied by increased secretion of hydrolases including chitinases, glucanases, proteases and phospholipases as identified by mass spectrometry. Conclusions: This study is the first system-wide analysis of the carbon starvation response in a filamentous fungus. Morphological, transcriptomic and secretomic analyses identified key events important for fungal survival and their chronology. 
The dataset obtained forms a comprehensive framework for further elucidation of the interrelation and interplay of the individual cellular events involved. PMID:22873931
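The enrichment analysis mentioned in this abstract is commonly framed as a one-sided hypergeometric test. The following is a minimal sketch of that general idea, not the authors' actual procedure (the abstract does not state which test or software was used). The function name and all numbers in the usage example are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, annotated_genes, selected_genes, selected_annotated):
    """One-sided hypergeometric p-value, P(X >= selected_annotated).

    total_genes        -- genes in the background set
    annotated_genes    -- background genes carrying the annotation (e.g. one GO term)
    selected_genes     -- genes in the list of interest (e.g. differentially expressed)
    selected_annotated -- genes in the list of interest that carry the annotation
    """
    upper = min(annotated_genes, selected_genes)
    denominator = comb(total_genes, selected_genes)
    # Sum the probability mass from the observed overlap up to the largest possible overlap.
    tail = sum(
        comb(annotated_genes, k) * comb(total_genes - annotated_genes, selected_genes - k)
        for k in range(selected_annotated, upper + 1)
    )
    return tail / denominator

if __name__ == "__main__":
    # Hypothetical toy numbers, chosen only for illustration.
    p = hypergeom_enrichment_p(total_genes=1000, annotated_genes=40,
                               selected_genes=100, selected_annotated=12)
    print(f"enrichment p-value: {p:.3g}")
```

In practice such p-values would be computed for every annotation term and then corrected for multiple testing; that step is omitted here.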
Yamamoto, Naoki; Takano, Tomoyuki; Tanaka, Keisuke; Ishige, Taichiro; Terashima, Shin; Endo, Chisato; Kurusu, Takamitsu; Yajima, Shunsuke; Yano, Kentaro; Tada, Yuichi
2015-01-01
The turf grass Sporobolus virginicus is a halophyte and has high salinity tolerance. To investigate the molecular basis of its remarkable tolerance, we performed Illumina high-throughput RNA sequencing on roots and shoots of an S. virginicus genotype under normal and saline conditions. The 130 million short reads were assembled into 444,242 unigenes. A comparative analysis of the transcriptome with the rice and Arabidopsis transcriptomes revealed six turf grass-specific unigenes encoding transcription factors. Interestingly, all of them showed root-specific expression and five of them encode bZIP-type transcription factors. Another remarkable transcriptional feature of S. virginicus was activation of specific pathways under salinity stress. Pathway enrichment analysis suggested transcriptional activation of amino acid, pyruvate, and phospholipid metabolism. Up-regulation of several unigenes previously shown to respond to salt stress in other halophytes was also observed. Gene Ontology enrichment analysis revealed that unigenes assigned as proteins in response to water stress, such as dehydrin and aquaporin, and transporters such as cation, amino acid, and citrate transporters, and H+-ATPase, were up-regulated in both shoots and roots under salinity. A correspondence analysis of the pathways enriched in turf grass cells, but not in rice cells, revealed two groups of unigenes similarly up-regulated in the turf grass in response to salt stress; one of the groups, showing excessive up-regulation under salinity, included unigenes homologous to salinity-responsive genes in other halophytes. Thus, the present study identified candidate genes involved in salt tolerance of S. virginicus. This genetic resource should be valuable for understanding the mechanisms underlying high salt tolerance in S. virginicus. This information can also provide insight into salt tolerance in other halophytes. PMID:25954282
Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng
2017-01-01
Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar ‘Fusi-3’. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1–6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism. PMID:29145430
PATRIC, the bacterial bioinformatics database and analysis resource
Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.
2014-01-01
The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323
Bar-Yaacov, Dan; Bouskila, Amos; Mishmar, Dan
2013-01-01
Recently, we found dramatic mitochondrial DNA divergence of Israeli Chamaeleo chamaeleon populations into two geographically distinct groups. We aimed to examine whether the same pattern of divergence could be found in nuclear genes. However, no genomic resource is available for any chameleon species. Here we present the first chameleon transcriptome, obtained using deep sequencing (SOLiD). Our analysis identified 164,000 sequence contigs of which 19,000 yielded unique BlastX hits. To test the efficacy of our sequencing effort, we examined whether the chameleon and other available reptilian transcriptomes harbored complete sets of genes comprising known biochemical pathways, focusing on the nDNA-encoded oxidative phosphorylation (OXPHOS) genes as a model. As a reference for the screen, we used the human 86 (including isoforms) known structural nDNA-encoded OXPHOS subunits. Analysis of 34 publicly available vertebrate transcriptomes revealed orthologs for most human OXPHOS genes. However, OXPHOS subunit COX8 (Cytochrome C oxidase subunit 8), including all its known isoforms, was consistently absent in transcriptomes of iguanian lizards, implying loss of this subunit during the radiation of this suborder. The lack of COX8 in the suborder Iguania is intriguing, since it is important for cellular respiration and ATP production. Our sequencing effort added a new resource for comparative genomic studies, and shed new light on the evolutionary dynamics of the OXPHOS system. PMID:24009133
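The screen described above asks which reference genes have no ortholog hit in any transcriptome of a given clade. Below is a minimal sketch of that set logic, assuming ortholog hits have already been tabulated per species. The function name, species labels, and gene lists are hypothetical toy values, not data from the study.

```python
def consistently_absent(reference_genes, hits_by_species, species_subset):
    """Return reference genes with no hit in any species of the subset.

    reference_genes -- iterable of gene names used as the screen reference
    hits_by_species -- dict mapping species name -> set of reference genes with a hit
    species_subset  -- iterable of species names forming the clade of interest
    """
    absent = set(reference_genes)
    for species in species_subset:
        # A gene found in even one species of the clade is not "consistently absent".
        absent -= hits_by_species.get(species, set())
    return absent

if __name__ == "__main__":
    # Hypothetical toy data for illustration only.
    reference = ["GENE_A", "GENE_B", "GENE_C"]
    hits = {
        "lizard_1": {"GENE_A", "GENE_B"},
        "lizard_2": {"GENE_A"},
        "snake_1": {"GENE_A", "GENE_B", "GENE_C"},
    }
    print(sorted(consistently_absent(reference, hits, ["lizard_1", "lizard_2"])))
```

Absence from a transcriptome can also reflect low expression or incomplete assembly, so a result like this is a candidate for follow-up rather than proof of gene loss.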
Kim, Mi Ae; Rhee, Jae-Sung; Kim, Tae Ha; Lee, Jung Sick; Choi, Ah-Young; Choi, Beom-Soon; Choi, Ik-Young; Sohn, Young Chang
2017-03-09
In order to characterize the female or male transcriptome of the Pacific abalone and further increase genomic resources, we sequenced the mRNA of full-length complementary DNA (cDNA) libraries derived from pooled tissues of female and male Haliotis discus hannai by employing the Iso-Seq protocol of the PacBio RSII platform. We successfully assembled whole full-length cDNA sequences and constructed a transcriptome database that included isoform information. After clustering, a total of 15,110 and 12,145 genes that coded for proteins were identified in female and male abalones, respectively. A total of 13,057 putative orthologs were retained from each transcriptome in abalones. Overall Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways analyzed in each database showed a similar composition between sexes. In addition, a total of 519 and 391 isoforms were genome-widely identified with at least two isoforms from female and male transcriptome databases. We found that the number of isoforms and their alternatively spliced patterns are variable and sex-dependent. This information represents the first significant contribution to sex-preferential genomic resources of the Pacific abalone. The availability of whole female and male transcriptome database and their isoform information will be useful to improve our understanding of molecular responses and also for the analysis of population dynamics in the Pacific abalone.
Kim, Mi Ae; Rhee, Jae-Sung; Kim, Tae Ha; Lee, Jung Sick; Choi, Ah-Young; Choi, Beom-Soon; Choi, Ik-Young; Sohn, Young Chang
2017-01-01
In order to characterize the female or male transcriptome of the Pacific abalone and further increase genomic resources, we sequenced the mRNA of full-length complementary DNA (cDNA) libraries derived from pooled tissues of female and male Haliotis discus hannai by employing the Iso-Seq protocol of the PacBio RSII platform. We successfully assembled whole full-length cDNA sequences and constructed a transcriptome database that included isoform information. After clustering, a total of 15,110 and 12,145 genes that coded for proteins were identified in female and male abalones, respectively. A total of 13,057 putative orthologs were retained from each transcriptome in abalones. Overall Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways analyzed in each database showed a similar composition between sexes. In addition, a total of 519 and 391 isoforms were genome-widely identified with at least two isoforms from female and male transcriptome databases. We found that the number of isoforms and their alternatively spliced patterns are variable and sex-dependent. This information represents the first significant contribution to sex-preferential genomic resources of the Pacific abalone. The availability of whole female and male transcriptome database and their isoform information will be useful to improve our understanding of molecular responses and also for the analysis of population dynamics in the Pacific abalone. PMID:28282934
A comprehensive catalogue of the coding and non-coding transcripts of the human inner ear
Corneveaux, Jason J.; Ohmen, Jeffrey; White, Cory; Allen, April N.; Lusis, Aldons J.; Van Camp, Guy; Huentelman, Matthew J.; Friedman, Rick A.
2015-01-01
The mammalian inner ear consists of the cochlea and the vestibular labyrinth (utricle, saccule, and semicircular canals), which participate in both hearing and balance. Proper development and life-long function of these structures involves a highly complex coordinated system of spatial and temporal gene expression. The characterization of the inner ear transcriptome is likely important for the functional study of auditory and vestibular components, yet, primarily due to tissue unavailability, detailed expression catalogues of the human inner ear remain largely incomplete. We report here, for the first time, comprehensive transcriptome characterization of the adult human cochlea, ampulla, saccule and utricle of the vestibule obtained from patients without hearing abnormalities. Using RNA-Seq, we measured the expression of >50,000 predicted genes corresponding to approximately 200,000 transcripts, in the adult inner ear and compared it to 32 other human tissues. First, we identified genes preferentially expressed in the inner ear, and unique either to the vestibule or cochlea. Next, we examined expression levels of specific groups of potentially interesting RNAs, such as genes implicated in hearing loss, long non-coding RNAs, pseudogenes and transcripts subject to nonsense mediated decay (NMD). We uncover the spatial specificity of expression of these RNAs in the hearing/balance system, and reveal evidence of tissue specific NMD. Lastly, we investigated the non-syndromic deafness loci to which no gene has been mapped, and narrow the list of potential candidates for each locus. These data represent the first high-resolution transcriptome catalogue of the adult human inner ear. A comprehensive identification of coding and non-coding RNAs in the inner ear will enable pathways of auditory and vestibular function to be further defined in the study of hearing and balance. 
Expression data are freely accessible at https://www.tgen.org/home/research/research-divisions/neurogenomics/supplementary-data/inner-ear-transcriptome.aspx PMID:26341477
Lin, Zixin; An, Jiyong; Wang, Jia; Niu, Jun; Ma, Chao; Wang, Libing; Yuan, Guanshen; Shi, Lingling; Liu, Lili; Zhang, Jinsong; Zhang, Zhixiang; Qi, Ji; Lin, Shanzhi
2017-01-01
Lindera glauca fruit with high quality and quantity of oil has emerged as a novel potential source of biodiesel in China, but the molecular regulatory mechanism of carbon flux and energy source for oil biosynthesis in developing fruits is still unknown. To better develop fruit oils of L. glauca as woody biodiesel, a combination of two different sequencing platforms (454 and Illumina) and qRT-PCR analysis was used to define a minimal reference transcriptome of developing L. glauca fruits, and to construct carbon and energy metabolic model for regulation of carbon partitioning and energy supply for FA biosynthesis and oil accumulation. We first analyzed the dynamic patterns of growth tendency, oil content, FA compositions, biodiesel properties, and the contents of ATP and pyridine nucleotide of L. glauca fruits from seven different developing stages. Comprehensive characterization of transcriptome of the developing L. glauca fruit was performed using a combination of two different next-generation sequencing platforms, of which three representative fruit samples (50, 125, and 150 DAF) and one mixed sample from seven developing stages were selected for Illumina and 454 sequencing, respectively. The unigenes separately obtained from long and short reads (201, and 259, respectively, in total) were reconciled using TGICL software, resulting in a total of 60,031 unigenes (mean length = 1061.95 bp) to describe a transcriptome for developing L. glauca fruits. Notably, 198 genes were annotated for photosynthesis, sucrose cleavage, carbon allocation, metabolite transport, acetyl-CoA formation, oil synthesis, and energy metabolism, among which some specific transporters, transcription factors, and enzymes were identified to be implicated in carbon partitioning and energy source for oil synthesis by an integrated analysis of transcriptomic sequencing and qRT-PCR. Importantly, the carbon and energy metabolic model was well established for oil biosynthesis of developing L. 
glauca fruits, which could help to reveal the molecular regulatory mechanism of the increased oil production in developing fruits. This study presents for the first time the integrated application of two different sequencing platforms (Illumina and 454) and qRT-PCR detection to define a minimal reference transcriptome for developing L. glauca fruits, and to elucidate the molecular regulatory mechanism of carbon flux control and energy provision for oil synthesis. Our results will provide a valuable resource for future fundamental and applied research on woody biodiesel plants.
Genomic Resources Notes Accepted 1 June 2015-31 July 2015.
Álvarez, P; Arthofer, Wolfgang; Coelho, Maria M; Conklin, D; Estonba, A; Grosso, Ana R; Helyar, S J; Langa, J; Machado, Miguel P; Montes, I; Pinho, Joana; Rief, Alexander; Schartl, Manfred; Schlick-Steiner, Birgit C; Seeber, Julia; Steiner, Florian M; Vilas, C
2015-11-01
This article documents the public availability of (i) microbiomes in diet and gut of larvae from the dipteran Dilophus febrilis using massive parallel sequencing, (ii) SNP and SSR discovery and characterization in the transcriptome of the Atlantic mackerel (Scomber scombrus, L) and (iii) assembled transcriptome for an endangered, endemic Iberian cyprinid fish (Squalius pyrenaicus). © 2015 John Wiley & Sons Ltd.
USDA-ARS's Scientific Manuscript database
This study reports the generation of large-scale genomic resources for pigeonpea, a so-called ‘orphan crop species’ of the semi-arid tropical regions. Roche FLX/454 sequencing of a normalized cDNA pool prepared from 31 tissues produced 494,353 short transcript reads (STRs). Cluster analysi...
Single-cell Transcriptome Study as Big Data
Yu, Pingjian; Lin, Wei
2016-01-01
The rapid growth of single-cell RNA-seq studies (scRNA-seq) demands efficient data storage, processing, and analysis. Big-data technology provides a framework that facilitates the comprehensive discovery of biological signals from inter-institutional scRNA-seq datasets. The strategies to solve the stochastic and heterogeneous single-cell transcriptome signal are discussed in this article. After extensively reviewing the available big-data applications of next-generation sequencing (NGS)-based studies, we propose a workflow that accounts for the unique characteristics of scRNA-seq data and primary objectives of single-cell studies. PMID:26876720
Moazzzam Jazi, Maryam; Seyedi, Seyed Mahdi; Ebrahimie, Esmaeil; Ebrahimi, Mansour; De Moro, Gianluca; Botanga, Christopher
2017-08-17
Pistachio (Pistacia vera L.) is one of the most important commercial nut crops worldwide. It is a salt-tolerant and long-lived tree, with the largest cultivation area in Iran. Climate change and the subsequent increase in soil salt content have adversely affected pistachio yield in recent years. However, the lack of genomic and global transcriptomic sequences for P. vera impedes comprehensive research at the molecular level. Hence, whole-transcriptome sequencing is required to gain insight into the functional genes and pathways involved in the response to salt stress. RNA sequencing of a pooled sample representing 24 different tissues of two pistachio cultivars with contrasting salinity tolerance, under control and salt treatment, on the Illumina HiSeq 2000 platform resulted in 368,953,262 clean 100 bp paired-end reads (90 Gb). After creating several assemblies and assessing their quality from multiple perspectives, we found that using annotation-based metrics together with length-based parameters allows an improved assessment of transcriptome assembly quality compared to the sole use of length-based parameters. The assembly generated by Trinity was adopted for functional annotation and subsequent analyses. In total, 29,119 contigs were annotated against all five public databases: NR, UniProt, TAIR10, KOG and InterProScan. Among the 279 KEGG pathways supported by our assembly, we further examined the pathways involved in plant hormone biosynthesis and signaling, as well as those contributing to secondary metabolite biosynthesis, because of their importance under salinity stress. In total, 11,337 SSRs were also identified, of which the most abundant were dinucleotide repeats. In addition, 13,097 transcripts were identified as candidate stress-responsive genes. The expression of some of these genes was experimentally validated through quantitative real-time PCR (qRT-PCR), which further confirmed the accuracy of the assembly. 
From this analysis, contrasting expression patterns of the NCED3 and SOS1 genes were observed between salt-sensitive and salt-tolerant cultivars. This study, as the first report of a whole-transcriptome survey of P. vera, provides important resources and paves the way for functional and comparative genomic studies of this major tree to discover salinity tolerance-related markers and stress response mechanisms for breeding new pistachio cultivars with greater salinity tolerance.
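As an illustration of how SSRs such as those counted above might be located in assembled contigs, the following is a minimal Python sketch based on regular expressions. The function name `find_ssrs`, the minimum repeat counts and the toy sequence are assumptions made for illustration only; the abstract does not state which tool or thresholds the authors used.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length.
# These values are assumptions for this sketch, not the authors' settings.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A captured motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each run is reported only under its shortest motif length.
            if any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            ):
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), motif, copies))
    return hits


# Toy contig: a dinucleotide (AT)7 run and a trinucleotide (CAG)5 run.
toy_contig = "GGC" + "AT" * 7 + "TTT" + "CAG" * 5 + "T"
ssrs = find_ssrs(toy_contig)
```

Counting the hits by motif length over all contigs would give the kind of per-class tally (dinucleotide, trinucleotide, and so on) that the abstract summarizes.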
Ruffier, Magali; Kähäri, Andreas; Komorowska, Monika; Keenan, Stephen; Laird, Matthew; Longden, Ian; Proctor, Glenn; Searle, Steve; Staines, Daniel; Taylor, Kieron; Vullo, Alessandro; Yates, Andrew; Zerbino, Daniel; Flicek, Paul
2017-01-01
The Ensembl software resources are a stable infrastructure to store, access and manipulate genome assemblies and their functional annotations. The Ensembl 'Core' database and Application Programming Interface (API) was our first major piece of software infrastructure and remains at the centre of all of our genome resources. Since its initial design more than fifteen years ago, the number of publicly available genomic, transcriptomic and proteomic datasets has grown enormously, accelerated by continuous advances in DNA-sequencing technology. Initially intended to provide annotation for the reference human genome, we have extended our framework to support the genomes of all species as well as richer assembly models. Cross-referenced links to other informatics resources facilitate searching our database with a variety of popular identifiers such as UniProt and RefSeq. Our comprehensive and robust framework storing a large diversity of genome annotations in one location serves as a platform for other groups to generate and maintain their own tailored annotation. We welcome reuse and contributions: our databases and APIs are publicly available, all of our source code is released with a permissive Apache v2.0 licence at http://github.com/Ensembl and we have an active developer mailing list ( http://www.ensembl.org/info/about/contact/index.html ). http://www.ensembl.org. © The Author(s) 2017. Published by Oxford University Press.
Guttman, Mitchell; Garber, Manuel; Levin, Joshua Z.; Donaghey, Julie; Robinson, James; Adiconis, Xian; Fan, Lin; Koziol, Magdalena J.; Gnirke, Andreas; Nusbaum, Chad; Rinn, John L.; Lander, Eric S.; Regev, Aviv
2010-01-01
RNA-Seq provides an unbiased way to study a transcriptome, including both coding and non-coding genes. To date, most RNA-Seq studies have critically depended on existing annotations, and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We apply it to mouse embryonic stem cells, neuronal precursor cells, and lung fibroblasts to accurately reconstruct the full-length gene structures for the vast majority of known expressed genes. We identify substantial variation in protein-coding genes, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons. We then determine the gene structures of over a thousand lincRNA and antisense loci. Our results open the way to direct experimental manipulation of thousands of non-coding RNAs, and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes. PMID:20436462
Makita, Yuko; Kawashima, Mika; Lau, Nyok Sean; Othman, Ahmad Sofiman; Matsui, Minami
2018-01-19
Natural rubber is an economically important material. Currently the Pará rubber tree, Hevea brasiliensis is the main commercial source. Little is known about rubber biosynthesis at the molecular level. Next-generation sequencing (NGS) technologies brought draft genomes of three rubber cultivars and a variety of RNA sequencing (RNA-seq) data. However, no current genome or transcriptome databases (DB) are organized by gene. A gene-oriented database is a valuable support for rubber research. Based on our original draft genome sequence of H. brasiliensis RRIM600, we constructed a rubber tree genome and transcriptome DB. Our DB provides genome information including gene functional annotations and multi-transcriptome data of RNA-seq, full-length cDNAs including PacBio Isoform sequencing (Iso-Seq), ESTs and genome wide transcription start sites (TSSs) derived from CAGE technology. Using our original and publically available RNA-seq data, we calculated co-expressed genes for identifying functionally related gene sets and/or genes regulated by the same transcription factor (TF). Users can access multi-transcriptome data through both a gene-oriented web page and a genome browser. For the gene searching system, we provide keyword search, sequence homology search and gene expression search; users can also select their expression threshold easily. The rubber genome and transcriptome DB provides rubber tree genome sequence and multi-transcriptomics data. This DB is useful for comprehensive understanding of the rubber transcriptome. This will assist both industrial and academic researchers for rubber and economically important close relatives such as R. communis, M. esculenta and J. curcas. The Rubber Transcriptome DB release 2017.03 is accessible at http://matsui-lab.riken.jp/rubber/ .
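The co-expression calculation mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name the similarity measure, so Pearson correlation is used here as one common choice, and the function names, the 0.9 threshold and the toy expression values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpressed_pairs(expr, threshold=0.9):
    """List gene pairs whose profiles correlate at or above the threshold."""
    genes = sorted(expr)
    pairs = []
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson(expr[gene_a], expr[gene_b])
            if r >= threshold:
                pairs.append((gene_a, gene_b, round(r, 3)))
    return pairs


# Toy expression values across four hypothetical RNA-seq samples.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
pairs = coexpressed_pairs(toy_expr)
```

In this toy input only geneA and geneB rise together across samples, so they form the single reported pair; geneC is anti-correlated with both and is excluded by the positive threshold.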
Omics studies of citrus, grape and rosaceae fruit trees
Shiratake, Katsuhiro; Suzuki, Mami
2016-01-01
Recent advances in bioinformatics and in analytical instruments such as next-generation DNA sequencers (NGS) and mass spectrometers (MS) have brought a wave of comprehensive studies to biology. Comprehensive study targeting all genes, transcripts (RNAs), proteins, metabolites, hormones, ions or phenotypes is called genomics, transcriptomics, proteomics, metabolomics, hormonomics, ionomics or phenomics, respectively. These omics are powerful approaches to identify key genes for important traits, to clarify events of physiological mechanisms and to reveal unknown metabolic pathways in crops. Recently, the use of omics approaches has increased dramatically in fruit tree research. Most reported omics studies on fruit trees are in transcriptomics, proteomics and metabolomics, although a few have been reported in hormonomics and ionomics. In this article, we review recent omics studies of major fruit trees, i.e. citrus, grapevine and rosaceae fruit trees. The effectiveness and prospects of omics in fruit tree research are also highlighted. PMID:27069397
Soybean Knowledge Base (SoyKB): a Web Resource for Soybean Translational Genomics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joshi, Trupti; Patil, Kapil; Fitzpatrick, Michael R.
2012-01-17
Background: Soybean Knowledge Base (SoyKB) is a comprehensive all-inclusive web resource for soybean translational genomics. SoyKB is designed to handle the management and integration of soybean genomics, transcriptomics, proteomics and metabolomics data along with annotation of gene function and biological pathway. It contains information on four entities, namely genes, microRNAs, metabolites and single nucleotide polymorphisms (SNPs). Methods: SoyKB has many useful tools such as Affymetrix probe ID search, gene family search, multiple gene/metabolite search supporting co-expression analysis, and protein 3D structure viewer as well as download and upload capacity for experimental data and annotations. It has four tiers of registration, which control different levels of access to public and private data. It allows users of certain levels to share their expertise by adding comments to the data. It has a user-friendly web interface together with genome browser and pathway viewer, which display data in an intuitive manner to the soybean researchers, producers and consumers. Conclusions: SoyKB addresses the increasing need of the soybean research community to have a one-stop-shop functional and translational omics web resource for information retrieval and analysis in a user-friendly way. SoyKB can be publicly accessed at http://soykb.org/.
Tsurumaki, M; Kotake, M; Iwasaki, M; Saito, M; Tanaka, K; Aw, W; Fukuda, S; Tomita, M
2015-01-01
Inulin, a natural renewable polysaccharide produced by various plants, has been reported to have a significant number of diverse pharmaceutical and food applications. Recently, there has been rapid progress in high-throughput technologies and platforms to assay global mRNA, proteins, metabolites and gut microbiota. In this review, we describe the current status of using omics technologies to elucidate the impact of inulin and inulin-containing prebiotics at the transcriptome, proteome, metabolome and gut microbiome levels. Although many studies in this review have addressed the impact of inulin comprehensively, these omics technologies only enable us to understand physiological information at each separate level of mRNA, protein, metabolite and gut microbe. We believe that a synergistic approach is vital in order to fully illustrate the intricate beauty behind the relatively modest influence of food factors like inulin on host health. PMID:26619369
Trapnell, Cole; Roberts, Adam; Goff, Loyal; Pertea, Geo; Kim, Daehwan; Kelley, David R; Pimentel, Harold; Salzberg, Steven L; Rinn, John L; Pachter, Lior
2012-01-01
Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions. This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol's execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ~1 h of hands-on time. PMID:22383036
De Novo Transcriptome Analysis for Kentucky Bluegrass Dwarf Mutants Induced by Space Mutation
Gan, Lu; Di, Rong; Chao, Yuehui; Han, Liebao; Chen, Xingwu; Wu, Chao; Yin, Shuxia
2016-01-01
Kentucky bluegrass (Poa pratensis L.) is a major cool-season turfgrass requiring frequent mowing. Utilization of cultivars with slow growth is a promising method to decrease mowing frequency. In this study, two dwarf mutant selections of Kentucky bluegrass (A12 and A16) induced by space mutation were analyzed for the differentially expressed genes compared with the wild type (WT) by the high-throughput RNA-Seq technology. 253,909 unigenes were obtained by de novo assembly. 24.20% of the unigenes had a significant level of amino acid sequence identity to Brachypodium distachyon proteins, followed by Hordeum vulgare with 18.72% among the non-redundant (NR) Blastx top hits. Assembled unigenes were associated with 32 pathways using KEGG orthology terms and their respective KEGG maps. Between WT and A16 libraries, 4,203 differentially expressed genes (DEGs) were identified, whereas there were 883 DEGs between WT and A12 libraries. Further investigation revealed that the DEG pathways were mainly involved in terpenoid biosynthesis and plant hormone metabolism, which might account for the differences of plant height and leaf blade color between dwarf mutant and WT plants. Our study presents the first comprehensive transcriptomic data and gene function analysis of Poa pratensis L., providing a valuable resource for future studies in plant dwarfing breeding and comparative genome analysis for Pooideae plants. PMID:27010560
Zhang, Min; Zhou, Yuwen; Wang, Hui; Jones, Huw; Gao, Qiang; Wang, Dahai; Ma, Youzhi; Xia, Lanqin
2013-08-16
The grain aphid (Sitobion avenae F.) is a major agricultural pest which causes significant yield losses of wheat in China, Europe and North America annually. Transcriptome profiling of the grain aphid alimentary canal after feeding on wheat plants could provide comprehensive gene expression information involved in feeding, ingestion and digestion. Furthermore, selection of aphid-specific RNAi target genes would be essential for utilizing a plant-mediated RNAi strategy to control aphids via a non-toxic mode of action. However, due to the tiny size of the alimentary canal and lack of genomic information on grain aphid as a whole, selection of the RNAi targets is a challenging task that as far as we are aware, has never been documented previously. In this study, we performed de novo transcriptome assembly and gene expression analyses of the alimentary canals of grain aphids before and after feeding on wheat plants using Illumina RNA sequencing. The transcriptome profiling generated 30,427 unigenes with an average length of 664 bp. Furthermore, comparison of the transcriptomes of alimentary canals of pre- and post feeding grain aphids indicated that 5490 unigenes were differentially expressed, among which, diverse genes and/or pathways were identified and annotated. Based on the RPKM values of these unigenes, 16 of them that were significantly up or down-regulated upon feeding were selected for dsRNA artificial feeding assay. Of these, 5 unigenes led to higher mortality and developmental stunting in an artificial feeding assay due to the down-regulation of the target gene expression. Finally, by adding fluorescently labelled dsRNA into the artificial diet, the spread of fluorescence signal in the whole body tissues of grain aphid was observed. 
Comparison of the transcriptome profiles of the alimentary canals of pre- and post-feeding grain aphids on wheat plants provided comprehensive gene expression information that could facilitate our understanding of the molecular mechanisms underlying feeding, ingestion and digestion. Furthermore, five novel and effective potential RNAi target genes were identified in grain aphid for the first time. This finding would provide a fundamental basis for aphid control in wheat through plant mediated RNAi strategy.
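The RPKM values used above to rank unigenes follow a standard normalization: reads per kilobase of transcript per million mapped reads. The sketch below shows that formula together with a simple log2 fold change between two conditions. The function names, the pseudocount and the example numbers are assumptions for illustration; the abstract does not describe the authors' exact pipeline or significance test.

```python
from math import log2


def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def log2_fold_change(rpkm_before, rpkm_after, pseudocount=1.0):
    """log2 ratio of expression after vs. before feeding.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for unigenes with no reads in one condition.
    """
    return log2((rpkm_after + pseudocount) / (rpkm_before + pseudocount))


# Hypothetical unigene of 1,000 bp in a library of 10 million mapped reads.
example_rpkm = rpkm(read_count=500, length_bp=1000, total_mapped_reads=10_000_000)
```

A positive log2 fold change marks a unigene as up-regulated after feeding and a negative one as down-regulated; deciding which changes are significant needs a statistical test that this sketch does not attempt.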
Mao, Yunrui; Zhang, Yonghua; Xu, Chuan; Qiu, Yingxiong
2016-01-01
Dysosma species (Berberidaceae, Podophylloideae) are of great medicinal pharmacogenetic importance and used as model systems to study the drivers and mechanisms of species diversification of temperate plants in East Asia. Recently, we have sequenced the transcriptome of the low-elevation D. versipellis. In this study, we sequenced the transcriptome of the high-elevation D. aurantiocaulis and used comparative genomic approaches to investigate the transcriptome evolution of the two species. We retrieved 53,929 unigenes from D. aurantiocaulis by de novo transcriptome assemblies using the Illumina HiSeq 2000 platform. Comparing the transcriptomes of both species, we identified 4593 orthologs. Estimation of Ka/Ks ratios for 3126 orthologs revealed that none had a Ka/Ks significantly greater than 1, whereas 1273 (Ka/Ks < 0.5, P < 0.05) were inferred to be under purifying selection. A total of 51 primer pairs were successfully designed from 461 EST-SSRs contained in 4593 orthologs. Marker validation assay revealed that 26 (51%) and 41 (80.4%) produced clear fragments with the expected sizes in all Podophylloideae species. Specifically, 19 different sequences of CYP719A were identified from PCR-amplified genomic DNA of all 12 species of Podophylloideae using primers designed from the assembled transcripts. The data further indicated that CYP719A was likely subject to strong selective constraints maintaining only one copy per genome. In Dysosma, there was relaxed purifying selection or more positive selection for high-elevation species. Overall, this study has generated a wealth of molecular resources potentially useful for pharmacogenetic and evolutionary studies in Dysosma and allied taxa. © 2015 John Wiley & Sons Ltd.
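The Ka/Ks screen described above can be illustrated with a short Python sketch. It takes per-ortholog estimates of Ka (nonsynonymous substitutions per nonsynonymous site) and Ks (synonymous substitutions per synonymous site) as given and only classifies the ratio. The 0.5 cutoff follows the abstract; the function name, the labels and the example values are hypothetical, and the significance test (P < 0.05) reported in the abstract is not reproduced here.

```python
def classify_selection(ka, ks, purifying_cutoff=0.5):
    """Return (Ka/Ks ratio, label) for one ortholog pair.

    Ka and Ks are assumed to have been estimated beforehand by a dedicated
    tool; this sketch only interprets the ratio.
    """
    if ks <= 0:
        return None, "undefined"  # ratio cannot be computed without Ks
    ratio = ka / ks
    if ratio < purifying_cutoff:
        label = "purifying"
    elif ratio > 1.0:
        label = "candidate positive"
    else:
        label = "neutral or weakly constrained"
    return ratio, label


# Hypothetical ortholog pairs: (Ka, Ks)
examples = {
    "ortholog_1": (0.02, 0.20),
    "ortholog_2": (0.30, 0.20),
    "ortholog_3": (0.15, 0.20),
}
labels = {name: classify_selection(ka, ks)[1] for name, (ka, ks) in examples.items()}
```

A ratio above 1 is only a candidate signal of positive selection; as in the study, a statistical test is needed before drawing that conclusion.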
Naithani, Sushma; Sullivan, Chris; Preece, Justin; Tiwari, Vijay K.; Elser, Justin; Leonard, Jeffrey M.; Sage, Abigail; Gresham, Cathy; Kerhornou, Arnaud; Bolser, Dan; McCarthy, Fiona; Kersey, Paul; Lazo, Gerard R.; Jaiswal, Pankaj
2014-01-01
Background Triticum monococcum (2n) is a close ancestor of T. urartu, the A-genome progenitor of cultivated hexaploid wheat, and is therefore a useful model for the study of components regulating photomorphogenesis in diploid wheat. In order to develop genetic and genomic resources for such a study, we constructed genome-wide transcriptomes of two Triticum monococcum subspecies, the wild winter wheat T. monococcum ssp. aegilopoides (accession G3116) and the domesticated spring wheat T. monococcum ssp. monococcum (accession DV92) by generating de novo assemblies of RNA-Seq data derived from both etiolated and green seedlings. Principal Findings The de novo transcriptome assemblies of DV92 and G3116 represent 120,911 and 117,969 transcripts, respectively. We successfully mapped ∼90% of these transcripts from each accession to barley and ∼95% of the transcripts to T. urartu genomes. However, only ∼77% transcripts mapped to the annotated barley genes and ∼85% transcripts mapped to the annotated T. urartu genes. Differential gene expression analyses revealed 22% more light up-regulated and 35% more light down-regulated transcripts in the G3116 transcriptome compared to DV92. The DV92 and G3116 mRNA sequence reads aligned against the reference barley genome led to the identification of ∼500,000 single nucleotide polymorphism (SNP) and ∼22,000 simple sequence repeat (SSR) sites. Conclusions De novo transcriptome assemblies of two accessions of the diploid wheat T. monococcum provide new empirical transcriptome references for improving Triticeae genome annotations, and insights into transcriptional programming during photomorphogenesis. The SNP and SSR sites identified in our analysis provide additional resources for the development of molecular markers. PMID:24821410
2010-01-01
Background Bathymodiolus azoricus is a deep-sea hydrothermal vent mussel found in association with large faunal communities living in chemosynthetic environments on the sea floor near the Azores Islands. Investigation of the exceptional physiological reactions that vent mussels have adopted in their habitat, including responses to environmental microbes, remains a difficult challenge for deep-sea biologists. In an attempt to reveal genes potentially involved in deep-sea mussel innate immunity, we carried out a high-throughput sequence analysis of the transcriptome of freshly collected B. azoricus, using gill tissues as the primary source of immune transcripts given their strategic role in filtering potentially infectious waterborne microorganisms from the surrounding water. Additionally, a substantial EST data set was produced, from which a comprehensive collection of genes coding for putative proteins was organized in a dedicated database, "DeepSeaVent", the first deep-sea vent animal transcriptome database based on 454 pyrosequencing technology. Results A normalized cDNA library from gill tissue was sequenced in a full 454 GS-FLX run, producing 778,996 sequencing reads. Assembly of the high-quality reads resulted in 75,407 contigs, of which 3,071 were singletons. A total of 39,425 transcripts were conceptually translated into amino acid sequences, of which 22,023 matched known proteins in the NCBI non-redundant protein database, 15,839 revealed conserved protein domains through InterPro functional classification and 9,584 were assigned Gene Ontology terms. Queries conducted within the database enabled the identification of genes putatively involved in immune and inflammatory reactions that had not previously been reported in the vent mussel. Their physical counterpart was confirmed by semi-quantitative reverse-transcription polymerase chain reaction (RT-PCR) and their RNA transcription level by quantitative PCR (qPCR) experiments. 
Conclusions We have established the first tissue transcriptional analysis of a deep-sea hydrothermal vent animal and generated a searchable catalog of genes that provides a direct method of identifying and retrieving vast numbers of novel coding sequences, which can be applied in gene expression profiling experiments in a non-conventional model organism. This provides the most comprehensive sequence resource for identifying novel genes currently available for a deep-sea vent organism, in particular genes putatively involved in immune and inflammatory reactions in vent mussels. The characterization of the B. azoricus transcriptome will facilitate research into the biological processes underlying physiological adaptations to hydrothermal vent environments and will provide a basis for expanding our understanding of genes putatively involved in adaptation processes during post-capture long-term acclimatization experiments at "sea-level" conditions, using B. azoricus as a model organism. PMID:20937131
Roncalli, Vittoria; Christie, Andrew E.; Sommer, Stephanie A.; Cieslak, Matthew C.; Hartline, Daniel K.; Lenz, Petra H.
2017-01-01
Coral reef ecosystems of many sub-tropical and tropical marine coastal environments have suffered significant degradation from anthropogenic sources. Research to inform management strategies that mitigate stressors and promote a healthy ecosystem has focused on the ecology and physiology of coral reefs and associated organisms. Few studies focus on the surrounding pelagic communities, which are equally important to ecosystem function. Zooplankton, often dominated by small crustaceans such as copepods, is an important food source for invertebrates and fishes, especially larval fishes. The reef-associated zooplankton includes a sub-neustonic copepod family that could serve as an indicator species for the community. Here, we describe the generation of a de novo transcriptome for one such copepod, Labidocera madurae, a pontellid from an intensively-studied coral reef ecosystem, Kāne‘ohe Bay, Oahu, Hawai‘i. The transcriptome was assembled using high-throughput sequence data obtained from whole organisms. It comprised 211,002 unique transcripts, including 72,391 with coding regions. It was assessed for quality and completeness using multiple workflows. Bench-marking-universal-single-copy-orthologs (BUSCO) analysis identified transcripts for 88% of expected eukaryotic core proteins. Targeted gene-discovery analyses included searches for transcripts coding full-length “giant” proteins (>4,000 amino acids), proteins and splice variants of voltage-gated sodium channels, and proteins involved in the circadian signaling pathway. Four different reference transcriptomes were generated and compared for the detection of differential gene expression between copepodites and adult females; 6,229 genes were consistently identified as differentially expressed between the two regardless of reference. Automated bioinformatics analyses and targeted manual gene curation suggest that the de novo assembled L. madurae transcriptome is of high quality and completeness. 
This transcriptome provides a new resource for assessing the global physiological status of a planktonic species inhabiting a coral reef ecosystem that is subjected to multiple anthropogenic stressors. The workflows provide a template for generating and assessing transcriptomes in other non-model species. PMID:29065152
Mykles, Donald L; Burnett, Karen G; Durica, David S; Stillman, Jonathon H
2016-12-01
Crustaceans, and decapods in particular (i.e., crabs, shrimp, and lobsters), are a diverse and ecologically and commercially important group of organisms. Understanding responses to abiotic and biotic factors is critical for developing best practices in aquaculture and assessing the effects of changing environments on the biology of these important animals. A relatively small number of decapod crustacean species have been intensively studied at the molecular level; the availability, experimental tractability, and economic relevance factor into the selection of a particular species as a model. Transcriptomics, using high-throughput next generation sequencing (NGS, coupled with RNA sequencing or RNA-seq) is revolutionizing crustacean biology. The 11 symposium papers in this volume illustrate how RNA-seq is being used to study stress response, molting and limb regeneration, immunity and disease, reproduction and development, neurobiology, and ecology and evolution. This symposium occurred on the 10th anniversary of the symposium, "Genomic and Proteomic Approaches to Crustacean Biology", held at the Society for Integrative and Comparative Biology 2006 meeting. Two participants in the 2006 symposium, the late Paul Gross and David Towle, were recognized as leaders who pioneered the use of molecular techniques that would ultimately foster the transcriptomics research reviewed in this volume. RNA-seq is a powerful tool for hypothesis-driven research, as well as an engine for discovery. It has eclipsed the technologies available in 2006, such as microarrays, expressed sequence tags, and subtractive hybridization screening, as the millions of "reads" from NGS enable researchers to de novo assemble a comprehensive transcriptome without a complete genome sequence. 
The symposium series concludes with a policy paper that gives an overview of the resources available and makes recommendations for developing better tools for functional annotation and pathway and network analysis in organisms in which the genome is not available or is incomplete. © The Author 2016. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology.
Leaps and lulls in the developmental transcriptome of Dictyostelium discoideum.
Rosengarten, Rafael David; Santhanam, Balaji; Fuller, Danny; Katoh-Kurasawa, Mariko; Loomis, William F; Zupan, Blaz; Shaulsky, Gad
2015-04-13
Development of the soil amoeba Dictyostelium discoideum is triggered by starvation. When placed on a solid substrate, the starving solitary amoebae cease growth, communicate via extracellular cAMP, aggregate by tens of thousands and develop into multicellular organisms. Early phases of the developmental program are often studied in cells starved in suspension while cAMP is provided exogenously. Previous studies revealed massive shifts in the transcriptome under both developmental conditions and a close relationship between gene expression and morphogenesis, but were limited by the sampling frequency and the resolution of the methods. Here, we combine the superior depth and specificity of RNA-seq-based analysis of mRNA abundance with high frequency sampling during filter development and cAMP pulsing in suspension. We found that the developmental transcriptome exhibits mostly gradual changes interspersed by a few instances of large shifts. For each time point we treated the entire transcriptome as a single phenotype, and were able to characterize development as groups of similar time points separated by gaps. The grouped time points represented gradual changes in mRNA abundance, or molecular phenotype, and the gaps represented times during which many genes are differentially expressed rapidly, and thus the phenotype changes dramatically. Comparing developmental experiments revealed that gene expression in filter developed cells lagged behind that in cells treated with exogenous cAMP in suspension. The high sampling frequency revealed many genes whose regulation is reproducibly more complex than indicated by previous studies. Gene Ontology enrichment analysis suggested that the transition to multicellularity coincided with rapid accumulation of transcripts associated with DNA processes and mitosis. Later development included the up-regulation of organic signaling molecules and co-factor biosynthesis.
Our analysis also demonstrated a high level of synchrony among the developing structures throughout development. Our data describe D. discoideum development as a series of coordinated cellular and multicellular activities. Coordination occurred within fields of aggregating cells and among multicellular bodies, such as mounds or migratory slugs that experience both cell-cell contact and various soluble signaling regimes. These time courses, sampled at the highest temporal resolution to date in this system, provide a comprehensive resource for studies of developmental gene expression.
2013-01-01
Background The brown planthopper (Nilaparvata lugens) is one of the most serious rice plant pests in Asia. N. lugens causes extensive rice damage by sucking rice phloem sap, which results in stunted plant growth and the transmission of plant viruses. Despite the importance of this insect pest, little is known about the immunological mechanisms occurring in this hemimetabolous insect species. Results In this study, we performed a genome- and transcriptome-wide analysis aiming at the immune-related genes. The transcriptome datasets include the N. lugens intestine, the developmental stage, wing formation, and sex-specific expression information that provided useful gene expression sequence data for the genome-wide analysis. As a result, we identified a large number of genes encoding N. lugens pattern recognition proteins, modulation proteins in the prophenoloxidase (proPO) activating cascade, immune effectors, and the signal transduction molecules involved in the immune pathways, including the Toll, Immune deficiency (Imd) and Janus kinase signal transducers and activators of transcription (JAK-STAT) pathways. The genome scale analysis revealed detailed information of the gene structure, distribution and transcription orientations in scaffolds. A comparison of the genome-available hemimetabolous and metabolous insect species indicates the differences in the immune-related gene constitution. We investigated the gene expression profiles with regard to how they responded to bacterial infections and tissue, as well as development and sex expression specificity. Conclusions The genome- and transcriptome-wide analysis of immune-related genes including pattern recognition and modulation molecules, immune effectors, and the signal transduction molecules involved in the immune pathways is an important step in determining the overall architecture and functional network of the immune components in N. lugens.
Our findings provide the comprehensive gene sequence resource and expression profiles of the immune-related genes of N. lugens, which could facilitate the understanding of the innate immune mechanisms in the hemimetabolous insect species. These data give insight into clarifying the potential functional roles of the immune-related genes involved in the biological processes of development, reproduction, and virus transmission in N. lugens. PMID:23497397
Comparative whole genome transcriptome and metabolome analyses of five Klebsiella pneumonia strains.
Lee, Soojin; Kim, Borim; Yang, Jeongmo; Jeong, Daun; Park, Soohyun; Shin, Sang Heum; Kook, Jun Ho; Yang, Kap-Seok; Lee, Jinwon
2015-11-01
The integration of transcriptomics and metabolomics can provide precise information on gene-to-metabolite networks for identifying the function of novel genes. The goal of this study was to identify novel gene functions involved in 2,3-butanediol (2,3-BDO) biosynthesis by a comprehensive analysis of the transcriptome and metabolome of five mutated Klebsiella pneumoniae strains (∆wabG = SGSB100, ∆wabG∆budA = SGSB106, ∆wabG∆budB = SGSB107, ∆wabG∆budC = SGSB108, ∆wabG∆budABC = SGSB109). First, the transcriptomes of all five mutants were analyzed and the genes exhibiting reproducible changes in expression were determined. The transcriptome was well conserved among the five strains, and differences in gene expression occurred mainly in genes coding for 2,3-BDO biosynthesis (budA, budB, and budC) and the genes involved in the degradation of reactive oxygen, biosynthesis and transport of arginine, cysteine biosynthesis, sulfur metabolism, oxidoreductase reaction, and formate dehydrogenase reaction. Second, differences in the metabolome (estimated by carbon distribution, CO2 emission, and redox balance) among the five mutant strains due to gene alteration of the 2,3-BDO operon were detected. The functional genomics approach integrating metabolomics and transcriptomics in K. pneumoniae presented here provides an innovative means of identifying novel gene functions involved in 2,3-BDO biosynthesis metabolism and whole cell metabolism.
Gao, Bei; Li, Xiaoshuang; Zhang, Daoyuan; Liang, Yuqing; Yang, Honglan; Chen, Moxian; Zhang, Yuanming; Zhang, Jianhua; Wood, Andrew J
2017-08-08
The desiccation tolerant bryophyte Bryum argenteum is an important component of desert biological soil crusts (BSCs) and is emerging as a model system for studying vegetative desiccation tolerance. Here we present and analyze the hydration-dehydration-rehydration transcriptomes in B. argenteum to establish a desiccation-tolerance transcriptomic atlas. B. argenteum gametophores representing five different hydration stages (hydrated (H0), dehydrated for 2 h (D2), 24 h (D24), then rehydrated for 2 h (R2) and 48 h (R48)), were sampled for transcriptome analyses. Illumina high throughput RNA-Seq technology was employed and generated more than 488.46 million reads. An in-house de novo transcriptome assembly optimization pipeline based on the Trinity assembler was developed to obtain a reference Hydration-Dehydration-Rehydration (H-D-R) transcriptome comprising 76,206 transcripts, with an N50 of 2,016 bp and an average length of 1,222 bp. Comprehensive transcription factor (TF) annotation discovered 978 TFs in 62 families, among which 404 TFs within 40 families were differentially expressed upon dehydration-rehydration. Pfam term enrichment analysis revealed 172 protein families/domains were significantly associated with the H-D-R cycle and confirmed early rehydration (i.e. the R2 stage) as exhibiting the maximum stress-induced changes in gene expression.
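The N50 and average length quoted for the assembly above are standard summary statistics for a set of transcript lengths. The following is a minimal sketch of how they are commonly computed, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions. N50 is taken here as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (hypothetical helper)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2            # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # This sequence is the one that pushes coverage to >= 50%.
            return length


# Toy lengths in bp, made up for illustration (not data from the study).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
print(toy_n50, toy_mean)
```

Because N50 is weighted by sequence length, it is usually larger than the mean length, as it is for the figures reported above (2,016 bp versus 1,222 bp).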
Cavaiuolo, Marina; Cocetta, Giacomo; Spadafora, Natasha Damiana; Müller, Carsten T; Rogers, Hilary J; Ferrante, Antonio
2017-01-01
Diplotaxis tenuifolia L. is of important economic value in the fresh-cut industry for its nutraceutical and sensorial properties. However, information on the molecular mechanisms conferring tolerance of harvested leaves to pre- and postharvest stresses during processing and shelf-life have never been investigated. Here, we provide the first transcriptomic resource of rocket by de novo RNA sequencing assembly, functional annotation and stress-induced expression analysis of 33874 transcripts. Transcriptomic changes in leaves subjected to commercially-relevant pre-harvest (salinity, heat and nitrogen starvation) and postharvest stresses (cold, dehydration, dark, wounding) known to affect quality and shelf-life were analysed 24h after stress treatment, a timing relevant to subsequent processing of salad leaves. Transcription factors and genes involved in plant growth regulator signaling, autophagy, senescence and glucosinolate metabolism were the most affected by the stresses. Hundreds of genes with unknown function but uniquely expressed under stress were identified, providing candidates to investigate stress responses in rocket. Dehydration and wounding had the greatest effect on the transcriptome and different stresses elicited changes in the expression of genes related to overlapping groups of hormones. These data will allow development of approaches targeted at improving stress tolerance, quality and shelf-life of rocket with direct applications in the fresh-cut industries. PMID:28558066
Quantitative RNA-seq analysis of the Campylobacter jejuni transcriptome
Chaudhuri, Roy R.; Yu, Lu; Kanji, Alpa; Perkins, Timothy T.; Gardner, Paul P.; Choudhary, Jyoti; Maskell, Duncan J.
2011-01-01
Campylobacter jejuni is the most common bacterial cause of foodborne disease in the developed world. Its general physiology and biochemistry, as well as the mechanisms enabling it to colonize and cause disease in various hosts, are not well understood, and new approaches are required to understand its basic biology. High-throughput sequencing technologies provide unprecedented opportunities for functional genomic research. Recent studies have shown that direct Illumina sequencing of cDNA (RNA-seq) is a useful technique for the quantitative and qualitative examination of transcriptomes. In this study we report RNA-seq analyses of the transcriptomes of C. jejuni (NCTC11168) and its rpoN mutant. This has allowed the identification of hitherto unknown transcriptional units, and further defines the regulon that is dependent on rpoN for expression. The analysis of the NCTC11168 transcriptome was supplemented by additional proteomic analysis using liquid chromatography-MS. The transcriptomic and proteomic datasets represent an important resource for the Campylobacter research community. PMID:21816880
Bioinformatics of prokaryotic RNAs
Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F
2014-01-01
The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880
Comparative transcriptomics of early dipteran development
2013-01-01
Background Modern sequencing technologies have massively increased the amount of data available for comparative genomics. Whole-transcriptome shotgun sequencing (RNA-seq) provides a powerful basis for comparative studies. In particular, this approach holds great promise for emerging model species in fields such as evolutionary developmental biology (evo-devo). Results We have sequenced early embryonic transcriptomes of two non-drosophilid dipteran species: the moth midge Clogmia albipunctata, and the scuttle fly Megaselia abdita. Our analysis includes a third, published, transcriptome for the hoverfly Episyrphus balteatus. These emerging models for comparative developmental studies close an important phylogenetic gap between Drosophila melanogaster and other insect model systems. In this paper, we provide a comparative analysis of early embryonic transcriptomes across species, and use our data for a phylogenomic re-evaluation of dipteran phylogenetic relationships. Conclusions We show how comparative transcriptomics can be used to create useful resources for evo-devo, and to investigate phylogenetic relationships. Our results demonstrate that de novo assembly of short (Illumina) reads yields high-quality, high-coverage transcriptomic data sets. We use these data to investigate deep dipteran phylogenetic relationships. Our results, based on a concatenation of 160 orthologous genes, provide support for the traditional view of Clogmia being the sister group of Brachycera (Megaselia, Episyrphus, Drosophila), rather than that of Culicomorpha (which includes mosquitoes and blackflies). PMID:23432914
The Embryonic Transcriptome of the Red-Eared Slider Turtle (Trachemys scripta)
Kaplinsky, Nicholas J.; Gilbert, Scott F.; Cebra-Thomas, Judith; Lilleväli, Kersti; Saare, Merly; Chang, Eric Y.; Edelman, Hannah E.; Frick, Melissa A.; Guan, Yin; Hammond, Rebecca M.; Hampilos, Nicholas H.; Opoku, David S. B.; Sariahmed, Karim; Sherman, Eric A.; Watson, Ray
2013-01-01
The bony shell of the turtle is an evolutionary novelty not found in any other group of animals, however, research into its formation has suggested that it has evolved through modification of conserved developmental mechanisms. Although these mechanisms have been extensively characterized in model organisms, the tools for characterizing them in non-model organisms such as turtles have been limited by a lack of genomic resources. We have used a next generation sequencing approach to generate and assemble a transcriptome from stage 14 and 17 Trachemys scripta embryos, stages during which important events in shell development are known to take place. The transcriptome consists of 231,876 sequences with an N50 of 1,166 bp. GO terms and EC codes were assigned to the 61,643 unique predicted proteins identified in the transcriptome sequences. All major GO categories and metabolic pathways are represented in the transcriptome. Transcriptome sequences were used to amplify several cDNA fragments designed for use as RNA in situ probes. One of these, BMP5, was hybridized to a T. scripta embryo and exhibits both conserved and novel expression patterns. The transcriptome sequences should be of broad use for understanding the evolution and development of the turtle shell and for annotating any future T. scripta genome sequences. PMID:23840449
Dried Blood Spot RNA Transcriptomes Correlate with Transcriptomes Derived from Whole Blood RNA.
Reust, Mary J; Lee, Myung Hee; Xiang, Jenny; Zhang, Wei; Xu, Dong; Batson, Tatiana; Zhang, Tuo; Downs, Jennifer A; Dupnik, Kathryn M
2018-05-01
Obtaining RNA from clinical samples collected in resource-limited settings can be costly and challenging. The goals of this study were to 1) optimize messenger RNA extraction from dried blood spots (DBS) and 2) determine how transcriptomes generated from DBS RNA compared with RNA isolated from blood collected in Tempus tubes. We studied paired samples collected from eight adults in rural Tanzania. Venous blood was collected on Whatman 903 Protein Saver cards and in tubes with RNA preservation solution. Our optimal DBS RNA extraction used 8 × 3-mm DBS punches as the starting material, bead beater disruption at maximum speed for 60 seconds, extraction with Illustra RNAspin Mini RNA Isolation kit, and purification with Zymo RNA Concentrator kit. Spearman correlations of normalized gene counts in DBS versus whole blood ranged from 0.887 to 0.941. Bland-Altman plots did not show a trend toward over- or under-counting at any gene size. We report a method to obtain sufficient RNA from DBS to generate a transcriptome. The DBS transcriptome gene counts correlated well with whole blood transcriptome gene counts. Dried blood spots for transcriptome studies could be an option when field conditions preclude appropriate collection, storage, or transport of whole blood for RNA studies.
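The agreement between DBS and whole blood described above is summarized with Spearman correlations of normalized gene counts. As a rough illustration of what that statistic measures, the sketch below computes Spearman's rho with the Python standard library only, by ranking each vector (ties share their average rank) and taking the Pearson correlation of the ranks. The function names and the toy counts are assumptions for illustration and are not taken from the study; in practice a statistics package would normally be used.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long vectors with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    # Note: a constant vector has zero rank variance and would divide by zero.
    return cov / (var_x * var_y) ** 0.5


# Toy normalized counts for five genes (made-up numbers, not study data).
dbs_counts = [5.0, 12.0, 7.5, 30.0, 1.0]
blood_counts = [6.0, 10.0, 9.0, 28.0, 2.0]
rho = spearman(dbs_counts, blood_counts)
print(rho)
```

Because only ranks enter the calculation, rho is insensitive to monotone differences in scale between the two extraction methods, which is one reason it suits comparisons of count data from different sample types.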
A high-quality annotated transcriptome of swine peripheral blood
USDA-ARS's Scientific Manuscript database
Background: High throughput gene expression profiling assays of peripheral blood are widely used in biomedicine, as well as in animal genetics and physiology research. Accurate, comprehensive, and precise interpretation of such high throughput assays relies on well-characterized reference genomes an...
A comprehensive porcine blood transcriptome
USDA-ARS's Scientific Manuscript database
Blood sample analyses are extensively used in high throughput assays in biomedicine, as well as animal genetics and physiology research. However, the draft quality of the current pig genome (Sscrofa 10.2) is insufficient for accurate interpretation of many of these assays because of incomplete gene ...
Li, Qike; Schissler, A Grant; Gardeux, Vincent; Achour, Ikbel; Kenost, Colleen; Berghout, Joanne; Li, Haiquan; Zhang, Hao Helen; Lussier, Yves A
2017-05-24
Transcriptome analytic tools are commonly used across patient cohorts to develop drugs and predict clinical outcomes. However, as precision medicine pursues more accurate and individualized treatment decisions, these methods are not designed to address single-patient transcriptome analyses. We previously developed and validated the N-of-1-pathways framework using two methods, Wilcoxon and Mahalanobis Distance (MD), for personal transcriptome analysis derived from a pair of samples of a single patient. Although both methods uncover concordantly dysregulated pathways, they are not designed to detect dysregulated pathways with up- and down-regulated genes (bidirectional dysregulation) that are ubiquitous in biological systems. We developed N-of-1-pathways MixEnrich, a mixture model followed by a gene set enrichment test, to uncover bidirectional and concordantly dysregulated pathways one patient at a time. We assess its accuracy in a comprehensive simulation study and in an RNA-Seq data analysis of head and neck squamous cell carcinomas (HNSCCs). In the presence of bidirectionally dysregulated genes in the pathway or in the presence of high background noise, MixEnrich substantially outperforms previous single-subject transcriptome analysis methods, both in the simulation study and the HNSCCs data analysis (ROC Curves; higher true positive rates; lower false positive rates). Bidirectional and concordant dysregulated pathways uncovered by MixEnrich in each patient largely overlapped with the quasi-gold standard compared to other single-subject and cohort-based transcriptome analyses. The greater performance of MixEnrich presents an advantage over previous methods to meet the promise of providing accurate personal transcriptome analysis to support precision medicine at point of care.
A transcriptome atlas of rabbit revealed by PacBio single-molecule long-read sequencing.
Chen, Shi-Yi; Deng, Feilong; Jia, Xianbo; Li, Cao; Lai, Song-Jia
2017-08-09
It is widely acknowledged that transcriptional diversity largely contributes to biological regulation in eukaryotes. Since the advent of second-generation sequencing technologies, a large number of RNA sequencing studies have considerably improved our understanding of transcriptome complexity. However, obtaining full-length transcripts remains a major challenge because of the difficulties of short read-based assembly. In the present study we employ PacBio single-molecule long-read sequencing technology for whole-transcriptome profiling in rabbit (Oryctolagus cuniculus). In total, we obtain 36,186 high-confidence transcripts from 14,474 genic loci, among which more than 23% of genic loci and 66% of isoforms have not yet been annotated within the current reference genome. Furthermore, about 17% of transcripts are computationally revealed to be non-coding RNAs. Up to 24,797 alternative splicing (AS) and 11,184 alternative polyadenylation (APA) events are detected within this de novo constructed transcriptome, respectively. The results provide a comprehensive set of reference transcripts and hence contribute to the improved annotation of the rabbit genome.
Transcriptome characterisation of Pinus tabuliformis and evolution of genes in the Pinus phylogeny
2013-01-01
Background The Chinese pine (Pinus tabuliformis) is an indigenous conifer species in northern China but is relatively underdeveloped as a genomic resource; thus, limiting gene discovery and breeding. Large-scale transcriptome data were obtained using a next-generation sequencing platform to compensate for the lack of P. tabuliformis genomic information. Results The increasing amount of transcriptome data on Pinus provides an excellent resource for multi-gene phylogenetic analysis and studies on how conserved genes and functions are maintained in the face of species divergence. The first P. tabuliformis transcriptome from a normalised cDNA library of multiple tissues and individuals was sequenced in a full 454 GS-FLX run, producing 911,302 sequencing reads. The high quality overlapping expressed sequence tags (ESTs) were assembled into 46,584 putative transcripts, and more than 700 SSRs and 92,000 SNPs/InDels were characterised. Comparative analysis of the transcriptome of six conifer species yielded 191 orthologues, from which we inferred a phylogenetic tree, evolutionary patterns and calculated rates of gene diversion. We also identified 938 fast evolving sequences that may be useful for identifying genes that perhaps evolved in response to positive selection and might be responsible for speciation in the Pinus lineage. Conclusions A large collection of high-quality ESTs was obtained, de novo assembled and characterised, which represents a dramatic expansion of the current transcript catalogues of P. tabuliformis and which will gradually be applied in breeding programs of P. tabuliformis. Furthermore, these data will facilitate future studies of the comparative genomics of P. tabuliformis and other related species. PMID:23597112
NASA Astrophysics Data System (ADS)
Han, Zhaofang; Xiao, Shijun; Liu, Xiande; Liu, Yang; Li, Jiakai; Xie, Yangjie; Wang, Zhiyong
2017-03-01
The large yellow croaker, Larimichthys crocea, is an important marine fish in China with a high economic value. In the last decade, the stock conservation and aquaculture industry of this species have been facing severe challenges because of wild population collapse and degeneration of important economic traits. However, genes contributing to growth and immunity in L. crocea have not been thoroughly analyzed, and available molecular markers are still not sufficient for genetic resource management and molecular selection. In this work, we sequenced the transcriptome in L. crocea liver tissue with a Roche 454 sequencing platform and assembled the transcriptome into 93 801 transcripts. Of them, 38 856 transcripts were successfully annotated in nt, nr, Swiss-Prot, InterPro, COG, GO and KEGG databases. Based on the annotation information, 3 165 unigenes related to growth and immunity were identified. Additionally, a total of 6 391 simple sequence repeats (SSRs) were identified from the transcriptome, among which 4 498 SSRs had enough flanking regions to design primers for polymerase chain reactions (PCR). To assess the polymorphism of these markers, 30 primer pairs were randomly selected for PCR amplification and validation in 30 individuals, and 12 primer pairs (40.0%) exhibited obvious length polymorphisms. This work applied RNA-Seq to assemble and analyze a liver transcriptome in L. crocea. With gene annotation and sequence information, genes related to growth and immunity were identified and a large number of SSR markers were developed, providing valuable genetic resources for future gene functional analysis and selective breeding of L. crocea.
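The SSR screening step described above (finding repeats and keeping those with enough flanking sequence for primer design) can be illustrated with a small regular-expression scan. This is a sketch under stated assumptions, not the tool or thresholds used in the study: the motif size range (2 to 6 bp), the minimum of five tandem copies, the 20 bp flank requirement, and the names `SSR_PATTERN` and `find_ssrs` are all hypothetical choices made for illustration.

```python
import re

# Assumed definition: a 2-6 bp motif followed by at least 4 further copies
# (5 or more tandem copies in total). The lazy quantifier prefers the
# shortest motif that explains the repeat.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_ssrs(sequence, min_flank=20):
    """Return SSR hits with a flag for whether both flanks reach min_flank bp."""
    sequence = sequence.upper()
    hits = []
    for match in SSR_PATTERN.finditer(sequence):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        left_flank = match.start()
        right_flank = len(sequence) - match.end()
        hits.append({
            "motif": motif,
            "copies": len(match.group(0)) // len(motif),
            "start": match.start(),  # 0-based, inclusive
            "end": match.end(),      # 0-based, exclusive
            "primer_ready": left_flank >= min_flank and right_flank >= min_flank,
        })
    return hits


# Toy transcript: 20 bp flank + (AC)x6 + 20 bp flank (made-up sequence).
toy = "GATTACAGGCTTAGCCATGA" + "AC" * 6 + "TCGGATCCTAGGCATTCGAA"
toy_hits = find_ssrs(toy)
print(toy_hits)
```

Dedicated SSR finders typically apply motif-specific copy thresholds and handle compound repeats, so the counts from a simple scan like this would not be expected to match published figures.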
Wang, Le; Yu, Cuiping; Guo, Liang; Lin, Haoran; Meng, Zining
2015-01-01
The common coral trout is one species of major importance in commercial fisheries and aquaculture. Recently, two different color morphs of Plectropomus leopardus were discovered and the biological importance of the color difference is unknown. Since coral trout species are poorly characterized at the molecular level, we undertook the transcriptomic characterization of the two color morphs, one black and one red coral trout, using Illumina next generation sequencing technologies. The study produced 55162966 and 54588952 paired-end reads, for black and red trout, respectively. De novo transcriptome assembly generated 95367 and 99424 unique sequences in black and red trout, respectively, with 88813 sequences shared between them. Approximately 50% of both transcriptomes were functionally annotated by BLAST searches against protein databases. The two transcriptomes were enriched into 25 functional categories and showed similar profiles of Gene Ontology category compositions. 34110 unigenes were grouped into 259 KEGG pathways. Moreover, we identified 14649 simple sequence repeats (SSRs) and designed primers for potential application. We also discovered 130524 putative single nucleotide polymorphisms (SNPs) in the two transcriptomes, supplying potential genomic resources for the coral trout species. In addition, we identified 936 fast-evolving genes and 165 candidate genes under positive selection between the two color morphs. Finally, 38 candidate genes underlying the mechanism of color and pigmentation were also isolated. This study presents the first transcriptome resources for the common coral trout and provides basic information for the development of genomic tools for the identification, conservation, and understanding of the speciation and local adaptation of coral reef fish species. PMID:26713756
Evangelisti, Edouard; Gogleva, Anna; Hainaux, Thomas; Doumane, Mehdi; Tulin, Frej; Quan, Clément; Yunusov, Temur; Floch, Kévin; Schornack, Sebastian
2017-05-11
Plant-pathogenic oomycetes are responsible for economically important losses in crops worldwide. Phytophthora palmivora, a tropical relative of the potato late blight pathogen, causes rotting diseases in many tropical crops including papaya, cocoa, oil palm, black pepper, rubber, coconut, durian, mango, cassava and citrus. Transcriptomics have helped to identify repertoires of host-translocated microbial effector proteins which counteract defenses and reprogram the host in support of infection. As such, these studies have helped in understanding how pathogens cause diseases. Despite the importance of P. palmivora diseases, genetic resources to allow for disease resistance breeding and identification of microbial effectors are scarce. We employed the model plant Nicotiana benthamiana to study the P. palmivora root infections at the cellular and molecular levels. Time-resolved dual transcriptomics revealed different pathogen and host transcriptome dynamics. De novo assembly of P. palmivora transcriptome and semi-automated prediction and annotation of the secretome enabled robust identification of conserved infection-promoting effectors. We show that one of them, REX3, suppresses plant secretion processes. In a survey for early transcriptionally activated plant genes we identified a N. benthamiana gene specifically induced at infected root tips that encodes a peptide with danger-associated molecular features. These results constitute a major advance in our understanding of P. palmivora diseases and establish extensive resources for P. palmivora pathogenomics, effector-aided resistance breeding and the generation of induced resistance to Phytophthora root infections. Furthermore, our approach to find infection-relevant secreted genes is transferable to other pathogen-host interactions and not restricted to plants.
Raherison, Elie S M; Giguère, Isabelle; Caron, Sébastien; Lamara, Mebarek; MacKay, John J
2015-07-01
Transcript profiling has shown the molecular bases of several biological processes in plants but few studies have developed an understanding of overall transcriptome variation. We investigated transcriptome structure in white spruce (Picea glauca), aiming to delineate its modular organization and associated functional and evolutionary attributes. Microarray analyses were used to: identify and functionally characterize groups of co-expressed genes; investigate expressional and functional diversity of vascular tissue preferential genes which were conserved among Picea species, and identify expression networks underlying wood formation. We classified 22 857 genes as variable (79%; 22 coexpression groups) or invariant (21%) by profiling across several vegetative tissues. Modular organization and complex transcriptome restructuring among vascular tissue preferential genes was revealed by their assignment to coexpression groups with partially overlapping profiles and partially distinct functions. Integrated analyses of tissue-based and temporally variable profiles identified secondary xylem gene networks, showed their remodelling over a growing season and identified PgNAC-7 (no apical meristem (NAM), Arabidopsis transcription activation factor (ATAF) and cup-shaped cotyledon (CUC) transcription factor 007 in Picea glauca) as a major hub gene specific to earlywood formation. Reference profiling identified comprehensive, statistically robust coexpressed groups, revealing that modular organization underpins the evolutionary conservation of the transcriptome structure. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome
Kim, Gunjune
2017-01-01
Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is “leaves of three, let it be”, which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, as well as assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligo nucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species. PMID:29125533
Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome.
Weisberg, Alexandra J; Kim, Gunjune; Westwood, James H; Jelesko, John G
2017-11-10
Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is "leaves of three, let it be", which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, as well as assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligo nucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species.
The Landscape of long non-coding RNA classification
St Laurent, Georges; Wahlestedt, Claes; Kapranov, Philipp
2015-01-01
Advances in the depth and quality of transcriptome sequencing have revealed many new classes of long non-coding RNAs (lncRNAs). lncRNA classification has mushroomed to accommodate these new findings, even though the real dimensions and complexity of the non-coding transcriptome remain unknown. Although evidence of functionality of specific lncRNAs continues to accumulate, conflicting, confusing, and overlapping terminology has fostered ambiguity and lack of clarity in the field in general. The lack of a fundamental, conceptually unambiguous classification framework results in a number of challenges in the annotation and interpretation of non-coding transcriptome data. It also might undermine integration of the new genomic methods and datasets in an effort to unravel the function of lncRNAs. Here, we review existing lncRNA classifications, nomenclature, and terminology. Then we describe the conceptual guidelines that have emerged for their classification and functional annotation based on expanding and more comprehensive use of large systems biology-based datasets. PMID:25869999
Ni, Jun; Dong, Lixiang; Jiang, Zhifang; Yang, Xiuli; Chen, Ziying; Wu, Yuhuan; Xu, Maojun
2018-01-01
Ginkgo leaves are raw materials for flavonoid extraction. Thus, the timing of their harvest is important to optimize the extraction efficiency, which benefits the pharmaceutical industry. In this research, we compared the transcriptomes of Ginkgo leaves harvested at midday and midnight. The differentially expressed genes with the highest probabilities in each step of flavonoid biosynthesis were down-regulated at midnight. Furthermore, real-time PCR corroborated the transcriptome results, indicating the decrease in flavonoid biosynthesis at midnight. The flavonoid profiles of Ginkgo leaves harvested at midday and midnight were compared, and the total flavonoid content decreased at midnight. A detailed analysis of individual flavonoids showed that most of their contents were decreased by various degrees. Our results indicated that circadian rhythms affected the flavonoid contents in Ginkgo leaves, which provides valuable information for optimizing their harvesting times to benefit the pharmaceutical industry.
Guo, Yang; Townsend, Richard; Tsoi, Lam C
2017-01-01
In the past decade, high-throughput techniques have facilitated "-omics" research. Transcriptomic studies, for instance, have advanced our understanding of the expression landscape of different human diseases and cellular mechanisms. The National Center for Biotechnology Information (NCBI) established the Gene Expression Omnibus (GEO) to promote the sharing of transcriptomic data and thereby facilitate biomedical research. In this chapter, we illustrate how to use GEO to search and analyze publicly available transcriptomic data, and we provide an easy-to-follow protocol for researchers to mine the powerful resources in GEO and retrieve relevant information that can be valuable for fibrosis research.
De novo Assembly and Analysis of the Chilean Pencil Catfish Trichomycterus areolatus Transcriptome
Schulze, Thomas T.; Ali, Jonathan M.; Bartlett, Maggie L.; McFarland, Madalyn M.; Clement, Emalie J.; Won, Harim I.; Sanford, Austin G.; Monzingo, Elyssa B.; Martens, Matthew C.; Hemsley, Ryan M.; Kumar, Sidharta; Gouin, Nicolas; Kolok, Alan S.; Davis, Paul H.
2016-01-01
Trichomycterus areolatus is an endemic species of pencil catfish that inhabits the riffles and rapids of many freshwater ecosystems of Chile. Despite its unique adaptation to Chile's high gradient watersheds and therefore potential application in the investigation of ecosystem integrity and environmental contamination, relatively little is known regarding the molecular biology of this environmental sentinel. Here, we detail the assembly of the Trichomycterus areolatus transcriptome, a molecular resource for the study of this organism and its molecular response to the environment. RNA-Seq reads were obtained by next-generation sequencing with an Illumina® platform and processed using PRINSEQ. The transcriptome assembly was performed using the Trinity assembler. Transcriptome validation was performed by functional characterization with KOG, KEGG, and GO analyses. Additionally, differential expression analysis highlights sex-specific expression patterns, and a list of endocrine and oxidative stress related transcripts is included. PMID:27672404
Chen, Haimei; Guo, Baolin; Liu, Chang
2017-01-01
Epimedium pseudowushanense B.L.Guo, a light-demanding shade herb, is used in traditional medicine to increase libido and strengthen muscles and bones. The recognition of the health benefits of Epimedium has increased its market demand. However, its resource recycling rate is low and environmentally dependent. Furthermore, its natural sources are endangered, further increasing prices. Commercial culture can address its resource constraints. Understanding the effects of environmental factors on the production of its active components would improve the technology for cultivation and germplasm conservation. Here, we studied the effects of light intensities on the flavonoid production and revealed the molecular mechanism using RNA-seq analysis. Plants were exposed to five levels of light intensity from germination to flowering, and the flavonoid contents were measured using HPLC. Quantification of epimedin A, epimedin B, epimedin C, and icariin showed that the flavonoid contents varied with light intensity level, and the largest amount of epimedin C was produced at light intensity level 4 (I4). Next, the leaves under the three light intensity treatments (“L”, “M” and “H”) with the largest differences in flavonoid content were subjected to RNA-seq analysis. Transcriptome reconstruction identified 43,657 unigenes. All unigene sequences were annotated by searching against the Nr, Gene Ontology, and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. In total, 4008, 5260, and 3591 significant differentially expressed genes (DEGs) were identified between the groups L vs. M, M vs. H and L vs. H, respectively. In particular, twenty-one full-length genes involved in flavonoid biosynthesis were identified. The expression levels of the flavonol synthase and chalcone synthase genes were strongly associated with light-induced flavonoid abundance, with the highest expression levels found in the H group.
Furthermore, 65 transcription factors, including 31 FAR1, 17 MYB-related, 12 bHLH, and 5 WRKY, were differentially expressed after light induction. Finally, a model was proposed to explain the light-induced flavonoid production. This study provided valuable information to improve cultivation practices and produced the first comprehensive resource for E. pseudowushanense transcriptomes. PMID:28786984
Pan, Junqian; Chen, Haimei; Guo, Baolin; Liu, Chang
2017-01-01
Epimedium pseudowushanense B.L.Guo, a light-demanding shade herb, is used in traditional medicine to increase libido and strengthen muscles and bones. The recognition of the health benefits of Epimedium has increased its market demand. However, its resource recycling rate is low and environmentally dependent. Furthermore, its natural sources are endangered, further increasing prices. Commercial culture can address its resource constraints. Understanding the effects of environmental factors on the production of its active components would improve the technology for cultivation and germplasm conservation. Here, we studied the effects of light intensities on the flavonoid production and revealed the molecular mechanism using RNA-seq analysis. Plants were exposed to five levels of light intensity from germination to flowering, and the flavonoid contents were measured using HPLC. Quantification of epimedin A, epimedin B, epimedin C, and icariin showed that the flavonoid contents varied with light intensity level, and the largest amount of epimedin C was produced at light intensity level 4 (I4). Next, the leaves under the three light intensity treatments ("L", "M" and "H") with the largest differences in flavonoid content were subjected to RNA-seq analysis. Transcriptome reconstruction identified 43,657 unigenes. All unigene sequences were annotated by searching against the Nr, Gene Ontology, and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. In total, 4008, 5260, and 3591 significant differentially expressed genes (DEGs) were identified between the groups L vs. M, M vs. H and L vs. H, respectively. In particular, twenty-one full-length genes involved in flavonoid biosynthesis were identified. The expression levels of the flavonol synthase and chalcone synthase genes were strongly associated with light-induced flavonoid abundance, with the highest expression levels found in the H group.
Furthermore, 65 transcription factors, including 31 FAR1, 17 MYB-related, 12 bHLH, and 5 WRKY, were differentially expressed after light induction. Finally, a model was proposed to explain the light-induced flavonoid production. This study provided valuable information to improve cultivation practices and produced the first comprehensive resource for E. pseudowushanense transcriptomes.
Chen, Shuangyan; Huang, Xin; Yan, Xueqing; Liang, Ye; Wang, Yuezhu; Li, Xiaofeng; Peng, Xianjun; Ma, Xingyong; Zhang, Lexin; Cai, Yueyue; Ma, Tian; Cheng, Liqin; Qi, Dongmei; Zheng, Huajun; Yang, Xiaohan; Li, Xiaoxia; Liu, Gongshe
2013-01-01
Background Sheepgrass [Leymus chinensis (Trin.) Tzvel.] is an important perennial forage grass across the Eurasian Steppe and is known for its adaptability to various environmental conditions. However, insufficient data resources in public databases for sheepgrass limited our understanding of the mechanism of environmental adaptations, gene discovery and molecular marker development. Results The transcriptome of sheepgrass was sequenced using Roche 454 pyrosequencing technology. We assembled 952,328 high-quality reads into 87,214 unigenes, including 32,416 contigs and 54,798 singletons. There were 15,450 contigs over 500 bp in length. BLAST searches of our database against Swiss-Prot and NCBI non-redundant protein sequences (nr) databases resulted in the annotation of 54,584 (62.6%) of the unigenes. Gene Ontology (GO) analysis assigned 89,129 GO term annotations for 17,463 unigenes. We identified 11,675 core Poaceae-specific and 12,811 putative sheepgrass-specific unigenes by BLAST searches against all plant genome and transcriptome databases. A total of 2,979 specific freezing-responsive unigenes were found from this RNAseq dataset. We identified 3,818 EST-SSRs in 3,597 unigenes, and some SSRs contained unigenes that were also candidates for freezing-response genes. Characterizations of nucleotide repeats and dominant motifs of SSRs in sheepgrass were also performed. Similarity and phylogenetic analysis indicated that sheepgrass is closely related to barley and wheat. Conclusions This research has greatly enriched sheepgrass transcriptome resources. The identified stress-related genes will help us to decipher the genetic basis of the environmental and ecological adaptations of this species and will be used to improve wheat and barley crops through hybridization or genetic transformation. The EST-SSRs reported here will be a valuable resource for future gene-phenotype studies and for the molecular breeding of sheepgrass and other Poaceae species. PMID:23861841
Chen, Shuangyan; Huang, Xin; Yan, Xueqing; Liang, Ye; Wang, Yuezhu; Li, Xiaofeng; Peng, Xianjun; Ma, Xingyong; Zhang, Lexin; Cai, Yueyue; Ma, Tian; Cheng, Liqin; Qi, Dongmei; Zheng, Huajun; Yang, Xiaohan; Li, Xiaoxia; Liu, Gongshe
2013-01-01
Sheepgrass [Leymus chinensis (Trin.) Tzvel.] is an important perennial forage grass across the Eurasian Steppe and is known for its adaptability to various environmental conditions. However, insufficient data resources in public databases for sheepgrass limited our understanding of the mechanism of environmental adaptations, gene discovery and molecular marker development. The transcriptome of sheepgrass was sequenced using Roche 454 pyrosequencing technology. We assembled 952,328 high-quality reads into 87,214 unigenes, including 32,416 contigs and 54,798 singletons. There were 15,450 contigs over 500 bp in length. BLAST searches of our database against Swiss-Prot and NCBI non-redundant protein sequences (nr) databases resulted in the annotation of 54,584 (62.6%) of the unigenes. Gene Ontology (GO) analysis assigned 89,129 GO term annotations for 17,463 unigenes. We identified 11,675 core Poaceae-specific and 12,811 putative sheepgrass-specific unigenes by BLAST searches against all plant genome and transcriptome databases. A total of 2,979 specific freezing-responsive unigenes were found from this RNAseq dataset. We identified 3,818 EST-SSRs in 3,597 unigenes, and some SSRs contained unigenes that were also candidates for freezing-response genes. Characterizations of nucleotide repeats and dominant motifs of SSRs in sheepgrass were also performed. Similarity and phylogenetic analysis indicated that sheepgrass is closely related to barley and wheat. This research has greatly enriched sheepgrass transcriptome resources. The identified stress-related genes will help us to decipher the genetic basis of the environmental and ecological adaptations of this species and will be used to improve wheat and barley crops through hybridization or genetic transformation. The EST-SSRs reported here will be a valuable resource for future gene-phenotype studies and for the molecular breeding of sheepgrass and other Poaceae species.
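The EST-SSR search summarized in this abstract can be illustrated with a minimal sketch. The abstract does not say which tool or thresholds were used, so the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration only; the sketch simply scans a nucleotide sequence for perfect tandem repeats of 2 to 6 bp motifs with a regular expression.

```python
import re

# Hypothetical thresholds (motif length -> minimum number of tandem copies).
# These values are an assumption for illustration, not taken from the abstract.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AA" or "ATAT"), so each SSR is reported once.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "AT" * 7 + "CC" + "GAA" * 5 + "T"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

Dedicated SSR tools also handle compound and imperfect repeats and design flanking primers; this sketch covers only the perfect-repeat case.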
Kukekova, Anna V; Johnson, Jennifer L; Teiling, Clotilde; Li, Lewyn; Oskina, Irina N; Kharlamova, Anastasiya V; Gulevich, Rimma G; Padte, Ravee; Dubreuil, Michael M; Vladimirova, Anastasiya V; Shepeleva, Darya V; Shikhevich, Svetlana G; Sun, Qi; Ponnala, Lalit; Temnykh, Svetlana V; Trut, Lyudmila N; Acland, Gregory M
2011-10-03
Two strains of the silver fox (Vulpes vulpes), with markedly different behavioral phenotypes, have been developed by long-term selection for behavior. Foxes from the tame strain exhibit friendly behavior towards humans, paralleling the sociability of canine puppies, whereas foxes from the aggressive strain are defensive and exhibit aggression to humans. To understand the genetic differences underlying these behavioral phenotypes fox-specific genomic resources are needed. cDNA from mRNA from pre-frontal cortex of a tame and an aggressive fox was sequenced using the Roche 454 FLX Titanium platform (> 2.5 million reads & 0.9 Gbase of tame fox sequence; >3.3 million reads & 1.2 Gbase of aggressive fox sequence). Over 80% of the fox reads were assembled into contigs. Mapping fox reads against the fox transcriptome assembly and the dog genome identified over 30,000 high confidence fox-specific SNPs. Fox transcripts for approximately 14,000 genes were identified using SwissProt and the dog RefSeq databases. An at least 2-fold expression difference between the two samples (p < 0.05) was observed for 335 genes, fewer than 3% of the total number of genes identified in the fox transcriptome. Transcriptome sequencing significantly expanded genomic resources available for the fox, a species without a sequenced genome. In a very cost efficient manner this yielded a large number of fox-specific SNP markers for genetic studies and provided significant insights into the gene expression profile of the fox pre-frontal cortex; expression differences between the two fox samples; and a catalogue of potentially important gene-specific sequence variants. This result demonstrates the utility of this approach for developing genomic resources in species with limited genomic information.
2011-01-01
Background Two strains of the silver fox (Vulpes vulpes), with markedly different behavioral phenotypes, have been developed by long-term selection for behavior. Foxes from the tame strain exhibit friendly behavior towards humans, paralleling the sociability of canine puppies, whereas foxes from the aggressive strain are defensive and exhibit aggression to humans. To understand the genetic differences underlying these behavioral phenotypes fox-specific genomic resources are needed. Results cDNA from mRNA from pre-frontal cortex of a tame and an aggressive fox was sequenced using the Roche 454 FLX Titanium platform (> 2.5 million reads & 0.9 Gbase of tame fox sequence; >3.3 million reads & 1.2 Gbase of aggressive fox sequence). Over 80% of the fox reads were assembled into contigs. Mapping fox reads against the fox transcriptome assembly and the dog genome identified over 30,000 high confidence fox-specific SNPs. Fox transcripts for approximately 14,000 genes were identified using SwissProt and the dog RefSeq databases. An at least 2-fold expression difference between the two samples (p < 0.05) was observed for 335 genes, fewer than 3% of the total number of genes identified in the fox transcriptome. Conclusions Transcriptome sequencing significantly expanded genomic resources available for the fox, a species without a sequenced genome. In a very cost efficient manner this yielded a large number of fox-specific SNP markers for genetic studies and provided significant insights into the gene expression profile of the fox pre-frontal cortex; expression differences between the two fox samples; and a catalogue of potentially important gene-specific sequence variants. This result demonstrates the utility of this approach for developing genomic resources in species with limited genomic information. PMID:21967120
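The selection rule reported in this abstract (at least a 2-fold expression difference between the two samples at p < 0.05) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the function name `two_fold_genes`, the pseudocount, and the input layout are assumptions, and the p-values are taken as already computed by whatever statistical test was used.

```python
import math


def two_fold_genes(expr, alpha=0.05, min_fold=2.0, pseudo=1.0):
    """Select genes with >= min_fold change in either direction and p < alpha.

    expr maps gene -> (tame_value, aggressive_value, p_value).
    The pseudocount (an assumption) avoids division by zero for unexpressed genes.
    Returns gene -> log2 fold change (aggressive relative to tame).
    """
    threshold = math.log2(min_fold)
    selected = {}
    for gene, (tame, aggressive, p_value) in expr.items():
        log2fc = math.log2((aggressive + pseudo) / (tame + pseudo))
        if abs(log2fc) >= threshold and p_value < alpha:
            selected[gene] = log2fc
    return selected


if __name__ == "__main__":
    # Hypothetical expression values, for illustration only.
    demo = {
        "g1": (9, 39, 0.01),    # 4-fold up, significant
        "g2": (10, 12, 0.001),  # significant but below 2-fold
        "g3": (3, 15, 0.2),     # 4-fold up, not significant
        "g4": (31, 7, 0.04),    # 4-fold down, significant
    }
    print(two_fold_genes(demo))
```

Working on the log2 scale makes up- and down-regulation symmetric, so a single absolute-value threshold covers both directions.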
Sudhagar, Arun; El-Matbouli, Mansour
2018-01-01
In recent years, with the advent of next-generation sequencing along with the development of various bioinformatics tools, RNA sequencing (RNA-Seq)-based transcriptome analysis has become much more affordable in the field of biological research. This technique has even opened up avenues to explore the transcriptome of non-model organisms for which a reference genome is not available. As a result, fish health researchers have increasingly adopted this technology to understand pathogenic processes and immune reactions in fish during infection. Recent studies using this technology have altered and updated the previous understanding of many diseases in fish. RNA-Seq has been employed in the understanding of fish pathogens such as bacteria, viruses, parasites, and oomycetes. It has also been helpful in unraveling the immune mechanisms in fish. Additionally, RNA-Seq technology has paved the way for future work, such as genetic linkage mapping, quantitative trait analysis, disease-resistant strain or broodstock selection, and the development of effective vaccines and therapies. To date, no reviews have comprehensively summarized the studies that used RNA-Seq to explore the mechanisms of infection of pathogens and the defense strategies of fish hosts. This review aims to summarize the contemporary understanding and findings with regard to infectious pathogens and the immune system of fish that have been achieved through RNA-Seq technology. PMID:29342931
Zenoni, Sara; D'Agostino, Nunzio; Tornielli, Giovanni B; Quattrocchio, Francesca; Chiusano, Maria L; Koes, Ronald; Zethof, Jan; Guzzo, Flavia; Delledonne, Massimo; Frusciante, Luigi; Gerats, Tom; Pezzotti, Mario
2011-10-01
Petunia is an excellent model system, especially for genetic, physiological and molecular studies. Thus far, however, genome-wide expression analysis has been applied rarely because of the lack of sequence information. We applied next-generation sequencing to generate, through de novo read assembly, a large catalogue of transcripts for Petunia axillaris and Petunia inflata. On the basis of both transcriptomes, comprehensive microarray chips for gene expression analysis were established and used for the analysis of global- and organ-specific gene expression in Petunia axillaris and Petunia inflata and to explore the molecular basis of the seed coat defects in a Petunia hybrida mutant, anthocyanin 11 (an11), lacking a WD40-repeat (WDR) transcription regulator. Among the transcripts differentially expressed in an11 seeds compared with wild type, many expected targets of AN11 were found but also several interesting new candidates that might play a role in morphogenesis of the seed coat. Our results validate the combination of next-generation sequencing with microarray analyses strategies to identify the transcriptome of two petunia species without previous knowledge of their genome, and to develop comprehensive chips as useful tools for the analysis of gene expression in P. axillaris, P. inflata and P. hybrida. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.
De novo transcriptome assemblies of four xylem sap-feeding insects
Tassone, Erica E.; Cowden, Charles C.
2017-01-01
Abstract Background: Spittlebugs and sharpshooters are well-known xylem sap-feeding insects and vectors of the phytopathogenic bacterium Xylella fastidiosa (Wells), a causal agent of Pierce's disease of grapevines and other crop diseases. Specialized feeding on nutrient-deficient xylem sap is relatively rare among insect herbivores, and only limited genomic and transcriptomic information has been generated for xylem-sap feeders. To develop a more comprehensive understanding of biochemical adaptations and symbiotic relationships that support survival on a nutritionally austere dietary source, transcriptome assemblies for three sharpshooter species and one spittlebug species were produced. Findings: Trinity-based de novo transcriptome assemblies were generated for all four xylem-sap feeders using raw sequencing data originating from whole-insect preps. Total transcripts for each species ranged from 91 384 for Cuerna arida to 106 998 for Homalodisca liturata with transcript totals for Graphocephala atropunctata and the spittlebug Clastoptera arizonana falling in between. The percentage of transcripts comprising complete open reading frames ranged from 60% for H. liturata to 82% for C. arizonana. Benchmarking Universal Single-Copy Orthologs (BUSCO) analyses for each dataset indicated quality assemblies and a high degree of completeness for all four species. Conclusions: These four transcriptomes represent a significant expansion of data for insect herbivores that feed exclusively on xylem sap, a nutritionally deficient dietary source relative to other plant tissues and fluids. Comparison of transcriptome data with insect herbivores that utilize other dietary sources may illuminate fundamental differences in the biochemistry of dietary specialization. PMID:28327966
Characterization of Chiton Ischnochiton hakodadensis Foot Based on Transcriptome Sequencing
NASA Astrophysics Data System (ADS)
Dou, Huaiqian; Miao, Yan; Li, Yuli; Li, Yangping; Dai, Xiaoting; Zhang, Xiaokang; Liang, Pengyu; Liu, Weizhi; Wang, Shi; Bao, Zhenmin
2018-06-01
The chiton Ischnochiton hakodadensis is a marine mollusk well known for its eight separate shell plates, and it plays a vital role in the ecosystems it inhabits. So far, genetic studies on this chiton are scarce, due in part to the insufficient genomic resources available for the species. In this study, we investigated the transcriptome of the chiton foot using Illumina sequencing technology. The reads were assembled and clustered into 256461 unigenes, of which 42247 were divided into diverse functional categories by Gene Ontology (GO) annotation terms, and 17256 mapped onto 365 pathways by KEGG pathway mapping. Meanwhile, a set of differentially expressed genes (DEGs) between distal and proximal muscles was identified as associated with the adhesive locomotion of the foot, and should be useful for future studies. Moreover, up to 679384 high-quality single nucleotide polymorphisms (SNPs) and 19814 simple sequence repeats (SSRs) were identified in this study, which are valuable for subsequent studies on genetic diversity and variation. The transcriptomic resource obtained in this study should aid future genetic and genomic studies of chiton.
Comprehensive benefit analysis of regional water resources based on multi-objective evaluation
NASA Astrophysics Data System (ADS)
Chi, Yixia; Xue, Lianqing; Zhang, Hui
2018-01-01
The purpose of a comprehensive benefit analysis of water resources is to maximize the combined social, economic and ecological-environmental benefits. To address the shortcomings of the traditional analytic hierarchy process (AHP) in water resources evaluation, this study proposes a comprehensive benefit evaluation index covering the social, economic and environmental systems; determines the index weights with an improved fuzzy AHP; calculates the relative index of comprehensive water resources benefit; and analyzes the comprehensive benefit of water resources in Xiangshui County with a multi-objective evaluation model. Based on water resources data for Xiangshui County, 20 main comprehensive benefit assessment factors were evaluated for 5 districts of the county. The results showed that the comprehensive benefit of Xiangshui County was 0.7317, and that the social economy still has room for further development under the current water resources situation.
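The weighting-and-aggregation idea in this abstract can be sketched as follows. The abstract does not specify its improved fuzzy AHP, so this sketch swaps in plain (crisp) AHP with the row geometric mean approximation for the weights, followed by a weighted sum for the composite index. The function names, the pairwise comparison matrix and the factor scores are all hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric mean method.

    pairwise[i][j] states how much more important factor i is than factor j;
    a reciprocal matrix (pairwise[j][i] == 1 / pairwise[i][j]) is assumed.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def composite_index(scores, weights):
    """Weighted sum of normalized factor scores (each score in [0, 1])."""
    return sum(s * w for s, w in zip(scores, weights))


if __name__ == "__main__":
    # Hypothetical comparison of social, economic and environmental benefits.
    matrix = [
        [1.0, 2.0, 4.0],
        [0.5, 1.0, 2.0],
        [0.25, 0.5, 1.0],
    ]
    weights = ahp_weights(matrix)
    # Hypothetical normalized scores for one district.
    print(weights, composite_index([0.8, 0.6, 0.7], weights))
```

A fuzzy variant would replace each crisp comparison with a fuzzy number (for example a triangular one) before deriving weights; that step is omitted here because the abstract gives no details of it.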
Detection and Reconstruction of Circular RNAs from Transcriptomic Data.
Zheng, Yi; Zhao, Fangqing
2018-01-01
Recent studies have shown that circular RNAs (circRNAs) are a novel class of abundant, stable, and ubiquitous noncoding RNA molecules in eukaryotic organisms. Comprehensive detection and reconstruction of circRNAs from high-throughput transcriptome data is an initial step in studying their biogenesis and function. Several tools have been developed to deal with this issue, but they require many steps and are difficult to use. To solve this problem, we provide a protocol for researchers to detect and reconstruct circRNAs by employing CIRI2, CIRI-AS, and CIRI-full. This protocol not only simplifies the usage of the above tools but also integrates their results.
Chana-Munoz, Andres; Jendroszek, Agnieszka; Sønnichsen, Malene; Kristiansen, Rune; Jensen, Jan K; Andreasen, Peter A; Bendixen, Christian; Panitz, Frank
2017-01-01
The spiny dogfish shark (Squalus acanthias) is one of the most commonly used cartilaginous fishes in biological research, especially in the fields of nitrogen metabolism, ion transporters and osmoregulation. Nonetheless, transcriptomic data for this organism is scarce. In the present study, a multi-tissue RNA-seq experiment and de novo transcriptome assembly was performed in four different spiny dogfish tissues (brain, liver, kidney and ovary), providing an annotated sequence resource. The characterization of the transcriptome greatly increases the scarce sequence information for shark species. Reads were assembled with the Trinity de novo assembler both within each tissue and across all tissues combined resulting in 362,690 transcripts in the combined assembly which represent 289,515 Trinity genes. BUSCO analysis determined a level of 87% completeness for the combined transcriptome. In total, 123,110 proteins were predicted of which 78,679 and 83,164 had significant hits against the SwissProt and Uniref90 protein databases, respectively. Additionally, 61,215 proteins aligned to known protein domains, 7,208 carried a signal peptide and 15,971 possessed at least one transmembrane region. Based on the annotation, 81,582 transcripts were assigned to gene ontology terms and 42,078 belong to known clusters of orthologous groups (eggNOG). To demonstrate the value of our molecular resource, we show that the improved transcriptome data enhances the current possibilities of osmoregulation research in spiny dogfish by utilizing the novel gene and protein annotations to investigate a set of genes involved in urea synthesis and urea, ammonia and water transport, all of them crucial in osmoregulation. We describe the presence of different gene copies and isoforms of key enzymes involved in this process, including arginases and transporters of urea and ammonia, for which sequence information is currently absent in the databases for this model species. 
The transcriptome assemblies and the derived annotations generated in this study will support the ongoing research for this particular animal model and provides a new molecular tool to assist biological research in cartilaginous fishes.
Chana-Munoz, Andres; Jendroszek, Agnieszka; Sønnichsen, Malene; Kristiansen, Rune; Jensen, Jan K.; Bendixen, Christian
2017-01-01
The spiny dogfish shark (Squalus acanthias) is one of the most commonly used cartilaginous fishes in biological research, especially in the fields of nitrogen metabolism, ion transporters and osmoregulation. Nonetheless, transcriptomic data for this organism is scarce. In the present study, a multi-tissue RNA-seq experiment and de novo transcriptome assembly was performed in four different spiny dogfish tissues (brain, liver, kidney and ovary), providing an annotated sequence resource. The characterization of the transcriptome greatly increases the scarce sequence information for shark species. Reads were assembled with the Trinity de novo assembler both within each tissue and across all tissues combined resulting in 362,690 transcripts in the combined assembly which represent 289,515 Trinity genes. BUSCO analysis determined a level of 87% completeness for the combined transcriptome. In total, 123,110 proteins were predicted of which 78,679 and 83,164 had significant hits against the SwissProt and Uniref90 protein databases, respectively. Additionally, 61,215 proteins aligned to known protein domains, 7,208 carried a signal peptide and 15,971 possessed at least one transmembrane region. Based on the annotation, 81,582 transcripts were assigned to gene ontology terms and 42,078 belong to known clusters of orthologous groups (eggNOG). To demonstrate the value of our molecular resource, we show that the improved transcriptome data enhances the current possibilities of osmoregulation research in spiny dogfish by utilizing the novel gene and protein annotations to investigate a set of genes involved in urea synthesis and urea, ammonia and water transport, all of them crucial in osmoregulation. We describe the presence of different gene copies and isoforms of key enzymes involved in this process, including arginases and transporters of urea and ammonia, for which sequence information is currently absent in the databases for this model species. 
The transcriptome assemblies and the derived annotations generated in this study will support the ongoing research for this particular animal model and provide a new molecular tool to assist biological research in cartilaginous fishes. PMID:28832628
2013-01-01
Background The grain aphid (Sitobion avenae F.) is a major agricultural pest which causes significant yield losses of wheat in China, Europe and North America annually. Transcriptome profiling of the grain aphid alimentary canal after feeding on wheat plants could provide comprehensive gene expression information involved in feeding, ingestion and digestion. Furthermore, selection of aphid-specific RNAi target genes would be essential for utilizing a plant-mediated RNAi strategy to control aphids via a non-toxic mode of action. However, due to the tiny size of the alimentary canal and lack of genomic information on grain aphid as a whole, selection of the RNAi targets is a challenging task that, as far as we are aware, has never been documented previously. Results In this study, we performed de novo transcriptome assembly and gene expression analyses of the alimentary canals of grain aphids before and after feeding on wheat plants using Illumina RNA sequencing. The transcriptome profiling generated 30,427 unigenes with an average length of 664 bp. Furthermore, comparison of the transcriptomes of alimentary canals of pre- and post-feeding grain aphids indicated that 5490 unigenes were differentially expressed, among which diverse genes and/or pathways were identified and annotated. Based on the RPKM values of these unigenes, 16 of them that were significantly up- or down-regulated upon feeding were selected for a dsRNA artificial feeding assay. Of these, 5 unigenes led to higher mortality and developmental stunting in an artificial feeding assay due to the down-regulation of the target gene expression. Finally, by adding fluorescently labelled dsRNA into the artificial diet, the spread of fluorescence signal in the whole body tissues of grain aphid was observed.
Conclusions Comparison of the transcriptome profiles of the alimentary canals of pre- and post-feeding grain aphids on wheat plants provided comprehensive gene expression information that could facilitate our understanding of the molecular mechanisms underlying feeding, ingestion and digestion. Furthermore, five novel and effective potential RNAi target genes were identified in grain aphid for the first time. This finding would provide a fundamental basis for aphid control in wheat through plant mediated RNAi strategy. PMID:23957588
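The RPKM values mentioned in the abstract above follow a standard normalization (reads per kilobase of transcript per million mapped reads). The following is a minimal Python sketch of that formula only; the function name and the example read counts are hypothetical and are not taken from the study, which does not describe its own implementation.

```python
def rpkm(read_count: int, length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (length_bp * total_mapped_reads)
    """
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length_bp and total_mapped_reads must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical example: one 664 bp unigene in two libraries of 10 million
# mapped reads each (illustrative numbers, not data from the abstract).
pre_feeding = rpkm(500, 664, 10_000_000)
post_feeding = rpkm(2_000, 664, 10_000_000)
print(f"pre: {pre_feeding:.2f}  post: {post_feeding:.2f}")
```

Because RPKM divides by both transcript length and library size, values can be compared between unigenes of different lengths and between libraries of different depths, which is what makes a pre- versus post-feeding comparison possible.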
High-Throughput Sequencing and De Novo Assembly of the Isatis indigotica Transcriptome
Tang, Xiaoqing; Xiao, Yunhua; Lv, Tingting; Wang, Fangquan; Zhu, QianHao; Zheng, Tianqing; Yang, Jie
2014-01-01
Background Isatis indigotica, the source of the traditional Chinese medicine Radix isatidis (Ban-Lan-Gen), is an extremely important economic crop in China. To facilitate biological, biochemical and molecular research on the medicinal chemicals in I. indigotica, here we report the first I. indigotica transcriptome generated by RNA sequencing (RNA-seq). Results An RNA-seq library was created using RNA extracted from a mixed sample including leaf and root. A total of 33,238 unigenes were assembled from more than 28 million high-quality short reads. The quality of the assembly was experimentally examined by cDNA sequencing of seven randomly selected unigenes. Based on BLAST searches, 28,184 unigenes had a hit in at least one of the protein and nucleotide databases used in this study, and 8 unigenes were found to be associated with biosynthesis of indole and its derivatives. According to Gene Ontology classification, 22,365 unigenes were categorized into 48 functional groups. Furthermore, Clusters of Orthologous Groups and Swiss-Prot annotations were assigned to 7,707 and 18,679 unigenes, respectively. Analysis of repeat motifs identified 6,400 simple sequence repeat markers in 4,509 unigenes. Conclusion Our data provide a comprehensive sequence resource for molecular study of I. indigotica. Our results will facilitate studies on the functions of genes involved in the indole alkaloid biosynthesis pathway and on metabolism of nitrogen and indole alkaloids in I. indigotica and its related species. PMID:25259890
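The simple sequence repeat (SSR) analysis mentioned above can be illustrated with a small regular-expression scan. This is a sketch under stated assumptions, not the tool or the thresholds used in the study: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices, and only perfect di- to hexanucleotide repeats are reported.

```python
import re

# Minimum number of tandem copies per motif length.
# Assumed thresholds for illustration only; the abstract does not state them.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ACAC"), so each repeat is reported once at most.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical unigene fragment containing six copies of "AC".
print(find_ssrs("TTG" + "AC" * 6 + "GGT"))
```

Dedicated SSR tools additionally handle compound and imperfect repeats and mononucleotide runs; the sketch leaves those out to keep the core idea visible.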
Phylogenetic Origin and Diversification of RNAi Pathway Genes in Insects.
Dowling, Daniel; Pauli, Thomas; Donath, Alexander; Meusemann, Karen; Podsiadlowski, Lars; Petersen, Malte; Peters, Ralph S; Mayer, Christoph; Liu, Shanlin; Zhou, Xin; Misof, Bernhard; Niehuis, Oliver
2016-12-01
RNA interference (RNAi) refers to the set of molecular processes found in eukaryotic organisms in which small RNA molecules mediate the silencing or down-regulation of target genes. In insects, RNAi serves a number of functions, including regulation of endogenous genes, anti-viral defense, and defense against transposable elements. Despite being well studied in model organisms, such as Drosophila, the distribution of core RNAi pathway genes and their evolution in insects is not well understood. Here we present the most comprehensive overview of the distribution and diversity of core RNAi pathway genes across 100 insect species, encompassing all currently recognized insect orders. We inferred the phylogenetic origin of insect-specific RNAi pathway genes and also identified several hitherto unrecorded gene expansions using whole-body transcriptome data from the international 1KITE (1000 Insect Transcriptome Evolution) project as well as other resources such as i5K (5000 Insect Genome Project). Specifically, we traced the origin of the double stranded RNA binding protein R2D2 to the last common ancestor of winged insects (Pterygota), the loss of Sid-1/Tag-130 orthologs in Antliophora (fleas, flies and relatives, and scorpionflies in a broad sense), and confirm previous evidence for the splitting of the Argonaute proteins Aubergine and Piwi in Brachyceran flies (Diptera, Brachycera). Our study offers new reference points for future experimental research on RNAi-related pathway genes in insects. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Pardo, Belén G; Álvarez-Dios, José Antonio; Cao, Asunción; Ramilo, Andrea; Gómez-Tato, Antonio; Planas, Josep V; Villalba, Antonio; Martínez, Paulino
2016-12-01
The flat oyster, Ostrea edulis, is one of the main farmed oysters, not only in Europe but also in the United States and Canada. Bonamiosis due to the parasite Bonamia ostreae has been associated with high mortality episodes in this species. This parasite is an intracellular protozoan that infects haemocytes, the main cells involved in oyster defence. Due to the economic and ecological importance of flat oyster, genomic data are badly needed for genetic improvement of the species, but they are still very scarce. The objective of this study is to develop a sequence database, OedulisDB, with new genomic and transcriptomic resources, providing new data and convenient tools to improve our knowledge of the oyster's immune mechanisms. Transcriptomic and genomic sequences were obtained using 454 pyrosequencing and compiled into an O. edulis database, OedulisDB, consisting of two sets of 10,318 and 7159 unique sequences that represent the oyster's genome (WG) and de novo haemocyte transcriptome (HT), respectively. The flat oyster transcriptome was obtained from two strains (naïve and tolerant) challenged with B. ostreae, and from their corresponding non-challenged controls. Approximately 78.5% of 5619 HT unique sequences were successfully annotated by Blast search using public databases. A total of 984 sequences were identified as being related to immune response and several key immune genes were identified for the first time in flat oyster. Additionally, transcriptome information was used to design and validate the first oligo-microarray in flat oyster enriched with immune sequences from haemocytes. Our transcriptomic and genomic sequencing and subsequent annotation have largely increased the scarce resources available for this economically important species and have enabled us to develop an OedulisDB database and accompanying tools for gene expression analysis. This study represents the first attempt to characterize in depth the O. edulis haemocyte transcriptome in response to B. ostreae through massive sequencing and has helped to improve our knowledge of the immune mechanisms of flat oyster. The validated oligo-microarray and the establishment of a reference transcriptome will be useful for large-scale gene expression studies in this species. Copyright © 2016 Elsevier Ltd. All rights reserved.
Valencia, Arnubio; Wang, Haichuan; Soto, Alberto; Aristizabal, Manuel; Arboleda, Jorge W.; Eyun, Seong-il; Noriega, Daniel D.; Siegfried, Blair
2016-01-01
The banana weevil Cosmopolites sordidus is an important and serious insect pest in most banana and plantain-growing areas of the world. In spite of the economic importance of this insect pest very little genomic and transcriptomic information exists for this species. In the present study, we characterized the midgut transcriptome of C. sordidus using massive 454-pyrosequencing. We generated over 590,000 sequencing reads that assembled into 30,840 contigs with more than 400 bp, representing a significant expansion of existing sequences available for this insect pest. Among them, 16,427 contigs contained one or more GO terms. In addition, 15,263 contigs were assigned an EC number. In-depth transcriptome analysis identified genes potentially involved in insecticide resistance, peritrophic membrane biosynthesis, immunity-related function and defense against pathogens, and Bacillus thuringiensis toxins binding proteins as well as multiple enzymes involved with protein digestion. This transcriptome will provide a valuable resource for understanding larval physiology and for identifying novel target sites and management approaches for this important insect pest. PMID:26949943
Toullec, Jean-Yves; Corre, Erwan; Bernay, Benoît; Thorne, Michael A. S.; Cascella, Kévin; Ollivaux, Céline; Henry, Joël; Clark, Melody S.
2013-01-01
Background The ice krill, Euphausia crystallorophias, is one of the species at the base of the Southern Ocean food chain. Given their significant contribution to the biomass of the Southern Ocean, it is vitally important to gain a better understanding of their physiology and, in particular, anticipate their responses to climate change effects in the warming seas around Antarctica. Methodology/Principal Findings Illumina sequencing was used to produce a transcriptome of the ice krill. Analysis of the assembled contigs via two different methods produced 36 new pre-pro-peptides, coding for 61 neuropeptides or peptide hormones belonging to the following families: Allatostatins (A, B and C), Bursicon (α and β), Crustacean Hyperglycemic Hormones (CHH and MIH/VIHs), Crustacean Cardioactive Peptide (CCAP), Corazonin, Diuretic Hormones (DH), the Eclosion Hormone (EH), Neuroparsin, Neuropeptide F (NPF), small Neuropeptide F (sNPF), Pigment Dispersing Hormone (PDH), Red Pigment Concentrating Hormone (RPCH) and finally Tachykinin. LC/MS/MS proteomics was also carried out on eyestalk extracts, which are the major site of neuropeptide synthesis in decapod crustaceans. Results confirmed the presence of six neuropeptides and six precursor-related peptides previously identified in the transcriptome analyses. Conclusions This study represents the first comprehensive analysis of neuropeptide hormones in a Eucarida non-decapod Malacostraca, several of which are described for the first time in a non-decapod crustacean. Additionally, there is a potential expansion of PDH and Neuropeptide F family members, which may reflect certain life history traits such as circadian rhythms associated with diurnal migrations, and mass spectrometry confirmed several novel pre-pro-peptides of unknown function.
Knowledge of these essential hormones provides a vital framework for understanding the physiological response of this key Southern Ocean species to climate change and provides a valuable resource for studies into the molecular phylogeny of these organisms and the evolution of neuropeptide hormones. PMID:23990964
Lv, Jianjian; Liu, Ping; Wang, Yu; Gao, Baoquan; Chen, Ping; Li, Jian
2013-01-01
Background The swimming crab, Portunus trituberculatus, which is naturally distributed in the coastal waters of Asia-Pacific countries, is an important farmed species in China. Salinity is one of the most important abiotic factors influencing the distribution and abundance of crustaceans, and it is also an important factor for artificial propagation of the crab. To better understand the interaction between salinity stress and osmoregulation, we performed a transcriptome analysis in the gills of Portunus trituberculatus challenged with salinity stress, using the Illumina Deep Sequencing technology. Results We obtained 27,696,835, 28,268,353 and 33,901,271 qualified Illumina read pairs from low salinity challenged (LC), non-challenged (NC), and high salinity challenged (HC) Portunus trituberculatus cDNA libraries, respectively. The overall de novo assembly of cDNA sequence data generated 94,511 unigenes, with an average length of 644 bp. Comparative genomic analysis revealed that 1,705 genes were differentially expressed under salinity stress compared to the controls, including 615 and 1,516 unigenes in NC vs LC and NC vs HC, respectively. GO functional enrichment analysis showed that some differentially expressed genes were involved in crucial processes related to osmoregulation, such as ion transport processes, amino acid metabolism and synthesis processes, proteolysis process and chitin metabolic process. Conclusion This work represents the first report of the utilization of next generation sequencing techniques for transcriptome analysis in Portunus trituberculatus and provides valuable information on the mechanism of salinity adaptation. Results reveal a substantial number of genes modified by salinity stress and a few important salinity acclimation pathways, which will serve as an invaluable resource for revealing the molecular basis of osmoregulation in Portunus trituberculatus.
In addition, the most comprehensive sequences of transcripts reported in this study provide a rich source for identification of novel genes in the crab. PMID:24312639
The transcriptome of the intraerythrocytic developmental cycle of Plasmodium falciparum.
Bozdech, Zbynek; Llinás, Manuel; Pulliam, Brian Lee; Wong, Edith D; Zhu, Jingchun; DeRisi, Joseph L
2003-10-01
Plasmodium falciparum is the causative agent of the most burdensome form of human malaria, affecting 200-300 million individuals per year worldwide. The recently sequenced genome of P. falciparum revealed over 5,400 genes, of which 60% encode proteins of unknown function. Insights into the biochemical function and regulation of these genes will provide the foundation for future drug and vaccine development efforts toward eradication of this disease. By analyzing the complete asexual intraerythrocytic developmental cycle (IDC) transcriptome of the HB3 strain of P. falciparum, we demonstrate that at least 60% of the genome is transcriptionally active during this stage. Our data demonstrate that this parasite has evolved an extremely specialized mode of transcriptional regulation that produces a continuous cascade of gene expression, beginning with genes corresponding to general cellular processes, such as protein synthesis, and ending with Plasmodium-specific functionalities, such as genes involved in erythrocyte invasion. The data reveal that genes contiguous along the chromosomes are rarely coregulated, while transcription from the plastid genome is highly coregulated and likely polycistronic. Comparative genomic hybridization between HB3 and the reference genome strain (3D7) was used to distinguish between genes not expressed during the IDC and genes not detected because of possible sequence variations. Genomic differences between these strains were found almost exclusively in the highly antigenic subtelomeric regions of chromosomes. The simple cascade of gene regulation that directs the asexual development of P. falciparum is unprecedented in eukaryotic biology. The transcriptome of the IDC resembles a "just-in-time" manufacturing process whereby induction of any given gene occurs once per cycle and only at a time when it is required. 
These data provide to our knowledge the first comprehensive view of the timing of transcription throughout the intraerythrocytic development of P. falciparum and provide a resource for the identification of new chemotherapeutic and vaccine candidates.
Strand-specific transcriptome profiling with directly labeled RNA on genomic tiling microarrays
2011-01-01
Background With lower manufacturing cost, high spot density, and flexible probe design, genomic tiling microarrays are ideal for comprehensive transcriptome studies. Typically, transcriptome profiling using microarrays involves reverse transcription, which converts RNA to cDNA. The cDNA is then labeled and hybridized to the probes on the arrays, thus the RNA signals are detected indirectly. Reverse transcription is known to generate artifactual cDNA, in particular the synthesis of second-strand cDNA, leading to false discovery of antisense RNA. To address this issue, we have developed an effective method using RNA that is directly labeled, thus by-passing the cDNA generation. This paper describes this method and its application to the mapping of transcriptome profiles. Results RNA extracted from laboratory cultures of Porphyromonas gingivalis was fluorescently labeled with an alkylation reagent and hybridized directly to probes on genomic tiling microarrays specifically designed for this periodontal pathogen. The generated transcriptome profile was strand-specific and produced signals close to background level in most antisense regions of the genome. In contrast, high levels of signal were detected in the antisense regions when the hybridization was done with cDNA. Five antisense areas were tested with independent strand-specific RT-PCR and none to negligible amplification was detected, indicating that the strong antisense cDNA signals were experimental artifacts. Conclusions An efficient method was developed for mapping transcriptome profiles specific to both coding strands of a bacterial genome. This method chemically labels and uses extracted RNA directly in microarray hybridization. The generated transcriptome profile was free of cDNA artifactual signals. 
In addition, this method requires fewer processing steps and is potentially more sensitive in detecting small amounts of RNA compared to conventional end-labeling methods due to the incorporation of more fluorescent molecules per RNA fragment. PMID:21235785
Joyce, Blake L.; Haug-Baltzell, Asher K.; Hulvey, Jonathan P.; McCarthy, Fiona; Devisetty, Upendra Kumar; Lyons, Eric
2017-01-01
This workflow allows novice researchers to leverage advanced computational resources such as cloud computing to carry out pairwise comparative transcriptomics. It also serves as a primer for biologists to develop data-science computational skills, e.g. executing bash commands, visualization and management of large data sets. All command line code and further explanations of each command or step can be found on the wiki (https://wiki.cyverse.org/wiki/x/dgGtAQ). The Discovery Environment and Atmosphere platforms are connected together through the CyVerse Data Store. As such, once the initial raw sequencing data has been uploaded there is no more need to transfer large data files over an Internet connection, minimizing the amount of time needed to conduct analyses. This protocol is designed to analyze only two experimental treatments or conditions. Differential gene expression analysis is conducted through pairwise comparisons, and will not be suitable to test multiple factors. This workflow is also designed to be manual rather than automated. Each step must be executed and investigated by the user, yielding a better understanding of data and analytical outputs, and therefore better results for the user. Once complete, this protocol will yield de novo assembled transcriptome(s) for underserved (non-model) organisms without the need to map to previously assembled reference genomes (which are usually not available for underserved organisms). These de novo transcriptomes are further used in pairwise differential gene expression analysis to investigate genes differing between two experimental conditions. Differentially expressed genes are then functionally annotated to understand the genetic response organisms have to experimental conditions. In total, the data derived from this protocol is used to test hypotheses about biological responses of underserved organisms. PMID:28518075
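The pairwise comparison described above can be pictured as computing, for each gene, the ratio of expression between the two conditions. The Python sketch below shows only that core idea as a log2 fold change with a pseudocount; it is not the protocol's own code (which is on the linked wiki), and the function names, the pseudocount and the threshold are hypothetical. A real differential expression analysis also needs replicates and a statistical test.

```python
import math


def log2_fold_change(control: float, treatment: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of treatment over control; the pseudocount avoids division by zero."""
    return math.log2((treatment + pseudocount) / (control + pseudocount))


def flag_candidates(expression: dict, threshold: float = 1.0) -> dict:
    """Keep genes whose absolute log2 fold change reaches the threshold.

    `expression` maps a gene id to (control, treatment) normalized values.
    """
    flagged = {}
    for gene, (control, treatment) in expression.items():
        lfc = log2_fold_change(control, treatment)
        if abs(lfc) >= threshold:
            flagged[gene] = lfc
    return flagged


# Hypothetical normalized expression values for two conditions.
example = {"gene_a": (3.0, 7.0), "gene_b": (10.0, 11.0), "gene_c": (15.0, 3.0)}
print(flag_candidates(example))
```

With exactly two conditions, as the protocol requires, one such comparison covers the whole experiment; designs with several factors would need a model-based approach instead.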
Hussain, Tajammul; Plunkett, Blue; Ejaz, Mahwish; Espley, Richard V.; Kayser, Oliver
2018-01-01
The liverwort Radula marginata belongs to the bryophyte division of land plants and is a prospective alternate source of cannabinoid-like compounds. However, mechanistic insights into the molecular pathways directing the synthesis of these cannabinoid-like compounds have been hindered due to the lack of genetic information. This prompted us to do deep sequencing, de novo assembly and annotation of R. marginata transcriptome, which resulted in the identification and validation of the genes for cannabinoid biosynthetic pathway. In total, we have identified 11,421 putative genes encoding 1,554 enzymes from 145 biosynthetic pathways. Interestingly, we have identified all the upstream genes of the central precursor of cannabinoid biosynthesis, cannabigerolic acid (CBGA), including its two first intermediates, stilbene acid (SA) and geranyl diphosphate (GPP). Expression of all these genes was validated using quantitative real-time PCR. We have characterized the protein structure of stilbene synthase (STS), which is considered as a homolog of olivetolic acid in R. marginata. Moreover, the metabolomics approach enabled us to identify CBGA-analogous compounds using electrospray ionization mass spectrometry (ESI-MS/MS) and gas chromatography mass spectrometry (GC-MS). Transcriptomic analysis revealed 1085 transcription factors (TF) from 39 families. Comparative analysis showed that six TF families have been uniquely predicted in R. marginata. In addition, the bioinformatics analysis predicted a large number of simple sequence repeats (SSRs) and non-coding RNAs (ncRNAs). Our results collectively provide mechanistic insights into the putative precursor genes for the biosynthesis of cannabinoid-like compounds and a novel transcriptomic resource for R. marginata. The large-scale transcriptomic resource generated in this study would further serve as a reference transcriptome to explore the Radulaceae family.
Gerchen, Jörn F.; Reichert, Samuel J.; Röhr, Johannes T.; Dieterich, Christoph; Kloas, Werner
2016-01-01
Large genome sizes, including immense repetitive and non-coding fractions, still present challenges for capacity, bioinformatics and thus affordability of whole genome sequencing in most amphibians. Here, we test the performance of a single transcriptome to understand whether it can provide a cost-efficient resource for species with large unknown genomes. Using RNA from six different tissues from a single Palearctic green toad (Bufo viridis) specimen and Hiseq2000, we obtained 22.5 million reads and publish >100,000 unigene sequences. To evaluate efficacy and quality, we first use this data to identify green toad specific candidate genes, known from other vertebrates for their role in sex determination and differentiation. Of a list of 37 genes, the transcriptome yielded 32 (87%), many of which provide the first such data for this non-model anuran species. However, for many of these genes, only fragments could be retrieved. To also allow applications in population genetics, we further used the transcriptome for the targeted development of 21 non-anonymous microsatellites and tested them in genetic families and backcrosses. Eleven markers were specifically developed to be located on the B. viridis sex chromosomes; for eight markers we can indeed demonstrate sex-specific transmission in genetic families. Depending on phylogenetic distance, several markers, which are sex-linked in green toads, show high cross-amplification success across the anuran phylogeny, involving nine systematic anuran families. Our data support the view that single transcriptome sequencing (based on multiple tissues) provides a reliable genomic resource and cost-efficient method for non-model amphibian species with large genome size and, despite limitations, should be considered as long as genome sequencing remains unaffordable for most species. PMID:27232626
Reeksting, Bianca J.; Coetzer, Nanette; Mahomed, Waheed; Engelbrecht, Juanita; van den Berg, Noëlani
2014-01-01
Avocado is a diploid angiosperm containing 24 chromosomes with a genome estimated to be around 920 Mb. It is an important fruit crop worldwide but is susceptible to a root rot caused by the ubiquitous oomycete Phytophthora cinnamomi. Phytophthora root rot (PRR) causes damage to the feeder roots of trees, causing necrosis. This leads to branch-dieback and eventual tree death, resulting in severe losses in production. Control strategies are limited and at present an integrated approach involving the use of phosphite, tolerant rootstocks, and proper nursery management has shown the best results. Disease progression of PRR is accelerated under high soil moisture or flooding conditions. In addition, avocado is highly susceptible to flooding, with even short periods of flooding causing significant losses. Despite the commercial importance of avocado, limited genomic resources are available. Next generation sequencing has provided the means to generate sequence data at a relatively low cost, making this an attractive option for non-model organisms such as avocado. The aims of this study were to generate sequence data for the avocado root transcriptome and identify stress-related genes. Tissue was isolated from avocado infected with P. cinnamomi, avocado exposed to flooding and avocado exposed to a combination of these two stresses. Three separate sequencing runs were performed on the Roche 454 platform and produced approximately 124 Mb of data. This was assembled into 7685 contigs, with 106 448 sequences remaining as singletons. Genes involved in defence pathways such as the salicylic acid and jasmonic acid pathways as well as genes associated with the response to low oxygen caused by flooding, were identified. This is the most comprehensive study of transcripts derived from root tissue of avocado to date and will provide a useful resource for future studies. PMID:24563685
The Physalis peruviana leaf transcriptome: assembly, annotation and gene model prediction
2012-01-01
Background Physalis peruviana, commonly known as Cape gooseberry, is a member of the Solanaceae family that has an increasing popularity due to its nutritional and medicinal values. A broad range of genomic tools is available for other Solanaceae, including tomato and potato. However, limited genomic resources are currently available for Cape gooseberry. Results We report the generation of a total of 652,614 P. peruviana Expressed Sequence Tags (ESTs), using 454 GS FLX Titanium technology. ESTs, with an average length of 371 bp, were obtained from a normalized leaf cDNA library prepared using a Colombian commercial variety. De novo assembly was performed to generate a collection of 24,014 isotigs and 110,921 singletons, with an average length of 1,638 bp and 354 bp, respectively. Functional annotation was performed using NCBI’s BLAST tools and Blast2GO, which identified putative functions for 21,191 assembled sequences, including gene families involved in all the major biological processes and molecular functions as well as defense response and amino acid metabolism pathways. Gene model predictions in P. peruviana were obtained by using the genomes of Solanum lycopersicum (tomato) and Solanum tuberosum (potato). We predict 9,436 P. peruviana sequences with multiple-exon models and conserved intron positions with respect to the potato and tomato genomes. Additionally, to study species diversity we developed 5,971 SSR markers from assembled ESTs. Conclusions We present the first comprehensive analysis of the Physalis peruviana leaf transcriptome, which will provide valuable resources for development of genetic tools in the species. Assembled transcripts with gene models could serve as potential candidates for marker discovery with a variety of applications including: functional diversity, conservation and improvement to increase productivity and fruit quality. P. peruviana was estimated to be phylogenetically branched out before the divergence of five other Solanaceae family members, S. lycopersicum, S. tuberosum, Capsicum spp, S. melongena and Petunia spp. PMID:22533342
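The BLAST-based annotation step described above can be illustrated with a minimal sketch. It assumes hits are stored in the standard 12-column BLAST tabular format, where the E-value is the eleventh column. The abstract does not state the cutoff that was used, so the `max_evalue=1e-5` default, the function name `annotated_queries`, and the two example rows are hypothetical.

```python
import csv
import io

def annotated_queries(blast_tabular, max_evalue=1e-5):
    """Return the set of query IDs with at least one hit below max_evalue.

    blast_tabular is any iterable of lines in 12-column BLAST tabular format
    (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend,
    sstart, send, evalue, bitscore). The cutoff is an assumed default.
    """
    annotated = set()
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue < max_evalue:
            annotated.add(query_id)
    return annotated

# Two made-up hits: one strong match, one that fails the cutoff.
example = io.StringIO(
    "isotig1\tsp|P1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t200\n"
    "isotig2\tsp|P2\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t30\n"
)
hits = annotated_queries(example)
print(sorted(hits), f"{100 * len(hits) / 2:.1f}% of queries annotated")
```

Dividing the number of annotated queries by the number of assembled sequences gives the kind of annotation rate that transcriptome abstracts typically report.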
NASA Astrophysics Data System (ADS)
Yang, Wei; Chen, Huapu; Cui, Xuefan; Zhang, Kewei; Jiang, Dongneng; Deng, Siping; Zhu, Chunhua; Li, Guangli
2017-09-01
Spotted scat (Scatophagus argus) is an economically important farmed fish, particularly in East and Southeast Asia. Because there has been little research on reproductive development and regulation in this species, the lack of a mature artificial reproduction technology remains a barrier for the sustainable development of the aquaculture industry. More genetic and genomic background knowledge is urgently needed for an in-depth understanding of the molecular mechanism of reproductive process and identification of functional genes related to sexual differentiation, gonad maturation and gametogenesis. For these reasons, we performed transcriptomic analysis on spotted scat using a multiple tissue sample mixing strategy. The Illumina RNA sequencing generated 118 510 486 raw reads. After trimming, de novo assembly was performed and yielded 99 888 unigenes with an average length of 905.75 bp. A total of 45 015 unigenes were successfully annotated to the Nr, Swiss-Prot, KOG and KEGG databases. Additionally, 23 783 and 27 183 annotated unigenes were assigned to 56 Gene Ontology (GO) functional groups and 228 KEGG pathways, respectively. Subsequently, 2 474 transcripts associated with reproduction were selected using GO term and KEGG pathway assignments, and a number of reproduction-related genes involved in sex differentiation, gonad development and gametogenesis were identified. Furthermore, 22 279 simple sequence repeat (SSR) loci were discovered and characterized. The comprehensive transcript dataset described here greatly increases the genetic information available for spotted scat and contributes valuable sequence resources for functional gene mining and analysis. Candidate transcripts involved in reproduction would make good starting points for future studies on reproductive mechanisms, and the putative sex differentiation-related genes will be helpful for sex-determining gene identification and sex-specific marker isolation. 
Lastly, the SSRs can serve as marker resources for future research into genetics, marker-assisted selection (MAS) and conservation biology.
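The SSR discovery mentioned above can be sketched with a regular-expression scan for perfect tandem repeats. The abstract names neither the tool nor the thresholds, so the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen only for illustration.

```python
import re

# Hypothetical minimum number of repeats per motif length (not from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{n}) captures one motif; \1{k,} demands at least k more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. ATAT, AAA).
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

# A made-up unigene fragment containing seven AT repeats.
print(find_ssrs("GG" + "AT" * 7 + "CC"))
```

Dedicated SSR finders also report compound and imperfect repeats; this sketch covers only perfect di- to hexanucleotide repeats.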
The Physalis peruviana leaf transcriptome: assembly, annotation and gene model prediction.
Garzón-Martínez, Gina A; Zhu, Z Iris; Landsman, David; Barrero, Luz S; Mariño-Ramírez, Leonardo
2012-04-25
Physalis peruviana, commonly known as Cape gooseberry, is a member of the Solanaceae family that has an increasing popularity due to its nutritional and medicinal values. A broad range of genomic tools is available for other Solanaceae, including tomato and potato. However, limited genomic resources are currently available for Cape gooseberry. We report the generation of a total of 652,614 P. peruviana Expressed Sequence Tags (ESTs), using 454 GS FLX Titanium technology. ESTs, with an average length of 371 bp, were obtained from a normalized leaf cDNA library prepared using a Colombian commercial variety. De novo assembly was performed to generate a collection of 24,014 isotigs and 110,921 singletons, with an average length of 1,638 bp and 354 bp, respectively. Functional annotation was performed using NCBI's BLAST tools and Blast2GO, which identified putative functions for 21,191 assembled sequences, including gene families involved in all the major biological processes and molecular functions as well as defense response and amino acid metabolism pathways. Gene model predictions in P. peruviana were obtained by using the genomes of Solanum lycopersicum (tomato) and Solanum tuberosum (potato). We predict 9,436 P. peruviana sequences with multiple-exon models and conserved intron positions with respect to the potato and tomato genomes. Additionally, to study species diversity we developed 5,971 SSR markers from assembled ESTs. We present the first comprehensive analysis of the Physalis peruviana leaf transcriptome, which will provide valuable resources for development of genetic tools in the species. Assembled transcripts with gene models could serve as potential candidates for marker discovery with a variety of applications including: functional diversity, conservation and improvement to increase productivity and fruit quality. P. peruviana was estimated to be phylogenetically branched out before the divergence of five other Solanaceae family members, S. lycopersicum, S. tuberosum, Capsicum spp, S. melongena and Petunia spp.
USDA-ARS's Scientific Manuscript database
Recently, we established and phenotypically characterized an immortalized porcine olfactory bulb neuroblast cell line, OBGF400 (Uebing-Czipura et al., 2008). To facilitate the future application of these cells in studies of neurological dysfunction and neuronal replacement therapies, a comprehensive...
Sequencing, Annotation and Analysis of the Syrian Hamster (Mesocricetus auratus) Transcriptome
Tchitchek, Nicolas; Safronetz, David; Rasmussen, Angela L.; Martens, Craig; Virtaneva, Kimmo; Porcella, Stephen F.; Feldmann, Heinz
2014-01-01
Background The Syrian hamster (golden hamster, Mesocricetus auratus) is gaining importance as a new experimental animal model for multiple pathogens, including emerging zoonotic diseases such as Ebola. Nevertheless there are currently no publicly available transcriptome reference sequences or genome for this species. Results A cDNA library derived from mRNA and snRNA isolated and pooled from the brains, lungs, spleens, kidneys, livers, and hearts of three adult female Syrian hamsters was sequenced. Sequence reads were assembled into 62,482 contigs and 111,796 reads remained unassembled (singletons). This combined contig/singleton dataset, designated as the Syrian hamster transcriptome, represents a total of 60,117,204 nucleotides. Our Mesocricetus auratus Syrian hamster transcriptome mapped to 11,648 mouse transcripts representing 9,562 distinct genes, and mapped to a similar number of transcripts and genes in the rat. We identified 214 quasi-complete transcripts based on mouse annotations. Canonical pathways involved in a broad spectrum of fundamental biological processes were significantly represented in the library. The Syrian hamster transcriptome was aligned to the current release of the Chinese hamster ovary (CHO) cell transcriptome and genome to improve the genomic annotation of this species. Finally, our Syrian hamster transcriptome was aligned against 14 other rodents, primate and laurasiatheria species to gain insights about the genetic relatedness and placement of this species. Conclusions This Syrian hamster transcriptome dataset significantly improves our knowledge of the Syrian hamster's transcriptome, especially towards its future use in infectious disease research. Moreover, this library is an important resource for the wider scientific community to help improve genome annotation of the Syrian hamster and other closely related species. 
Furthermore, these data provide the basis for development of expression microarrays that can be used in functional genomics studies. PMID:25398096
USDA-ARS's Scientific Manuscript database
The development of resources for genomic studies in Mangifera indica (mango) will allow marker-assisted selection and identification of genetically diverse germplasm, greatly aiding mango breeding programs. We report here a first step in developing such resources, our identification of thousands una...
Analysis of the Citrullus colocynthis Transcriptome during Water Deficit Stress
Wang, Zhuoyu; Hu, Hongtao; Goertzen, Leslie R.; McElroy, J. Scott; Dane, Fenny
2014-01-01
Citrullus colocynthis is a very drought tolerant species, closely related to watermelon (C. lanatus var. lanatus), an economically important cucurbit crop. Drought is a threat to plant growth and development, and the discovery of drought inducible genes with various functions is of great importance. We used high throughput mRNA Illumina sequencing technology and bioinformatic strategies to analyze the C. colocynthis leaf transcriptome under drought treatment. Leaf samples at four different time points (0, 24, 36, or 48 hours of withholding water) were used for RNA extraction and Illumina sequencing. qRT-PCR of several drought responsive genes was performed to confirm the accuracy of RNA sequencing. Leaf transcriptome analysis provided the first glimpse of the drought responsive transcriptome of this unique cucurbit species. A total of 5038 full-length cDNAs were detected, with 2545 genes showing significant changes during drought stress. Principal component analysis indicated that drought was the major contributing factor regulating transcriptome changes. Upregulation of many transcription factors, stress signaling factors, detoxification genes, and genes involved in phytohormone signaling and citrulline metabolism occurred under the water deficit conditions. The C. colocynthis transcriptome data highlight the activation of a large set of drought related genes in this species, thus providing a valuable resource for future functional analysis of candidate genes in defense of drought stress. PMID:25118696
Aging-like Changes in the Transcriptome of Irradiated Microglia
Li, Matthew D.; Burns, Terry C.; Kumar, Sunny; Morgan, Alexander A.; Sloan, Steven A.; Palmer, Theo D.
2014-01-01
Whole brain irradiation remains important in the management of brain tumors. Although necessary for improving survival outcomes, cranial irradiation also results in cognitive decline in long-term survivors. A chronic inflammatory state characterized by microglial activation has been implicated in radiation-induced brain injury. We here provide the first comprehensive transcriptional profile of irradiated microglia. Fluorescence-activated cell sorting (FACS) was used to isolate CD11b+ microglia from the hippocampi of C57BL/6 and Balb/c mice 1 month after 10Gy cranial irradiation. Affymetrix gene expression profiles were evaluated using linear modeling and rank product analyses. One month after irradiation, a conserved irradiation signature across strains was identified, comprising 448 and 85 differentially up- and down-regulated genes, respectively. Gene set enrichment analysis (GSEA) demonstrated enrichment for inflammation, including M1 macrophage-associated genes, but also an unexpected enrichment for extracellular matrix and blood coagulation-related gene sets, in contrast to previously described microglial states. Weighted gene co-expression network analysis (WGCNA) confirmed these findings and further revealed alterations in mitochondrial function. The RNA-seq transcriptome of microglia 24h post-radiation proved similar to the 1-month transcriptome, but additionally featured alterations in apoptotic and lysosomal gene expression. Re-analysis of published aging mouse microglia transcriptome data demonstrated striking similarity to the 1 month irradiated microglia transcriptome, suggesting that shared mechanisms may underlie aging and chronic irradiation-induced cognitive decline. PMID:25690519
Shen, Di; Wang, Haiping; Wu, Qingjun; Lu, Peng; Qiu, Yang; Song, Jiangping; Zhang, Youjun; Li, Xixiang
2013-01-01
Background The diamondback moth (DBM, Plutella xylostella) is a crucifer-specific pest that causes significant crop losses worldwide. Barbarea vulgaris (Brassicaceae) can resist DBM and other herbivorous insects by producing feeding-deterrent triterpenoid saponins. Plant breeders have long aimed to transfer this insect resistance to other crops. However, a lack of knowledge on the biosynthetic pathways and regulatory networks of these insecticidal saponins has hindered their practical application. A pyrosequencing-based transcriptome analysis of B. vulgaris during DBM larval feeding was performed to identify genes and gene networks responsible for saponin biosynthesis and its regulation at the genome level. Principal Findings Approximately 1.22, 1.19, 1.16, 1.23, 1.16, 1.20, and 2.39 giga base pairs of clean nucleotides were generated from B. vulgaris transcriptomes sampled 1, 4, 8, 12, 24, and 48 h after onset of P. xylostella feeding and from non-inoculated controls, respectively. De novo assembly using all data of the seven transcriptomes generated 39,531 unigenes. A total of 37,780 (95.57%) unigenes were annotated, 14,399 of which were assigned to one or more gene ontology terms and 19,620 of which were assigned to 126 known pathways. Expression profiles revealed 2,016–4,685 up-regulated and 557–5,188 down-regulated transcripts. Secondary metabolic pathways, such as those of terpenoids, glucosinolates, and phenylpropanoids, and their related regulators were elevated. Candidate genes for the triterpene saponin pathway were found in the transcriptome. Orthological analysis of the transcriptome with four other crucifer transcriptomes identified 592 B. vulgaris-specific gene families with a P-value cutoff of 1e−5. Conclusion This study presents the first comprehensive transcriptome analysis of B. vulgaris subjected to a series of DBM feedings. 
The biosynthetic and regulatory pathways of triterpenoid saponins and other DBM deterrent metabolites in this plant were classified. The results of this study will provide useful data for future investigations on pest-resistance phytochemistry and plant breeding. PMID:23696897
Generation of a foveomacular transcriptome
Bernstein, Steven; Wong, Paul W.
2014-01-01
Purpose Organizing molecular biologic data is a growing challenge since the rate of data accumulation is steadily increasing. Information relevant to a particular biologic query can be difficult to extract from the comprehensive databases currently available. We present a data collection and organization model designed to ameliorate these problems and applied it to generate an expressed sequence tag (EST)–based foveomacular transcriptome. Methods Using Perl, MySQL, EST libraries, screening, and human foveomacular gene expression as a model system, we generated a foveomacular transcriptome database enriched for molecularly relevant data. Results Using foveomacula as a gene expression model tissue, we identified and organized 6,056 genes expressed in that tissue. Of those identified genes, 3,480 had not been previously described as expressed in the foveomacula. Internal experimental controls as well as comparison of our data set to published data sets suggest we do not yet have a complete description of the foveomacula transcriptome. Conclusions We present an organizational method designed to amplify the utility of data pertinent to a specific research interest. Our method is generic enough to be applicable to a variety of conditions yet focused enough to allow for specialized study. PMID:24991187
Rodriguez-Alonso, Gustavo; Matvienko, Marta; López-Valle, Mayra L; Lázaro-Mixteco, Pedro E; Napsucialy-Mendivil, Selene; Dubrovsky, Joseph G; Shishkova, Svetlana
2018-06-04
Many Cactaceae species exhibit determinate growth of the primary root as a consequence of root apical meristem (RAM) exhaustion. The genetic regulation of this growth pattern is unknown. Here, we de novo assembled and annotated the root apex transcriptome of the Pachycereus pringlei primary root at three developmental stages, with active or exhausted RAM. The assembled transcriptome is robust and comprehensive, and was used to infer a transcriptional regulatory network of the primary root apex. Putative orthologues of Arabidopsis regulators of RAM maintenance, as well as putative lineage-specific transcripts were identified. The transcriptome revealed putative orthologues of most proteins involved in housekeeping processes, hormone signalling, and metabolic pathways. Our results suggest that specific transcriptional programs operate in the root apex at specific developmental time points. Moreover, the transcriptional state of the P. pringlei root apex as the RAM becomes exhausted is comparable to the transcriptional state of cells from the meristematic, elongation, and differentiation zones of Arabidopsis roots along the root axis. We suggest that the transcriptional program underlying the drought stress response is induced during Cactaceae root development, and that lineage-specific transcripts could contribute to RAM exhaustion in Cactaceae.
Ochsner, Scott A.; Tsimelzon, Anna; Dong, Jianrong; Coarfa, Cristian
2016-01-01
The pregnane X receptor (PXR/NR1I2) and constitutive androstane receptor (CAR/NR1I3) members of the nuclear receptor (NR) superfamily of ligand-regulated transcription factors are well-characterized mediators of xenobiotic and endocrine-disrupting chemical signaling. The Nuclear Receptor Signaling Atlas maintains a growing library of transcriptomic datasets involving perturbations of NR signaling pathways, many of which involve perturbations relevant to PXR and CAR xenobiotic signaling. Here, we generated a reference transcriptome based on the frequency of differential expression of genes across 159 experiments compiled from 22 datasets involving perturbations of CAR and PXR signaling pathways. In addition to the anticipated overrepresentation in the reference transcriptome of genes encoding components of the xenobiotic stress response, the ranking of genes involved in carbohydrate metabolism and gonadotropin action sheds mechanistic light on the suspected role of xenobiotics in metabolic syndrome and reproductive disorders. Gene Set Enrichment Analysis showed that although acetaminophen, chlorpromazine, and phenobarbital impacted many similar gene sets, differences in direction of regulation were evident in a variety of processes. Strikingly, gene sets representing genes linked to Parkinson's, Huntington's, and Alzheimer's diseases were enriched in all 3 transcriptomes. The reference xenobiotic transcriptome will be supplemented with additional future datasets to provide the community with a continually updated reference transcriptomic dataset for CAR- and PXR-mediated xenobiotic signaling. Our study demonstrates how aggregating and annotating transcriptomic datasets, and making them available for routine data mining, facilitates research into the mechanisms by which xenobiotics and endocrine-disrupting chemicals subvert conventional NR signaling modalities. PMID:27409825
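The idea of ranking genes by how often they are differentially expressed across experiments can be sketched as follows. This is only an illustration of the general approach, not the authors' pipeline: the function name `rank_by_de_frequency`, the tie-breaking rule, and the three toy experiments are assumptions, and the gene symbols serve purely as placeholder labels.

```python
from collections import Counter

def rank_by_de_frequency(experiments):
    """Rank genes by the fraction of experiments in which they were called
    differentially expressed (DE).

    experiments: a list of iterables, one per experiment, each holding the
    gene symbols called DE in that experiment.
    Returns (gene, fraction) tuples, highest fraction first; ties are broken
    alphabetically (an arbitrary choice made for reproducibility).
    """
    counts = Counter()
    for de_genes in experiments:
        counts.update(set(de_genes))  # count each gene once per experiment
    total = len(experiments)
    return sorted(((gene, n / total) for gene, n in counts.items()),
                  key=lambda item: (-item[1], item[0]))

# Toy input: three hypothetical experiments with made-up DE calls.
toy = [{"CYP3A4", "CYP2B6"}, {"CYP3A4"}, {"CYP2B6", "CYP3A4", "G6PC"}]
for gene, fraction in rank_by_de_frequency(toy):
    print(gene, round(fraction, 2))
```

A real reference transcriptome would also need to handle gene identifier mapping across platforms and species, which this sketch leaves out.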
Ochsner, Scott A; Tsimelzon, Anna; Dong, Jianrong; Coarfa, Cristian; McKenna, Neil J
2016-08-01
The pregnane X receptor (PXR/NR1I2) and constitutive androstane receptor (CAR/NR1I3) members of the nuclear receptor (NR) superfamily of ligand-regulated transcription factors are well-characterized mediators of xenobiotic and endocrine-disrupting chemical signaling. The Nuclear Receptor Signaling Atlas maintains a growing library of transcriptomic datasets involving perturbations of NR signaling pathways, many of which involve perturbations relevant to PXR and CAR xenobiotic signaling. Here, we generated a reference transcriptome based on the frequency of differential expression of genes across 159 experiments compiled from 22 datasets involving perturbations of CAR and PXR signaling pathways. In addition to the anticipated overrepresentation in the reference transcriptome of genes encoding components of the xenobiotic stress response, the ranking of genes involved in carbohydrate metabolism and gonadotropin action sheds mechanistic light on the suspected role of xenobiotics in metabolic syndrome and reproductive disorders. Gene Set Enrichment Analysis showed that although acetaminophen, chlorpromazine, and phenobarbital impacted many similar gene sets, differences in direction of regulation were evident in a variety of processes. Strikingly, gene sets representing genes linked to Parkinson's, Huntington's, and Alzheimer's diseases were enriched in all 3 transcriptomes. The reference xenobiotic transcriptome will be supplemented with additional future datasets to provide the community with a continually updated reference transcriptomic dataset for CAR- and PXR-mediated xenobiotic signaling. Our study demonstrates how aggregating and annotating transcriptomic datasets, and making them available for routine data mining, facilitates research into the mechanisms by which xenobiotics and endocrine-disrupting chemicals subvert conventional NR signaling modalities.
Florio, Marta; Heide, Michael; Pinson, Anneline; Brandl, Holger; Albert, Mareike; Winkler, Sylke; Wimberger, Pauline; Huttner, Wieland B; Hiller, Michael
2018-03-21
Understanding the molecular basis that underlies the expansion of the neocortex during primate, and notably human, evolution requires the identification of genes that are particularly active in the neural stem and progenitor cells of the developing neocortex. Here, we have used existing transcriptome datasets to carry out a comprehensive screen for protein-coding genes preferentially expressed in progenitors of fetal human neocortex. We show that 15 human-specific genes exhibit such expression, and many of them evolved distinct neural progenitor cell-type expression profiles and levels compared to their ancestral paralogs. Functional studies on one such gene, NOTCH2NL, demonstrate its ability to promote basal progenitor proliferation in mice. An additional 35 human genes with progenitor-enriched expression are shown to have orthologs only in primates. Our study provides a resource of genes that are promising candidates to exert specific, and novel, roles in neocortical development during primate, and notably human, evolution. © 2018, Florio et al.
Distinct polyadenylation landscapes of diverse human tissues revealed by a modified PA-seq strategy
2013-01-01
Background Polyadenylation is a key regulatory step in eukaryotic gene expression and one of the major contributors of transcriptome diversity. Aberrant polyadenylation often associates with expression defects and leads to human diseases. Results To better understand global polyadenylation regulation, we have developed a polyadenylation sequencing (PA-seq) approach. By profiling polyadenylation events in 13 human tissues, we found that alternative cleavage and polyadenylation (APA) is prevalent in both protein-coding and noncoding genes. In addition, APA usage, similar to gene expression profiling, exhibits tissue-specific signatures and is sufficient for determining tissue origin. A 3′ untranslated region shortening index (USI) was further developed for genes with tandem APA sites. Strikingly, the results showed that different tissues exhibit distinct patterns of shortening and/or lengthening of 3′ untranslated regions, suggesting the intimate involvement of APA in establishing tissue or cell identity. Conclusions This study provides a comprehensive resource to uncover regulated polyadenylation events in human tissues and to characterize the underlying regulatory mechanism. PMID:24025092
Pinson, Anneline; Brandl, Holger; Albert, Mareike; Winkler, Sylke; Wimberger, Pauline
2018-01-01
Understanding the molecular basis that underlies the expansion of the neocortex during primate, and notably human, evolution requires the identification of genes that are particularly active in the neural stem and progenitor cells of the developing neocortex. Here, we have used existing transcriptome datasets to carry out a comprehensive screen for protein-coding genes preferentially expressed in progenitors of fetal human neocortex. We show that 15 human-specific genes exhibit such expression, and many of them evolved distinct neural progenitor cell-type expression profiles and levels compared to their ancestral paralogs. Functional studies on one such gene, NOTCH2NL, demonstrate its ability to promote basal progenitor proliferation in mice. An additional 35 human genes with progenitor-enriched expression are shown to have orthologs only in primates. Our study provides a resource of genes that are promising candidates to exert specific, and novel, roles in neocortical development during primate, and notably human, evolution. PMID:29561261
Mammary molecular portraits reveal lineage-specific features and progenitor cell vulnerabilities.
Casey, Alison E; Sinha, Ankit; Singhania, Rajat; Livingstone, Julie; Waterhouse, Paul; Tharmapalan, Pirashaanthy; Cruickshank, Jennifer; Shehata, Mona; Drysdale, Erik; Fang, Hui; Kim, Hyeyeon; Isserlin, Ruth; Bailey, Swneke; Medina, Tiago; Deblois, Genevieve; Shiah, Yu-Jia; Barsyte-Lovejoy, Dalia; Hofer, Stefan; Bader, Gary; Lupien, Mathieu; Arrowsmith, Cheryl; Knapp, Stefan; De Carvalho, Daniel; Berman, Hal; Boutros, Paul C; Kislinger, Thomas; Khokha, Rama
2018-06-19
The mammary epithelium depends on specific lineages and their stem and progenitor function to accommodate hormone-triggered physiological demands in the adult female. Perturbations of these lineages underpin breast cancer risk, yet our understanding of normal mammary cell composition is incomplete. Here, we build a multimodal resource for the adult gland through comprehensive profiling of primary cell epigenomes, transcriptomes, and proteomes. We define systems-level relationships between chromatin-DNA-RNA-protein states, identify lineage-specific DNA methylation of transcription factor binding sites, and pinpoint proteins underlying progesterone responsiveness. Comparative proteomics of estrogen and progesterone receptor-positive and -negative cell populations, extensive target validation, and drug testing lead to discovery of stem and progenitor cell vulnerabilities. Top epigenetic drugs exert cytostatic effects; prevent adult mammary cell expansion, clonogenicity, and mammopoiesis; and deplete stem cell frequency. Select drugs also abrogate human breast progenitor cell activity in normal and high-risk patient samples. This integrative computational and functional study provides fundamental insight into mammary lineage and stem cell biology. © 2018 Casey et al.
The developmental proteome of Drosophila melanogaster
Casas-Vila, Nuria; Bluhm, Alina; Sayols, Sergi; Dinges, Nadja; Dejung, Mario; Altenhein, Tina; Kappei, Dennis; Altenhein, Benjamin; Roignant, Jean-Yves; Butter, Falk
2017-01-01
Drosophila melanogaster is a widely used genetic model organism in developmental biology. While this model organism has been intensively studied at the RNA level, a comprehensive proteomic study covering the complete life cycle is still missing. Here, we apply label-free quantitative proteomics to explore proteome remodeling across Drosophila’s life cycle, resulting in 7952 proteins, and provide a high temporal-resolved embryogenesis proteome of 5458 proteins. Our proteome data enabled us to monitor isoform-specific expression of 34 genes during development, to identify the pseudogene Cyp9f3Ψ as a protein-coding gene, and to obtain evidence of 268 small proteins. Moreover, the comparison with available transcriptomic data uncovered examples of poor correlation between mRNA and protein, underscoring the importance of proteomics to study developmental progression. Data integration of our embryogenesis proteome with tissue-specific data revealed spatial and temporal information for further functional studies of yet uncharacterized proteins. Overall, our high resolution proteomes provide a powerful resource and can be explored in detail in our interactive web interface. PMID:28381612
Comparative de novo transcriptome analysis of male and female Sea buckthorn.
Bansal, Ankush; Salaria, Mehul; Sharma, Tashil; Stobdan, Tsering; Kant, Anil
2018-02-01
Sea buckthorn is a dioecious medicinal plant found at high altitude. The plant has both male and female reproductive organs in separate individuals. In this article, whole transcriptome de novo assemblies of male and female flower bud samples were carried out using the Illumina NextSeq 500 platform to determine the role of the genes involved in sex determination. Moreover, genes with differential expression in male and female transcriptomes were identified to understand the underlying sex determination mechanism. The current study showed 63,904 and 62,272 coding sequences (CDS) in female and male transcriptome data sets, respectively. 16,831 common CDS were screened out from both transcriptomes, out of which 625 were upregulated and 491 were found to be downregulated. To understand the potential regulatory roles of differentially expressed genes in metabolic networks and biosynthetic pathways, KEGG mapping, gene ontology, and co-expression network analysis were performed. Comparison with the Flowering Interactive Database (FLOR-ID) resulted in eight differentially expressed genes, viz. CHD3-type chromatin-remodeling factor PICKLE (PKL), phytochrome-associated serine/threonine-protein phosphatase (FYPP), protein TOPLESS (TPL), sensitive to freezing 6 (SFR6), lysine-specific histone demethylase 1 homolog 1 (LDL1), pre-mRNA-processing-splicing factor 8A (PRP8A), sucrose synthase 4 (SUS4), and ubiquitin carboxyl-terminal hydrolase 12 (UBP12), known to be broadly involved in flowering, photoperiodism, embryo development, and cold response pathways. Male and female flower bud transcriptome data of Sea buckthorn may provide comprehensive information at genomic level for the identification of genetic regulation involved in sex determination.
2011-01-01
Background Understanding polyphenism, the ability of a single genome to express multiple morphologically and behaviourally distinct phenotypes, is an important goal for evolutionary and developmental biology. Polyphenism has been key to the evolution of the Hymenoptera, and particularly the social Hymenoptera where the genome of a single species regulates distinct larval stages, sexual dimorphism and physical castes within the female sex. Transcriptomic analyses of social Hymenoptera will therefore provide unique insights into how changes in gene expression underlie such complexity. Here we describe gene expression in individual specimens of the pre-adult stages, sexes and castes of the key pollinator, the buff-tailed bumblebee Bombus terrestris. Results cDNA was prepared from mRNA from five life cycle stages (one larva, one pupa, one male, one gyne and two workers) and a total of 1,610,742 expressed sequence tags (ESTs) were generated using Roche 454 technology, substantially increasing the sequence data available for this important species. Overlapping ESTs were assembled into 36,354 B. terrestris putative transcripts, and functionally annotated. A preliminary assessment of differences in gene expression across non-replicated specimens from the pre-adult stages, castes and sexes was performed using R-STAT analysis. Individual samples from the life cycle stages of the bumblebee differed in the expression of a wide array of genes, including genes involved in amino acid storage, metabolism, immunity and olfaction. Conclusions Detailed analyses of immune and olfaction gene expression across phenotypes demonstrated how transcriptomic analyses can inform our understanding of processes central to the biology of B. terrestris and the social Hymenoptera in general. 
For example, examination of immunity-related genes identified high conservation of important immunity pathway components across individual specimens from the life cycle stages while olfactory-related genes exhibited differential expression with a wider repertoire of gene expression within adults, especially sexuals, in comparison to immature stages. As there is an absence of replication across the samples, the results of this study are preliminary but provide a number of candidate genes which may be related to distinct phenotypic stage expression. This comprehensive transcriptome catalogue will provide an important gene discovery resource for directed programmes in ecology, evolution and conservation of a key pollinator. PMID:22185240
Zhan, Chuansong; Li, Xiaohua; Zhao, Zeying; Yang, Tewu; Wang, Xuekui; Luo, Biaobiao; Zhang, Qiyun; Hu, Yanru; Hu, Xuebo
2016-01-01
Background: Anemone flaccida Fr. Shmidt (Ranunculaceae), commonly known as ‘Di Wu’ in China, is a perennial herb with limited distribution. The rhizome of A. flaccida has traditionally been used in China to treat arthritis. Studies have shown that the plant is a rich source of triterpenoid saponins. However, little is known about triterpenoid saponin biosynthesis in A. flaccida. Results: In this study, we conducted tandem transcriptome and proteome profiling of a non-model medicinal plant, A. flaccida. Using Illumina HiSeq 2000 sequencing and the iTRAQ technique, a total of 46,962 high-quality unigenes were obtained with an average sequence length of 1,310 bp, along with 1,473 unique proteins from A. flaccida. Among the A. flaccida transcripts, 36,617 (77.97%) showed significant similarity (E-value < 1e-5) to known proteins in the public database. Of the total 46,962 unigenes, 36,617 open reading frames (ORFs) were predicted. Based on fragments per kilobase per million reads (FPKM) statistics, 14,004 isoforms/unigenes were found to be up-regulated, and 14,090 isoforms/unigenes down-regulated, in the rhizomes compared with the leaves. Based on the bioinformatics analysis, all possible enzymes involved in the triterpenoid saponin biosynthetic pathway of A. flaccida were identified, including those of the cytosolic mevalonate pathway (MVA) and the plastidial methylerythritol pathway (MEP). Additionally, a total of 126 putative cytochrome P450s (CYP450s) and 32 putative UDP glycosyltransferases were selected as candidate triterpenoid saponin modifiers. Four of them were annotated as genes of the CYP716A subfamily, the key enzymes in the oleanane-type triterpenoid saponin biosynthetic pathway. Furthermore, based on RNA-Seq and proteome analysis, as well as quantitative RT-PCR verification, the expression levels of genes and proteins committed to triterpenoid biosynthesis in the leaf versus the rhizome were compared.
Conclusion: A combination of the de novo transcriptome and proteome profiling based on the Illumina HiSeq 2000 sequencing platform and iTRAQ technique was shown to be a powerful method for the discovery of candidate genes, which encoded enzymes that were responsible for the biosynthesis of novel secondary metabolites in a non-model plant. The transcriptome data of our study provides a very important resource for the understanding of the triterpenoid saponins biosynthesis of A. flaccida. PMID:27504115
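The FPKM statistic mentioned in the abstract above normalizes a raw fragment count by both transcript length and sequencing depth. The following is a minimal sketch of the standard definition only; the function name, argument names and the example numbers are illustrative assumptions and are not taken from the study.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments: fragments assigned to one transcript (hypothetical input)
    length_bp: transcript length in base pairs
    total_mapped_fragments: all mapped fragments in the library
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Equivalent to fragments / (length_bp / 1e3) / (total_mapped_fragments / 1e6)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Illustrative values: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
example_value = fpkm(500, 2000, 10_000_000)
```

Because the length and depth terms cancel between tissues only when the same transcript and comparable libraries are used, comparisons such as rhizome versus leaf are usually made per transcript across libraries rather than between transcripts.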
Workflow and web application for annotating NCBI BioProject transcriptome data
Vera Alvarez, Roberto; Medeiros Vidal, Newton; Garzón-Martínez, Gina A.; Barrero, Luz S.; Landsman, David
2017-01-01
Abstract The volume of transcriptome data is growing exponentially due to rapid improvement of experimental technologies. In response, large central resources such as those of the National Center for Biotechnology Information (NCBI) are continually adapting their computational infrastructure to accommodate this large influx of data. New and specialized databases, such as Transcriptome Shotgun Assembly Sequence Database (TSA) and Sequence Read Archive (SRA), have been created to aid the development and expansion of centralized repositories. Although the central resource databases are under continual development, they do not include automatic pipelines to increase annotation of newly deposited data. Therefore, third-party applications are required to achieve that aim. Here, we present an automatic workflow and web application for the annotation of transcriptome data. The workflow creates secondary data such as sequencing reads and BLAST alignments, which are available through the web application. They are based on freely available bioinformatics tools and scripts developed in-house. The interactive web application provides a search engine and several browser utilities. Graphical views of transcript alignments are available through SeqViewer, an embedded tool developed by NCBI for viewing biological sequence data. The web application is tightly integrated with other NCBI web applications and tools to extend the functionality of data processing and interconnectivity. We present a case study for the species Physalis peruviana with data generated from BioProject ID 67621. Database URL: http://www.ncbi.nlm.nih.gov/projects/physalis/ PMID:28605765
Won, Harim I.; Schulze, Thomas T.; Clement, Emalie J.; Watson, Gabrielle F.; Watson, Sean M.; Warner, Rosalie C.; Ramler, Elizabeth A. M.; Witte, Elias J.; Schoenbeck, Mark A.; Rauter, Claudia M.; Davis, Paul H.
2018-01-01
Burying beetles (Nicrophorus spp.) are among the relatively few insects that provide parental care while not belonging to the eusocial insects such as ants or bees. This behavior incurs energy costs as evidenced by immune deficits and shorter life-spans in reproducing beetles. In the absence of an assembled transcriptome, relatively little is known concerning the molecular biology of these beetles. This work details the assembly and analysis of the Nicrophorus orbicollis transcriptome at multiple developmental stages. RNA-Seq reads were obtained by next-generation sequencing and the transcriptome was assembled using the Trinity assembler. Validation of the assembly was performed by functional characterization using Gene Ontology (GO), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. Differential expression analysis highlights developmental stage-specific expression patterns, and immunity-related transcripts are discussed. The data presented provides a valuable molecular resource to aid further investigation into immunocompetence throughout this organism's sexual development. PMID:29707046
The duck genome and transcriptome provide insight into an avian influenza virus reservoir species
Chen, Hualan; Zhang, Yong; Qian, Wubin; Kim, Heebal; Gan, Shangquan; Zhao, Yiqiang; Li, Jianwen; Yi, Kang; Feng, Huapeng; Zhu, Pengyang; Li, Bo; Liu, Qiuyue; Fairley, Suan; Magor, Katharine E; Du, Zhenlin; Hu, Xiaoxiang; Goodman, Laurie; Tafer, Hakim; Vignal, Alain; Lee, Taeheon; Kim, Kyu-Won; Sheng, Zheya; An, Yang; Searle, Steve; Herrero, Javier; Groenen, Martien A M; Crooijmans, Richard P M A; Faraut, Thomas; Cai, Qingle; Webster, Robert G; Aldridge, Jerry R; Warren, Wesley C; Bartschat, Sebastian; Kehr, Stephanie; Marz, Manja; Stadler, Peter F; Smith, Jacqueline; Kraus, Robert H S; Zhao, Yaofeng; Ren, Liming; Fei, Jing; Morisson, Mireille; Kaiser, Pete; Griffin, Darren K; Rao, Man; Pitel, Frederique; Wang, Jun; Li, Ning
2014-01-01
The duck (Anas platyrhynchos) is one of the principal natural hosts of influenza A viruses. We present the duck genome sequence and perform deep transcriptome analyses to investigate immune-related genes. Our data indicate that the duck possesses a contractive immune gene repertoire, as in chicken and zebra finch, and this repertoire has been shaped through lineage-specific duplications. We identify genes that are responsive to influenza A viruses using the lung transcriptomes of control ducks and ones that were infected with either a highly pathogenic (A/duck/Hubei/49/05) or a weakly pathogenic (A/goose/Hubei/65/05) H5N1 virus. Further, we show how the duck’s defense mechanisms against influenza infection have been optimized through the diversification of its β-defensin and butyrophilin-like repertoires. These analyses, in combination with the genomic and transcriptomic data, provide a resource for characterizing the interaction between host and influenza viruses. PMID:23749191
2011-01-01
Background The genus Silene is widely used as a model system for addressing ecological and evolutionary questions in plants, but advances in using the genus as a model system are impeded by the lack of available resources for studying its genome. Massively parallel sequencing of cDNA has recently developed into an efficient method for characterizing the transcriptomes of non-model organisms, generating massive amounts of data that enable the study of multiple species in a comparative framework. The sequences generated provide an excellent resource for identifying expressed genes, characterizing functional variation and developing molecular markers, thereby laying the foundations for future studies on gene sequence and gene expression divergence. Here, we report the results of a comparative transcriptome sequencing study of eight individuals representing four Silene species and one Dianthus species as an outgroup. All sequences and annotations have been deposited in a newly developed and publicly available database called SiESTa, the Silene EST annotation database. Results A total of 1,041,122 EST reads were generated in two runs on a Roche GS-FLX 454 pyrosequencing platform. EST reads were analyzed separately for all eight individuals sequenced and were assembled into contigs using TGICL. These were annotated with results from BLASTX searches and Gene Ontology (GO) terms, and thousands of single-nucleotide polymorphisms (SNPs) were characterized. Unassembled reads were kept as singletons and together with the contigs contributed to the unigenes characterized in each individual. The high quality of unigenes is evidenced by the proportion (49%) that have significant hits in similarity searches with the A. thaliana proteome. The SiESTa database is accessible at http://www.siesta.ethz.ch.
Conclusion The sequence collections established in the present study provide an important genomic resource for four Silene and one Dianthus species and will help to further develop Silene as a plant model system. The genes characterized will be useful for future research not only in the species included in the present study, but also in related species for which no genomic resources are yet available. Our results demonstrate the efficiency of massively parallel transcriptome sequencing in a comparative framework as an approach for developing genomic resources in diverse groups of non-model organisms. PMID:21791039
Vatanparast, Mohammad; Shetty, Prateek; Chopra, Ratan; Doyle, Jeff J; Sathyanarayana, N; Egan, Ashley N
2016-06-30
Winged bean, Psophocarpus tetragonolobus (L.) DC., is similar to soybean in yield and nutritional value but more viable in tropical conditions. Here, we strengthen genetic resources for this orphan crop by producing a de novo transcriptome assembly and annotation of two Sri Lankan accessions (denoted herein as CPP34 [PI 491423] and CPP37 [PI 639033]), developing simple sequence repeat (SSR) markers, and identifying single nucleotide polymorphisms (SNPs) between geographically separated genotypes. A combined assembly based on 804,757 reads from two accessions produced 16,115 contigs with an N50 of 889 bp, over 90% of which have significant sequence similarity to other legumes. Combining contigs with singletons produced 97,241 transcripts. We identified 12,956 SSRs, including 2,594 repeats for which primers were designed, as well as 5,190 high-confidence SNPs between Sri Lankan and Nigerian genotypes. The transcriptomic data sets generated here provide new resources for gene discovery and marker development in this orphan crop, and will be vital for future plant breeding efforts. We also analyzed the soybean trypsin inhibitor (STI) gene family, important plant defense genes, in the context of related legumes and found evidence for radiation of the Kunitz trypsin inhibitor (KTI) gene family within winged bean.
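The N50 reported for the winged bean assembly is a length-weighted summary of contig sizes: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases. A minimal sketch of that standard definition follows; the function name and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2             # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # This contig pushes the cumulative length past the halfway mark.
            return length


# Toy example (hypothetical lengths): total is 32 bp, so half is 16 bp,
# which is reached after the two 8 bp contigs.
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

N50 describes contiguity only; it says nothing about correctness or completeness, which is why assemblies are usually also checked by similarity to related species, as in the abstract above.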
Hsu, Chi-Lin; Chou, Chih-Hsuan; Huang, Shih-Chuan; Lin, Chia-Yi; Lin, Meng-Ying; Tung, Chun-Che; Lin, Chun-Yen; Lai, Ivan Pochou; Zou, Yan-Fang; Youngson, Neil A; Lin, Shau-Ping; Yang, Chang-Hao; Chen, Shih-Kuo; Gau, Susan Shur-Fen; Huang, Hsien-Sung
2018-03-15
Visual system development is light-experience dependent, which strongly implicates epigenetic mechanisms in light-regulated maturation. Among many epigenetic processes, genomic imprinting is an epigenetic mechanism through which monoallelic gene expression occurs in a parent-of-origin-specific manner. It is unknown if genomic imprinting contributes to visual system development. We profiled the transcriptome and imprintome during critical periods of mouse visual system development under normal- and dark-rearing conditions using B6/CAST F1 hybrid mice. We identified experience-regulated, isoform-specific and brain-region-specific imprinted genes. We also found imprinted microRNAs were predominantly clustered into the Dlk1-Dio3 imprinted locus with light experience affecting some imprinted miRNA expression. Our findings provide the first comprehensive analysis of light-experience regulation of the transcriptome and imprintome during critical periods of visual system development. Our results may contribute to therapeutic strategies for visual impairments and circadian rhythm disorders resulting from a dysfunctional imprintome.
Single-cell transcriptome of early embryos and cultured embryonic stem cells of cynomolgus monkeys
Nakamura, Tomonori; Yabuta, Yukihiro; Okamoto, Ikuhiro; Sasaki, Kotaro; Iwatani, Chizuru; Tsuchiya, Hideaki; Saitou, Mitinori
2017-01-01
In mammals, the development of pluripotency and specification of primordial germ cells (PGCs) have been studied predominantly using mice as a model organism. However, divergences among mammalian species for such processes have begun to be recognized. Between humans and mice, pre-implantation development appears relatively similar, but the manner and morphology of post-implantation development are significantly different. Nevertheless, the embryogenesis just after implantation in primates, including the specification of PGCs, has been unexplored due to the difficulties in analyzing the embryos at relevant developmental stages. Here, we present a comprehensive single-cell transcriptome dataset of pre- and early post-implantation embryo cells, PGCs and embryonic stem cells (ESCs) of cynomolgus monkeys as a model of higher primates. The identity of each transcriptome was also validated rigorously by other means, such as immunofluorescence analysis. The information reported here will serve as a foundation for our understanding of a wide range of processes in the developmental biology of primates, including humans. PMID:28649393
Isoform Sequencing and State-of-Art Applications for Unravelling Complexity of Plant Transcriptomes
An, Dong; Li, Changsheng; Humbeck, Klaus
2018-01-01
Single-molecule real-time (SMRT) sequencing developed by PacBio, also called third-generation sequencing (TGS), offers longer reads than the second-generation sequencing (SGS). Given its ability to obtain full-length transcripts without assembly, isoform sequencing (Iso-Seq) of transcriptomes by PacBio is advantageous for genome annotation, identification of novel genes and isoforms, as well as the discovery of long non-coding RNA (lncRNA). In addition, Iso-Seq gives access to the direct detection of alternative splicing, alternative polyadenylation (APA), gene fusion, and DNA modifications. Such applications of Iso-Seq facilitate the understanding of gene structure, post-transcriptional regulatory networks, and subsequently proteomic diversity. In this review, we summarize its applications in plant transcriptome study, specifically pointing out challenges associated with each step in the experimental design and highlight the development of bioinformatic pipelines. We aim to provide the community with an integrative overview and a comprehensive guidance to Iso-Seq, and thus to promote its applications in plant research. PMID:29346292
EuPathDB: the eukaryotic pathogen genomics database resource
Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie
2017-01-01
The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
Rangel, Luiz Thibério; Novaes, Jeniffer; Durham, Alan M.; Madeira, Alda Maria B. N.; Gruber, Arthur
2013-01-01
Parasites of the genus Eimeria infect a wide range of vertebrate hosts, including chickens. We have recently reported a comparative analysis of the transcriptomes of Eimeria acervulina, Eimeria maxima and Eimeria tenella, integrating ORESTES data produced by our group and publicly available Expressed Sequence Tags (ESTs). All cDNA reads have been assembled, and the reconstructed transcripts have been submitted to a comprehensive functional annotation pipeline. Additional studies included orthology assignment across apicomplexan parasites and clustering analyses of gene expression profiles among different developmental stages of the parasites. To make all this body of information publicly available, we constructed the Eimeria Transcript Database (EimeriaTDB), a web repository that provides access to sequence data, annotation and comparative analyses. Here, we describe the web interface, available sequence data sets and query tools implemented on the site. The main goal of this work is to offer a public repository of sequence and functional annotation data of reconstructed transcripts of parasites of the genus Eimeria. We believe that EimeriaTDB will represent a valuable and complementary resource for the Eimeria scientific community and for those researchers interested in comparative genomics of apicomplexan parasites. Database URL: http://www.coccidia.icb.usp.br/eimeriatdb/ PMID:23411718
SZGR 2.0: a one-stop shop of schizophrenia candidate genes
Jia, Peilin; Han, Guangchun; Zhao, Junfei; Lu, Pinyi; Zhao, Zhongming
2017-01-01
SZGR 2.0 is a comprehensive resource of candidate variants and genes for schizophrenia, covering genetic, epigenetic, transcriptomic, translational and many other types of evidence. By systematic review and curation of multiple lines of evidence, we included almost all variants and genes that have ever been reported to be associated with schizophrenia. In particular, we collected ∼4200 common variants reported in genome-wide association studies, ∼1000 de novo mutations discovered by large-scale sequencing of family samples, 215 genes spanning rare and replication copy number variations, 99 genes overlapping with linkage regions, 240 differentially expressed genes, 4651 differentially methylated genes and 49 genes as antipsychotic drug targets. To facilitate interpretation, we included various functional annotation data, especially brain eQTL, methylation QTL, brain expression featured in deep categorization of brain areas and developmental stages and brain-specific promoter and enhancer annotations. Furthermore, we conducted cross-study, cross-data type and integrative analyses of the multidimensional data deposited in SZGR 2.0, and made the data and results available through a user-friendly interface. In summary, SZGR 2.0 provides a one-stop shop of schizophrenia variants and genes and their function and regulation, providing an important resource in the schizophrenia and other mental disease community. SZGR 2.0 is available at https://bioinfo.uth.edu/SZGR/. PMID:27733502
USDA-ARS's Scientific Manuscript database
A comprehensive transcriptome survey, or “Gene Atlas,” provides information essential for a complete understanding of the genomic biology of an organism. Using a digital gene expression approach, we developed a Gene Atlas of RNA abundance in 92 adult, juvenile and fetal cattle tissues. The samples...
USDA-ARS's Scientific Manuscript database
Background A comprehensive transcriptome survey, or gene atlas, provides information essential for a complete understanding of the genomic biology of an organism. We present an atlas of RNA abundance for 92 adult, juvenile and fetal cattle tissues and three cattle cell lines. Results The Bovine Gene...
Hughes, Lily C; Ortí, Guillermo; Huang, Yu; Sun, Ying; Baldwin, Carole C; Thompson, Andrew W; Arcila, Dahiana; Betancur-R, Ricardo; Li, Chenhong; Becker, Leandro; Bellora, Nicolás; Zhao, Xiaomeng; Li, Xiaofeng; Wang, Min; Fang, Chao; Xie, Bing; Zhou, Zhuocheng; Huang, Hai; Chen, Songlin; Venkatesh, Byrappa; Shi, Qiong
2018-05-14
Our understanding of phylogenetic relationships among bony fishes has been transformed by analysis of a small number of genes, but uncertainty remains around critical nodes. Genome-scale inferences so far have sampled a limited number of taxa and genes. Here we leveraged 144 genomes and 159 transcriptomes to investigate fish evolution with an unparalleled scale of data: >0.5 Mb from 1,105 orthologous exon sequences from 303 species, representing 66 out of 72 ray-finned fish orders. We apply phylogenetic tests designed to trace the effect of whole-genome duplication events on gene trees and find paralogy-free loci using a bioinformatics approach. Genome-wide data support the structure of the fish phylogeny, and hypothesis-testing procedures appropriate for phylogenomic datasets using explicit gene genealogy interrogation settle some long-standing uncertainties, such as the branching order at the base of the teleosts and among early euteleosts, and the sister lineage to the acanthomorph and percomorph radiations. Comprehensive fossil calibrations date the origin of all major fish lineages before the end of the Cretaceous.
Tang, Qin; Iyer, Sowmya; Lobbardi, Riadh; Moore, John C; Chen, Huidong; Lareau, Caleb; Hebert, Christine; Shaw, McKenzie L; Neftel, Cyril; Suva, Mario L; Ceol, Craig J; Bernards, Andre; Aryee, Martin; Pinello, Luca; Drummond, Iain A; Langenau, David M
2017-10-02
Recent advances in single-cell, transcriptomic profiling have provided unprecedented access to investigate cell heterogeneity during tissue and organ development. In this study, we used massively parallel, single-cell RNA sequencing to define cell heterogeneity within the zebrafish kidney marrow, constructing a comprehensive molecular atlas of definitive hematopoiesis and functionally distinct renal cells found in adult zebrafish. Because our method analyzed blood and kidney cells in an unbiased manner, our approach was useful in characterizing immune-cell deficiencies within DNA-protein kinase catalytic subunit (prkdc), interleukin-2 receptor γ a (il2rga), and double-homozygous-mutant fish, identifying blood cell losses in T, B, and natural killer cells within specific genetic mutants. Our analysis also uncovered novel cell types, including two classes of natural killer immune cells, classically defined and erythroid-primed hematopoietic stem and progenitor cells, mucin-secreting kidney cells, and kidney stem/progenitor cells. In total, our work provides the first, comprehensive, single-cell, transcriptomic analysis of kidney and marrow cells in the adult zebrafish. © 2017 Tang et al.
Iyer, Sowmya; Lobbardi, Riadh; Chen, Huidong; Hebert, Christine; Shaw, McKenzie L.; Neftel, Cyril; Suva, Mario L.; Bernards, Andre; Aryee, Martin; Drummond, Iain A.
2017-01-01
Recent advances in single-cell, transcriptomic profiling have provided unprecedented access to investigate cell heterogeneity during tissue and organ development. In this study, we used massively parallel, single-cell RNA sequencing to define cell heterogeneity within the zebrafish kidney marrow, constructing a comprehensive molecular atlas of definitive hematopoiesis and functionally distinct renal cells found in adult zebrafish. Because our method analyzed blood and kidney cells in an unbiased manner, our approach was useful in characterizing immune-cell deficiencies within DNA–protein kinase catalytic subunit (prkdc), interleukin-2 receptor γ a (il2rga), and double-homozygous–mutant fish, identifying blood cell losses in T, B, and natural killer cells within specific genetic mutants. Our analysis also uncovered novel cell types, including two classes of natural killer immune cells, classically defined and erythroid-primed hematopoietic stem and progenitor cells, mucin-secreting kidney cells, and kidney stem/progenitor cells. In total, our work provides the first, comprehensive, single-cell, transcriptomic analysis of kidney and marrow cells in the adult zebrafish. PMID:28878000
2011-01-01
Background Cultivated watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus] is an important agricultural crop worldwide. The fruit of watermelon undergoes distinct stages of development with dramatic changes in its size, color, sweetness, texture and aroma. In order to better understand the genetic and molecular basis of these changes and significantly expand the watermelon transcript catalog, we have selected four critical stages of watermelon fruit development and used Roche/454 next-generation sequencing technology to generate a large expressed sequence tag (EST) dataset and a comprehensive transcriptome profile for watermelon fruit flesh tissues. Results We performed a half Roche/454 GS-FLX run for each of the four watermelon fruit developmental stages (immature white, white-pink flesh, red flesh and over-ripe) and obtained 577,023 high quality ESTs with an average length of 302.8 bp. De novo assembly of these ESTs together with 11,786 watermelon ESTs collected from GenBank produced 75,068 unigenes with a total length of approximately 31.8 Mb. Overall, 54.9% of the unigenes showed significant similarities to known sequences in the GenBank non-redundant (nr) protein database and around two-thirds of them matched proteins of cucumber, the most closely-related species with a sequenced genome. The unigenes were further assigned gene ontology (GO) terms and mapped to biochemical pathways. More than 5,000 SSRs were identified from the EST collection. Furthermore, we carried out digital gene expression analysis of these ESTs and identified 3,023 genes that were differentially expressed during watermelon fruit development and ripening, which provided novel insights into watermelon fruit biology and a comprehensive resource of candidate genes for future functional analysis. We then generated profiles of several interesting metabolites that are important to fruit quality including pigmentation and sweetness.
Integrative analysis of metabolite and digital gene expression profiles helped elucidate the molecular mechanisms governing these important quality-related traits during watermelon fruit development. Conclusion We have generated a large collection of watermelon ESTs, which represents a significant expansion of the current transcript catalog of watermelon and a valuable resource for future studies on the genomics of watermelon and other closely-related species. Digital expression analysis of this EST collection allowed us to identify a large set of genes that were differentially expressed during watermelon fruit development and ripening, which provide a rich source of candidates for future functional analysis and represent a valuable increase in our knowledge base of watermelon fruit biology. PMID:21936920
Guo, Shaogui; Liu, Jingan; Zheng, Yi; Huang, Mingyun; Zhang, Haiying; Gong, Guoyi; He, Hongju; Ren, Yi; Zhong, Silin; Fei, Zhangjun; Xu, Yong
2011-09-21
Cultivated watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus] is an important agricultural crop worldwide. The fruit of watermelon undergoes distinct stages of development with dramatic changes in its size, color, sweetness, texture and aroma. In order to better understand the genetic and molecular basis of these changes and significantly expand the watermelon transcript catalog, we have selected four critical stages of watermelon fruit development and used Roche/454 next-generation sequencing technology to generate a large expressed sequence tag (EST) dataset and a comprehensive transcriptome profile for watermelon fruit flesh tissues. We performed a half Roche/454 GS-FLX run for each of the four watermelon fruit developmental stages (immature white, white-pink flesh, red flesh and over-ripe) and obtained 577,023 high quality ESTs with an average length of 302.8 bp. De novo assembly of these ESTs together with 11,786 watermelon ESTs collected from GenBank produced 75,068 unigenes with a total length of approximately 31.8 Mb. Overall, 54.9% of the unigenes showed significant similarities to known sequences in the GenBank non-redundant (nr) protein database and around two-thirds of them matched proteins of cucumber, the most closely-related species with a sequenced genome. The unigenes were further assigned gene ontology (GO) terms and mapped to biochemical pathways. More than 5,000 SSRs were identified from the EST collection. Furthermore, we carried out digital gene expression analysis of these ESTs and identified 3,023 genes that were differentially expressed during watermelon fruit development and ripening, which provided novel insights into watermelon fruit biology and a comprehensive resource of candidate genes for future functional analysis. We then generated profiles of several interesting metabolites that are important to fruit quality including pigmentation and sweetness.
Integrative analysis of metabolite and digital gene expression profiles helped elucidate the molecular mechanisms governing these important quality-related traits during watermelon fruit development. We have generated a large collection of watermelon ESTs, which represents a significant expansion of the current transcript catalog of watermelon and a valuable resource for future studies on the genomics of watermelon and other closely-related species. Digital expression analysis of this EST collection allowed us to identify a large set of genes that were differentially expressed during watermelon fruit development and ripening, which provide a rich source of candidates for future functional analysis and represent a valuable increase in our knowledge base of watermelon fruit biology.
Characterization of the heart transcriptome of the white shark (Carcharodon carcharias)
2013-01-01
Background The white shark (Carcharodon carcharias) is a globally distributed, apex predator possessing physical, physiological, and behavioral traits that have garnered it significant public attention. In addition to interest in the genetic basis of its form and function, as a representative of the oldest extant jawed vertebrate lineage, white sharks are also of conservation concern due to their small population size and threat from overfishing. Despite this, surprisingly little is known about the biology of white sharks, and genomic resources are unavailable. To address this deficit, we combined Roche-454 and Illumina sequencing technologies to characterize the first transcriptome of any tissue for this species. Results From white shark heart cDNA we generated 665,399 Roche 454 reads (median length 387-bp) that were assembled into 141,626 contigs (mean length 503-bp). We also generated 78,566,588 Illumina reads, which we aligned to the 454 contigs producing 105,014 454/Illumina consensus sequences. To these, we added 3,432 non-singleton 454 contigs. By comparing these sequences to the UniProtKB/Swiss-Prot database we were able to annotate 21,019 translated open reading frames (ORFs) of ≥ 20 amino acids. Of these, 19,277 were additionally assigned Gene Ontology (GO) functional annotations. While acknowledging the limitations of our single tissue transcriptome, Fisher tests showed the white shark transcriptome to be significantly enriched for numerous metabolic GO terms compared to the zebrafish and human transcriptomes, with white shark showing more similarity to human than to zebrafish (i.e. fewer terms were significantly different). We also compared the transcriptome to other available elasmobranch sequences for signatures of positive selection and identified several genes of putative adaptive significance on the white shark lineage.
The white shark transcriptome also contained 8,404 microsatellites (dinucleotide, trinucleotide, or tetranucleotide motifs ≥ five perfect repeats). Detailed characterization of these microsatellites showed that ORFs with trinucleotide repeats, were significantly enriched for transcription regulatory roles and that trinucleotide frequency within ORFs was lower than for a wide range of taxonomic groups including other vertebrates. Conclusion The white shark heart transcriptome represents a valuable resource for future elasmobranch functional and comparative genomic studies, as well as for population and other biological studies vital for effective conservation of this globally vulnerable species. PMID:24112713
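The microsatellite criterion quoted in this abstract (dinucleotide, trinucleotide, or tetranucleotide motifs with at least five perfect repeats) can be illustrated with a short regular-expression scan. This is only a minimal sketch under stated assumptions: the function name `find_microsatellites` and the example sequence are hypothetical, homopolymer runs are excluded by choice, and the authors' actual detection tool and settings are not given in the abstract.

```python
import re


def find_microsatellites(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Assumption: motifs of 2-4 bases; the lazy quantifier prefers the
    shortest motif, so (AT)n is never reported as (ATAT)n.
    """
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as AAAAAAAAAA (not a di/tri/tetra motif).
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical sequence: six AT repeats, five CAG repeats, and a poly-A tail.
example = "GGC" + "AT" * 6 + "CCT" + "CAG" * 5 + "T" + "A" * 12
print(find_microsatellites(example))
```

Only perfect repeats are matched; compound or interrupted microsatellites would need a different approach.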
Characterization of the heart transcriptome of the white shark (Carcharodon carcharias).
Richards, Vincent P; Suzuki, Haruo; Stanhope, Michael J; Shivji, Mahmood S
2013-10-11
The white shark (Carcharodon carcharias) is a globally distributed, apex predator possessing physical, physiological, and behavioral traits that have garnered it significant public attention. In addition to interest in the genetic basis of its form and function, as a representative of the oldest extant jawed vertebrate lineage, white sharks are also of conservation concern due to their small population size and threat from overfishing. Despite this, surprisingly little is known about the biology of white sharks, and genomic resources are unavailable. To address this deficit, we combined Roche-454 and Illumina sequencing technologies to characterize the first transcriptome of any tissue for this species. From white shark heart cDNA we generated 665,399 Roche 454 reads (median length 387-bp) that were assembled into 141,626 contigs (mean length 503-bp). We also generated 78,566,588 Illumina reads, which we aligned to the 454 contigs producing 105,014 454/Illumina consensus sequences. To these, we added 3,432 non-singleton 454 contigs. By comparing these sequences to the UniProtKB/Swiss-Prot database we were able to annotate 21,019 translated open reading frames (ORFs) of ≥ 20 amino acids. Of these, 19,277 were additionally assigned Gene Ontology (GO) functional annotations. While acknowledging the limitations of our single tissue transcriptome, Fisher tests showed the white shark transcriptome to be significantly enriched for numerous metabolic GO terms compared to the zebra fish and human transcriptomes, with white shark showing more similarity to human than to zebra fish (i.e. fewer terms were significantly different). We also compared the transcriptome to other available elasmobranch sequences, for signatures of positive selection and identified several genes of putative adaptive significance on the white shark lineage. The white shark transcriptome also contained 8,404 microsatellites (dinucleotide, trinucleotide, or tetranucleotide motifs ≥ five perfect repeats).
Detailed characterization of these microsatellites showed that ORFs with trinucleotide repeats were significantly enriched for transcription regulatory roles and that trinucleotide frequency within ORFs was lower than for a wide range of taxonomic groups including other vertebrates. The white shark heart transcriptome represents a valuable resource for future elasmobranch functional and comparative genomic studies, as well as for population and other biological studies vital for effective conservation of this globally vulnerable species.
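The Fisher tests mentioned in this abstract compare how often a GO term is annotated in one transcriptome versus another. A minimal sketch of a one-sided Fisher exact test on a 2x2 count table is given below, using only the standard library. The function name `fisher_enrichment_p` and the table layout are assumptions made for illustration; the abstract does not state whether the authors used one-sided or two-sided tests or how they corrected for multiple testing.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher exact p-value, P(X >= a), for the table

        [[a, b],    a = genes with the GO term in species 1
         [c, d]]    b = genes without the term in species 1
                    c, d = the same counts in species 2

    computed from the hypergeometric distribution with fixed margins.
    """
    row1 = a + b           # genes in species 1
    col1 = a + c           # genes carrying the term in either species
    total = a + b + c + d
    denom = comb(total, col1)
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Toy counts (hypothetical): the term is found only in species 1.
print(fisher_enrichment_p(3, 0, 0, 3))
```

When many GO terms are tested, the resulting p-values would normally be adjusted for multiple comparisons before calling any term significant.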
De Novo Transcriptome of the Hemimetabolous German Cockroach (Blattella germanica)
Zhou, Xiaojie; Qian, Kun; Tong, Ying; Zhu, Junwei Jerry; Qiu, Xinghui; Zeng, Xiaopeng
2014-01-01
Background The German cockroach, Blattella germanica, is an important insect pest that transmits various pathogens mechanically and causes severe allergic diseases. This insect has long served as a model system for studies of insect biology, physiology and ecology. However, the lack of genome or transcriptome information heavily hinders our further understanding of the German cockroach in every aspect at a molecular level and on a genome-wide scale. To explore the transcriptome and identify unique sequences of interest, we subjected the B. germanica transcriptome to massively parallel pyrosequencing and generated the first reference transcriptome for B. germanica. Methodology/Principal Findings A total of 1,365,609 raw reads with an average length of 529 bp were generated via pyrosequencing the mixed cDNA library from different life stages of German cockroach including maturing oothecae, nymphs, adult females and males. The raw reads were de novo assembled to 48,800 contigs and 3,961 singletons with high-quality unique sequences. These sequences were annotated and classified functionally in terms of BLAST, GO and KEGG, and the genes putatively coding detoxification enzyme systems, insecticide targets, key components in systematic RNA interference, immunity and chemoreception pathways were identified. A total of 3,601 SSR (Simple Sequence Repeat) loci were also predicted. Conclusions/Significance The whole transcriptome pyrosequencing data from this study provides a usable genetic resource for future identification of potential functional genes involved in various biological processes. PMID:25265537
The utility of transcriptomics in fish conservation.
Connon, Richard E; Jeffries, Ken M; Komoroske, Lisa M; Todgham, Anne E; Fangue, Nann A
2018-01-29
There is growing recognition of the need to understand the mechanisms underlying organismal resilience (i.e. tolerance, acclimatization) to environmental change to support the conservation management of sensitive and economically important species. Here, we discuss how functional genomics can be used in conservation biology to provide a cellular-level understanding of organismal responses to environmental conditions. In particular, the integration of transcriptomics with physiological and ecological research is increasingly playing an important role in identifying functional physiological thresholds predictive of compensatory responses and detrimental outcomes, transforming the way we can study issues in conservation biology. Notably, with technological advances in RNA sequencing, transcriptome-wide approaches can now be applied to species where no prior genomic sequence information is available to develop species-specific tools and investigate sublethal impacts that can contribute to population declines over generations and undermine prospects for long-term conservation success. Here, we examine the use of transcriptomics as a means of determining organismal responses to environmental stressors and use key study examples of conservation concern in fishes to highlight the added value of transcriptome-wide data to the identification of functional response pathways. Finally, we discuss the gaps between the core science and policy frameworks and how thresholds identified through transcriptomic evaluations provide evidence that can be more readily used by resource managers. © 2018. Published by The Company of Biologists Ltd.
Complexity and specificity of the maize (Zea mays L.) root hair transcriptome.
Hey, Stefan; Baldauf, Jutta; Opitz, Nina; Lithio, Andrew; Pasha, Asher; Provart, Nicholas; Nettleton, Dan; Hochholdinger, Frank
2017-04-01
Root hairs are tubular extensions of epidermis cells. Transcriptome profiling demonstrated that the single cell-type root hair transcriptome was less complex than the transcriptome of multiple cell-type primary roots without root hairs. In total, 831 genes were exclusively and 5585 genes were preferentially expressed in root hairs [false discovery rate (FDR) ≤1%]. Among those, the most significantly enriched Gene Ontology (GO) functional terms were related to energy metabolism, highlighting the high energy demand for the development and function of root hairs. Subsequently, the maize homologs for 138 Arabidopsis genes known to be involved in root hair development were identified and their phylogenetic relationship and expression in root hairs were determined. This study indicated that the genetic regulation of root hair development in Arabidopsis and maize is controlled by common genes, but also shows differences which need to be dissected in future genetic experiments. Finally, a maize root view of the eFP browser was implemented including the root hair transcriptome of the present study and several previously published maize root transcriptome data sets. The eFP browser provides color-coded expression levels for these root types and tissues for any gene of interest, thus providing a novel resource to study gene expression and function in maize roots. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Deep insight into the Ganoderma lucidum by comprehensive analysis of its transcriptome.
Yu, Guo-Jun; Wang, Man; Huang, Jie; Yin, Ya-Lin; Chen, Yi-Jie; Jiang, Shuai; Jin, Yan-Xia; Lan, Xian-Qing; Wong, Barry Hon Cheung; Liang, Yi; Sun, Hui
2012-01-01
Ganoderma lucidum is a basidiomycete white rot fungus and is of medicinal importance in China, Japan and other countries in the Asiatic region. To date, much research has been performed in identifying the medicinal ingredients in Ganoderma lucidum. Despite its important therapeutic effects in disease, little is known about Ganoderma lucidum at the genomic level. In order to gain a molecular understanding of this fungus, we utilized Illumina high-throughput technology to sequence and analyze the transcriptome of Ganoderma lucidum. We obtained 6,439,690 and 6,416,670 high-quality reads from the mycelium and fruiting body of Ganoderma lucidum, and these were assembled to form 18,892 and 27,408 unigenes, respectively. A similarity search was performed against the NCBI non-redundant nucleotide database and a customized database composed of five fungal genomes. 11,098 and 8,775 unigenes were matched to the NCBI non-redundant nucleotide database and our customized database, respectively. All unigenes were subjected to annotation by Gene Ontology, Eukaryotic Orthologous Group terms and Kyoto Encyclopedia of Genes and Genomes. Differentially expressed genes from the Ganoderma lucidum mycelium and fruiting body stage were analyzed, resulting in the identification of 13 unigenes which are involved in the terpenoid backbone biosynthesis pathway. Quantitative real-time PCR was used to confirm the expression levels of these unigenes. Ganoderma lucidum was also studied for wood degrading activity and a total of 22 putative FOLymes (fungal oxidative lignin enzymes) and 120 CAZymes (carbohydrate-active enzymes) were predicted from our Ganoderma lucidum transcriptome. Our study provides comprehensive gene expression information on Ganoderma lucidum at the transcriptional level, which will form the foundation for functional genomics studies in this fungus.
The use of Illumina sequencing technology has made de novo transcriptome assembly and gene expression analysis possible in species that lack full genome information.
Deep Insight into the Ganoderma lucidum by Comprehensive Analysis of Its Transcriptome
Yu, Guo-Jun; Wang, Man; Huang, Jie; Yin, Ya-Lin; Chen, Yi-Jie; Jiang, Shuai; Jin, Yan-Xia; Lan, Xian-Qing; Wong, Barry Hon Cheung; Liang, Yi; Sun, Hui
2012-01-01
Background Ganoderma lucidum is a basidiomycete white rot fungus and is of medicinal importance in China, Japan and other countries in the Asiatic region. To date, much research has been performed in identifying the medicinal ingredients in Ganoderma lucidum. Despite its important therapeutic effects in disease, little is known about Ganoderma lucidum at the genomic level. In order to gain a molecular understanding of this fungus, we utilized Illumina high-throughput technology to sequence and analyze the transcriptome of Ganoderma lucidum. Methodology/Principal Findings We obtained 6,439,690 and 6,416,670 high-quality reads from the mycelium and fruiting body of Ganoderma lucidum, and these were assembled to form 18,892 and 27,408 unigenes, respectively. A similarity search was performed against the NCBI non-redundant nucleotide database and a customized database composed of five fungal genomes. 11,098 and 8,775 unigenes were matched to the NCBI non-redundant nucleotide database and our customized database, respectively. All unigenes were subjected to annotation by Gene Ontology, Eukaryotic Orthologous Group terms and Kyoto Encyclopedia of Genes and Genomes. Differentially expressed genes from the Ganoderma lucidum mycelium and fruiting body stage were analyzed, resulting in the identification of 13 unigenes which are involved in the terpenoid backbone biosynthesis pathway. Quantitative real-time PCR was used to confirm the expression levels of these unigenes. Ganoderma lucidum was also studied for wood degrading activity and a total of 22 putative FOLymes (fungal oxidative lignin enzymes) and 120 CAZymes (carbohydrate-active enzymes) were predicted from our Ganoderma lucidum transcriptome. Conclusions Our study provides comprehensive gene expression information on Ganoderma lucidum at the transcriptional level, which will form the foundation for functional genomics studies in this fungus.
The use of Illumina sequencing technology has made de novo transcriptome assembly and gene expression analysis possible in species that lack full genome information. PMID:22952861
Yao, Heng; Wang, Xiaoxuan; Chen, Pengcheng; Hai, Ling; Jin, Kang; Yao, Lixia; Mao, Chuanzao; Chen, Xin
2018-05-01
An advanced functional understanding of omics data is important for elucidating the design logic of physiological processes in plants and effectively controlling desired traits in plants. We present the latest versions of the Predicted Arabidopsis Interactome Resource (PAIR) and of the gene set linkage analysis (GSLA) tool, which enable the interpretation of an observed transcriptomic change (differentially expressed genes [DEGs]) in Arabidopsis ( Arabidopsis thaliana ) with respect to its functional impact for biological processes. PAIR version 5.0 integrates functional association data between genes in multiple forms and infers 335,301 putative functional interactions. GSLA relies on this high-confidence inferred functional association network to expand our perception of the functional impacts of an observed transcriptomic change. GSLA then interprets the biological significance of the observed DEGs using established biological concepts (annotation terms), describing not only the DEGs themselves but also their potential functional impacts. This unique analytical capability can help researchers gain deeper insights into their experimental results and highlight prospective directions for further investigation. We demonstrate the utility of GSLA with two case studies in which GSLA uncovered how molecular events may have caused physiological changes through their collective functional influence on biological processes. Furthermore, we showed that typical annotation-enrichment tools were unable to produce similar insights to PAIR/GSLA. The PAIR version 5.0-inferred interactome and GSLA Web tool both can be accessed at http://public.synergylab.cn/pair/. © 2018 American Society of Plant Biologists. All Rights Reserved.
Ancient orphan crop joins modern era: gene-based SNP discovery and mapping in lentil.
Sharpe, Andrew G; Ramsay, Larissa; Sanderson, Lacey-Anne; Fedoruk, Michael J; Clarke, Wayne E; Li, Rong; Kagale, Sateesh; Vijayan, Perumal; Vandenberg, Albert; Bett, Kirstin E
2013-03-18
The genus Lens comprises a range of closely related species within the galegoid clade of the Papilionoideae family. The clade includes other important crops (e.g. chickpea and pea) as well as a sequenced model legume (Medicago truncatula). Lentil is a global food crop increasing in importance in the Indian sub-continent and elsewhere due to its nutritional value and quick cooking time. Despite this importance there has been a dearth of genetic and genomic resources for the crop and this has limited the application of marker-assisted selection strategies in breeding. We describe here the development of a deep and diverse transcriptome resource for lentil using next generation sequencing technology. The generation of data in multiple cultivated (L. culinaris) and wild (L. ervoides) genotypes together with the utilization of a bioinformatics workflow enabled the identification of a large collection of SNPs and the subsequent development of a genotyping platform that was used to establish the first comprehensive genetic map of the L. culinaris genome. Extensive collinearity with M. truncatula was evident on the basis of sequence homology between mapped markers and the model genome and large translocations and inversions relative to M. truncatula were identified. An estimate for the time divergence of L. culinaris from L. ervoides and of both from M. truncatula was also calculated. The availability of the genomic and derived molecular marker resources presented here will help change lentil breeding strategies and lead to increased genetic gain in the future.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Shuangyan; Huang, Xin; Yang, Xiaohan
BACKGROUND: Sheepgrass [Leymus chinensis (Trin.) Tzvel.] is an important perennial forage grass across the Eurasian Steppe and is known for its adaptability to various environmental conditions. However, insufficient data resources in public databases for sheepgrass limited our understanding of the mechanism of environmental adaptations, gene discovery and molecular marker development. RESULTS: The transcriptome of sheepgrass was sequenced using Roche 454 pyrosequencing technology. We assembled 952,328 high-quality reads into 87,214 unigenes, including 32,416 contigs and 54,798 singletons. There were 15,450 contigs over 500 bp in length. BLAST searches of our database against Swiss-Prot and NCBI non-redundant protein sequences (nr) databases resulted in the annotation of 54,584 (62.6%) of the unigenes. Gene Ontology (GO) analysis assigned 89,129 GO term annotations for 17,463 unigenes. We identified 11,675 core Poaceae-specific and 12,811 putative sheepgrass-specific unigenes by BLAST searches against all plant genome and transcriptome databases. A total of 2,979 specific freezing-responsive unigenes were found from this RNAseq dataset. We identified 3,818 EST-SSRs in 3,597 unigenes, and some SSRs contained unigenes that were also candidates for freezing-response genes. Characterizations of nucleotide repeats and dominant motifs of SSRs in sheepgrass were also performed. Similarity and phylogenetic analysis indicated that sheepgrass is closely related to barley and wheat. CONCLUSIONS: This research has greatly enriched sheepgrass transcriptome resources. The identified stress-related genes will help us to decipher the genetic basis of the environmental and ecological adaptations of this species and will be used to improve wheat and barley crops through hybridization or genetic transformation. The EST-SSRs reported here will be a valuable resource for future gene-phenotype studies and for the molecular breeding of sheepgrass and other Poaceae species.
Zhang, Jinfeng; Chen, Lei; Fu, Chenglin; Wang, Lingxia; Liu, Huainian; Cheng, Yuanzhi; Li, Shuangcheng; Deng, Qiming; Wang, Shiquan; Zhu, Jun; Liang, Yueyang; Li, Ping; Zheng, Aiping
2017-01-01
Rice sheath blight, caused by Rhizoctonia solani , is one of the most devastating diseases for stable rice production in most rice-growing regions of the world. Currently, studies of the molecular mechanism of rice sheath blight resistance are scarce. Here, we used an RNA-seq approach to analyze the gene expression changes induced by the AG1 IA strain of R. solani in rice at 12, 24, 36, 48, and 72 h. By comparing the transcriptomes of TeQing (a moderately resistant cultivar) and Lemont (a susceptible cultivar) leaves, variable transcriptional responses under control and infection conditions were revealed. From these data, 4,802 differentially expressed genes (DEGs) were identified. Gene ontology and pathway enrichment analyses suggested that most DEGs and related metabolic pathways in both rice genotypes were common and spanned most biological activities after AG1 IA inoculation. The main difference between the resistant and susceptible plants was a difference in the timing of the response to AG1 IA infection. Photosynthesis, photorespiration, and jasmonic acid and phenylpropanoid metabolism play important roles in disease resistance, and the relative response of disease resistance-related pathways in TeQing leaves was more rapid than that of Lemont leaves at 12 h. Here, the transcription data include the most comprehensive list of genes and pathway candidates induced by AG1 IA that is available for rice and will serve as a resource for future studies into the molecular mechanisms of the responses of rice to AG1 IA.
Bansal, Raman; Michel, Andy
2018-01-18
The brown marmorated stink bug (Halyomorpha halys) is an invasive pest in North America which causes severe economic losses on tree fruits, ornamentals, vegetables, and field crops. H. halys is an extreme generalist and this feeding behaviour may have been a major contributor behind its establishment and successful adaptation in invasive habitats of North America. To develop an understanding of the mechanism of H. halys' generalist herbivory, here we specifically focused on genes putatively facilitating its adaptation on diverse host plants. We generated over 142 million reads via sequencing eight RNA-Seq libraries, each representing an individual H. halys adult. The de novo assembly contained 79,855 high quality transcripts, totalling 39,600,178 bases. Following a comprehensive transcriptome analysis, H. halys had an expanded suite of cytochrome P450 and cathepsin-L genes compared to other insects. Detailed characterization of P450 genes from the CYP6 family, known for herbivore adaptation on host plants, strongly hinted towards H. halys-specific expansions involving gene duplications. In subsequent RT-PCR experiments, both P450 and cathepsin genes exhibited tissue-specific or distinct expression patterns which supported their principal roles of detoxification and/or digestion in a particular tissue. Our analysis of P450 and cathepsin genes in H. halys offers new insights into potential mechanisms for understanding generalist herbivory and adaptation success in invasive habitats. Additionally, the large-scale transcriptomic resource developed here provides highly useful data for gene discovery; functional, population and comparative genomics as well as efforts to assemble and annotate the H. halys genome.
Transcriptome Analysis of Salt Tolerant Common Bean (Phaseolus vulgaris L.) under Saline Conditions
Hiz, Mahmut Can; Canher, Balkan; Niron, Harun; Turet, Muge
2014-01-01
Salinity is one of the important abiotic stress factors that limit crop production. Common bean, Phaseolus vulgaris L., a major protein source in developing countries, is highly affected by soil salinity and the information on genes that play a role in salt tolerance is scarce. We aimed to identify differentially expressed genes (DEGs) and related pathways by comprehensive analysis of transcriptomes of both root and leaf tissues of the tolerant genotype grown under saline and control conditions in hydroponic system. We have generated a total of 158 million high-quality reads which were assembled into 83,774 all-unigenes with a mean length of 813 bp and N50 of 1,449 bp. Among the all-unigenes, 58,171 were assigned with Nr annotations after homology analyses. It was revealed that 6,422 and 4,555 all-unigenes were differentially expressed upon salt stress in leaf and root tissues respectively. Validation of the RNA-seq quantifications (RPKM values) was performed by qRT-PCR (Quantitative Reverse Transcription PCR) analyses. Enrichment analyses of DEGs based on GO and KEGG databases have shown that both leaf and root tissues regulate energy metabolism, transmembrane transport activity, and secondary metabolites to cope with salinity. A total of 2,678 putative common bean transcription factors were identified and classified under 59 transcription factor families; among them 441 were salt responsive. The data generated in this study will help in understanding the fundamentals of salt tolerance in common bean and will provide resources for functional genomic studies. PMID:24651267
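The N50 statistic reported in this abstract (1,449 bp) summarizes assembly contiguity: it is the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of the calculation follows; the function name `n50` and the toy lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted from longest to shortest and accumulated until
    the running total reaches half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy unigene lengths in bp (hypothetical); total 20, so half is 10.
print(n50([2, 3, 4, 5, 6]))
```

Because N50 is weighted toward long sequences, it is typically larger than the mean length, as with the 813 bp mean and 1,449 bp N50 quoted above.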
Shah, Faheem Afzal; Wang, Qiaojian; Wang, Zhaocheng; Wu, Lifang
2018-01-01
Pecan is an economically important nut crop tree due to its unique texture and flavor properties. The pecan seed is rich in unsaturated fatty acids and protein. However, little is known about the molecular mechanisms of the biosynthesis of fatty acids in the developing seeds. In this study, transcriptome sequencing of the developing seeds was performed using Illumina sequencing technology. Pecan seed embryos at different developmental stages were collected and sequenced. The transcriptomes of pecan seeds at two key developing stages (PA, the initial stage and PS, the fast oil accumulation stage) were also compared. A total of 82,155 unigenes, with an average length of 1,198 bp from seven independent libraries were generated. After functional annotations, we detected approximately 55,854 CDS, among which, 2,807 were Transcription Factor (TF) coding unigenes. Further, there were 13,325 unigenes that showed a 2-fold or greater expression difference between the two groups of libraries (two developmental stages). After transcriptome analysis, we identified abundant unigenes that could be involved in fatty acid biosynthesis, degradation and some other aspects of seed development in pecan. This study presents a comprehensive dataset of transcriptomic changes during the seed development of pecan. It provides insights into the molecular mechanisms responsible for fatty acid biosynthesis during seed development. The identification of functional genes will also be useful for the molecular breeding work of pecan. PMID:29694395
Xu, Zheng; Ni, Jun; Shah, Faheem Afzal; Wang, Qiaojian; Wang, Zhaocheng; Wu, Lifang; Fu, Songling
2018-01-01
Pecan is an economically important nut crop tree due to its unique texture and flavor properties. The pecan seed is rich in unsaturated fatty acids and protein. However, little is known about the molecular mechanisms of the biosynthesis of fatty acids in the developing seeds. In this study, transcriptome sequencing of the developing seeds was performed using Illumina sequencing technology. Pecan seed embryos at different developmental stages were collected and sequenced. The transcriptomes of pecan seeds at two key developing stages (PA, the initial stage and PS, the fast oil accumulation stage) were also compared. A total of 82,155 unigenes, with an average length of 1,198 bp from seven independent libraries were generated. After functional annotations, we detected approximately 55,854 CDS, among which, 2,807 were Transcription Factor (TF) coding unigenes. Further, there were 13,325 unigenes that showed a 2-fold or greater expression difference between the two groups of libraries (two developmental stages). After transcriptome analysis, we identified abundant unigenes that could be involved in fatty acid biosynthesis, degradation and some other aspects of seed development in pecan. This study presents a comprehensive dataset of transcriptomic changes during the seed development of pecan. It provides insights into the molecular mechanisms responsible for fatty acid biosynthesis during seed development. The identification of functional genes will also be useful for the molecular breeding work of pecan.
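The "2-fold or greater expression difference" criterion in this abstract can be sketched as a simple filter over two expression tables. Everything specific here is an assumption for illustration: the function name `fold_change_filter`, the pseudocount of 1 added to avoid division by zero, and the toy expression values. The abstract does not describe the normalization or any statistical test the authors applied alongside the fold-change cutoff.

```python
def fold_change_filter(expr_a, expr_b, min_fold=2.0, pseudocount=1.0):
    """Return unigene IDs whose expression differs by >= min_fold.

    expr_a and expr_b map unigene IDs to normalized expression values
    for two conditions (for example, stages PA and PS). The fold change
    is direction-agnostic: larger value divided by smaller value.
    """
    selected = []
    for gene in sorted(expr_a.keys() & expr_b.keys()):
        a = expr_a[gene] + pseudocount
        b = expr_b[gene] + pseudocount
        if max(a, b) / min(a, b) >= min_fold:
            selected.append(gene)
    return selected


# Toy expression values (hypothetical) for stages PA and PS.
stage_pa = {"u1": 9.0, "u2": 5.0, "u3": 0.0}
stage_ps = {"u1": 4.0, "u2": 6.0, "u3": 3.0}
print(fold_change_filter(stage_pa, stage_ps))
```

A fold-change cutoff alone ignores replicate variability, so in practice it is usually combined with a significance test.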
Cho, Byuri Angela; Yoo, Seong-Keun; Song, Young Shin; Kim, Su-jin; Lee, Kyu Eun; Shong, Minho
2018-01-01
Background: Elucidating aging-related transcriptomic changes in human organs is necessary to understand the aging physiology and mechanisms, but little is known regarding the thyroid gland. We investigated aging-related transcriptomic alterations in the human thyroid gland and characterized the related molecular functions. Methods: Publicly available RNA sequencing data of 322 thyroid tissue samples from the Genotype-Tissue Expression project were analyzed. In addition, our own 64 RNA sequencing data of normal thyroid tissue samples were used as a validation set. To comprehensively evaluate the associations between aging and transcriptomic changes, we performed a weighted gene coexpression network analysis and pathway enrichment analysis. The thyroid differentiation score was then used for further analysis, defining the correlations between thyroid differentiation and aging. Results: The most significant aging-related transcriptomic change in thyroid was the downregulation of genes related to the mitochondrial and proteasomal functions (p = 3 × 10^−6). Moreover, genes that are associated with immune processes were significantly upregulated with age (p = 3 × 10^−4), and all of them overlapped with the upregulated genes in the thyroid glands affected by lymphocytic thyroiditis. Furthermore, these aging-related changes were not significantly different according to sex, but in terms of the thyroid differentiation, females were more susceptible to aging-related changes (p for trend = 0.03). Conclusions: Aging-related transcriptomic changes in the thyroid gland were associated with mitochondrial and proteasomal dysfunction, loss of differentiation, and activation of autoimmune processes. Our results provide clues to better understanding the age-related decline in thyroid function and higher susceptibility to autoimmune thyroid disease. PMID:29652618
De novo transcriptome assemblies of four xylem sap-feeding insects.
Tassone, Erica E; Cowden, Charles C; Castle, S J
2017-03-01
Spittle bugs and sharpshooters are well-known xylem sap-feeding insects and vectors of the phytopathogenic bacterium Xylella fastidiosa (Wells), a causal agent of Pierce's disease of grapevines and other crop diseases. Specialized feeding on nutrient-deficient xylem sap is relatively rare among insect herbivores, and only limited genomic and transcriptomic information has been generated for xylem-sap feeders. To develop a more comprehensive understanding of biochemical adaptations and symbiotic relationships that support survival on a nutritionally austere dietary source, transcriptome assemblies for three sharpshooter species and one spittlebug species were produced. Trinity-based de novo transcriptome assemblies were generated for all four xylem-sap feeders using raw sequencing data originating from whole-insect preps. Total transcripts for each species ranged from 91 384 for Cuerna arida to 106 998 for Homalodisca liturata with transcript totals for Graphocephala atropunctata and the spittlebug Clastoptera arizonana falling in between. The percentage of transcripts comprising complete open reading frames ranged from 60% for H. liturata to 82% for C. arizonana. Bench-marking universal single-copy orthologs analyses for each dataset indicated quality assemblies and a high degree of completeness for all four species. These four transcriptomes represent a significant expansion of data for insect herbivores that feed exclusively on xylem sap, a nutritionally deficient dietary source relative to other plant tissues and fluids. Comparison of transcriptome data with insect herbivores that utilize other dietary sources may illuminate fundamental differences in the biochemistry of dietary specialization. Published by Oxford University Press on behalf of GIGSCI 2017. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie
2018-01-01
Abstract Translation is a key regulatory step, linking transcriptome and proteome. Two major methods of translatome investigations are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database TranslatomeDB (http://www.translatomedb.net/) which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes the analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of same species and type, both on transcriptome and translatome levels. The translation indices (translation ratios, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity, respectively. All datasets were analyzed using a unified, robust, accurate and experimentally-verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and utilize the identical unified pipeline to analyze their data. We believe that our TranslatomeDB is a comprehensive platform and knowledgebase on translatome and proteome research, releasing the biologists from complex searching, analyzing and comparing huge sequencing data without needing local computational power. PMID:29106630
Tzika, Athanasia C; Helaers, Raphaël; Schramm, Gerrit; Milinkovitch, Michel C
2011-09-26
Reptiles are largely under-represented in comparative genomics despite the fact that they are substantially more diverse in many respects than mammals. Given the high divergence of reptiles from classical model species, next-generation sequencing of their transcriptomes is an approach of choice for gene identification and annotation. Here, we use 454 technology to sequence the brain transcriptome of four divergent reptilian and one reference avian species: the Nile crocodile, the corn snake, the bearded dragon, the red-eared turtle, and the chicken. Using an in-house pipeline for recursive similarity searches of >3,000,000 reads against multiple databases from 7 reference vertebrates, we compile a reptilian comparative transcriptomics dataset, with homology assignment for 20,000 to 31,000 transcripts per species and a cumulated non-redundant sequence length of 248.6 Mbases. Our approach identifies the majority (87%) of chicken brain transcripts and about 50% of de novo assembled reptilian transcripts. In addition to 57,502 microsatellite loci, we identify thousands of SNP and indel polymorphisms for population genetic and linkage analyses. We also build very large multiple alignments for Sauropsida and mammals (two million residues per species) and perform extensive phylogenetic analyses suggesting that turtles are not basal living reptiles but are rather associated with Archosaurians, hence, potentially answering a long-standing question in the phylogeny of Amniotes. The reptilian transcriptome (freely available at http://www.reptilian-transcriptomes.org) should prove a useful new resource as reptiles are becoming important new models for comparative genomics, ecology, and evolutionary developmental genetics.
Drew, Damian Paul; Dueholm, Bjørn; Weitzel, Corinna; Zhang, Ye; Sensen, Christoph W.; Simonsen, Henrik Toft
2013-01-01
Thapsia laciniata Rouy (Apiaceae) produces irregular and regular sesquiterpenoids with thapsane and guaiene carbon skeletons, as found in other Apiaceae species. A transcriptomic analysis utilizing Illumina next-generation sequencing enabled the identification of novel genes involved in the biosynthesis of terpenoids in Thapsia. From 66.78 million HQ paired-end reads obtained from T. laciniata roots, 64.58 million were assembled into 76,565 contigs (N50: 1261 bp). Seventeen contigs were annotated as terpene synthases and five of these were predicted to be sesquiterpene synthases. Of the 67 contigs annotated as cytochromes P450, 18 of these are part of the CYP71 clade that primarily performs hydroxylations of specialized metabolites. Three contigs annotated as aldehyde dehydrogenases grouped phylogenetically with the characterized ALDH1 from Artemisia annua and three contigs annotated as alcohol dehydrogenases grouped with the recently described ADH1 from A. annua. ALDH1 and ADH1 were characterized as part of the artemisinin biosynthesis. We have produced a comprehensive EST dataset for T. laciniata roots, which contains a large sample of the T. laciniata transcriptome. These transcriptome data provide the foundation for future research into the molecular basis for terpenoid biosynthesis in Thapsia and on the evolution of terpenoids in Apiaceae. PMID:23698765
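The N50 value quoted for the assembly above is a length-weighted summary statistic: it is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the function name and the toy contig lengths are illustrative assumptions and are not taken from the T. laciniata dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not real data)
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

Because the statistic is weighted by length, a few long contigs can raise N50 considerably even when most contigs are short, so it is usually read alongside the contig count and total assembled length.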
Roncaglia, Paola; Howe, Douglas G.; Laulederkind, Stanley J.F.; Khodiyar, Varsha K.; Berardini, Tanya Z.; Tweedie, Susan; Foulger, Rebecca E.; Osumi-Sutherland, David; Campbell, Nancy H.; Huntley, Rachael P.; Talmud, Philippa J.; Blake, Judith A.; Breckenridge, Ross; Riley, Paul R.; Lambiase, Pier D.; Elliott, Perry M.; Clapp, Lucie; Tinker, Andrew; Hill, David P.
2018-01-01
Background: A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products. Methods and Results: In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci. Conclusions: We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects. PMID:29440116
Lovering, Ruth C; Roncaglia, Paola; Howe, Douglas G; Laulederkind, Stanley J F; Khodiyar, Varsha K; Berardini, Tanya Z; Tweedie, Susan; Foulger, Rebecca E; Osumi-Sutherland, David; Campbell, Nancy H; Huntley, Rachael P; Talmud, Philippa J; Blake, Judith A; Breckenridge, Ross; Riley, Paul R; Lambiase, Pier D; Elliott, Perry M; Clapp, Lucie; Tinker, Andrew; Hill, David P
2018-02-01
A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products. In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci. We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects. © 2018 The Authors.
Lessons from single-cell transcriptome analysis of oxygen-sensing cells.
Zhou, Ting; Matsunami, Hiroaki
2018-05-01
The advent of single-cell RNA-sequencing (RNA-Seq) technology has enabled transcriptome profiling of individual cells. Comprehensive gene expression analysis at the single-cell level has proven to be effective in characterizing the most fundamental aspects of cellular function and identity. This unbiased approach is revolutionary for small and/or heterogeneous tissues like oxygen-sensing cells in identifying key molecules. Here, we review the major methods of current single-cell RNA-Seq technology. We discuss how this technology has advanced the understanding of oxygen-sensing glomus cells in the carotid body and helped uncover novel oxygen-sensing cells and mechanisms in the mice olfactory system. We conclude by providing our perspective on future single-cell RNA-Seq research directed at oxygen-sensing cells.
USDA-ARS's Scientific Manuscript database
Meeting the increasing market demands for pork products requires improvement of the feed efficiency of growing pigs. The use of Affymetrix Porcine Gene 1.0 ST array containing 19,211 genes in this study provides a comprehensive gene expression profile of skeletal muscle of finishing pigs in response...
Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona
2014-01-01
The continuously prolonged human lifespan is accompanied by an increase in the incidence of neurodegenerative diseases, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers that is rapidly becoming commonplace. However, there is a lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domain and microRNA binding annotations. We applied the novel workflow to RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post-Deep Brain Stimulation (DBS) treatment and compared them to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD-regulated FUS and HNRNP A/B included prion-like domains regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected control brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. 
Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia-nigra, compared to controls. This novel workflow allows deep multi-level inspection of RNA-Seq datasets and provides a comprehensive new resource for understanding disease transcriptome modifications in PD and other neurodegenerative diseases. PMID:24651478
Dilly, G F; Gaitán-Espitia, J D; Hofmann, G E
2015-03-01
This is the first de novo transcriptome and complete mitochondrial genome of an Antarctic sea urchin species sequenced to date. Sterechinus neumayeri is an Antarctic sea urchin and a model species for ecology, development, physiology and global change biology. To identify transcripts important to ocean acidification (OA) and thermal stress, this transcriptome was created by pooling 13 larval samples representing developmental stages on day 11 (late gastrula), 19 (early pluteus) and 30 (mid pluteus) maintained at three CO2 levels (421, 652, and 1071 μatm) as well as four additional heat-shocked samples. The normalized cDNA pool was sequenced using emulsion PCR (pyrosequencing) resulting in 1.34M reads with an average read length of 492 base pairs. 40,994 isotigs were identified, averaging 1188 bp with a median coverage of 11×. Additional primer design and gap sequencing were required to complete the mitochondrial genome. The mitogenome of S. neumayeri is a circular DNA molecule with a length of 15 684 bp that contains all 37 genes normally found in metazoans. We detail the main features of the transcriptome and the mitogenome architecture and investigate the phylogenetic relationships of S. neumayeri within Echinoidea. In addition, we provide comparative analyses of S. neumayeri with its closest relative, Strongylocentrotus purpuratus, including a list of potential OA gene targets. The resources described here will support a variety of quantitative (genomic, proteomic, multistress and comparative) studies to interrogate physiological responses to OA and other stressors in this important Antarctic calcifier. © 2014 John Wiley & Sons Ltd.
Schäpe, Paul; Müller-Hagen, Dirk; Ouedraogo, Jean-Paul; Heiderich, Caroline; Jedamzick, Johanna; van den Hondel, Cees A.; Ram, Arthur F.; Meyer, Vera
2016-01-01
Understanding the genetic, molecular and evolutionary basis of cysteine-stabilized antifungal proteins (AFPs) from fungi is important for understanding whether their function is mainly defensive or associated with fungal growth and development. In the current study, a transcriptome meta-analysis of the Aspergillus niger γ-core protein AnAFP was performed to explore co-expressed genes and pathways, based on independent expression profiling microarrays covering 155 distinct cultivation conditions. This analysis uncovered that anafp displays a highly coordinated temporal and spatial transcriptional profile which is concomitant with key nutritional and developmental processes. Its expression profile coincides with early starvation response and parallels with genes involved in nutrient mobilization and autophagy. Using fluorescence- and luciferase reporter strains we demonstrated that the anafp promoter is active in highly vacuolated compartments and foraging hyphal cells during carbon starvation with CreA and FlbA, but not BrlA, as most likely regulators of anafp. A co-expression network analysis supported by luciferase-based reporter assays uncovered that anafp expression is embedded in several cellular processes including allorecognition, osmotic and oxidative stress survival, development, secondary metabolism and autophagy, and predicted StuA and VelC as additional regulators. The transcriptomic resources available for A. niger provide unparalleled resources to investigate the function of proteins. Our work illustrates how transcriptomic meta-analyses can lead to hypotheses regarding protein function and predict a role for AnAFP during slow growth, allorecognition, asexual development and nutrient recycling of A. niger and propose that it interacts with the autophagic machinery to enable these processes. PMID:27835655
Paege, Norman; Jung, Sascha; Schäpe, Paul; Müller-Hagen, Dirk; Ouedraogo, Jean-Paul; Heiderich, Caroline; Jedamzick, Johanna; Nitsche, Benjamin M; van den Hondel, Cees A; Ram, Arthur F; Meyer, Vera
2016-01-01
Understanding the genetic, molecular and evolutionary basis of cysteine-stabilized antifungal proteins (AFPs) from fungi is important for understanding whether their function is mainly defensive or associated with fungal growth and development. In the current study, a transcriptome meta-analysis of the Aspergillus niger γ-core protein AnAFP was performed to explore co-expressed genes and pathways, based on independent expression profiling microarrays covering 155 distinct cultivation conditions. This analysis uncovered that anafp displays a highly coordinated temporal and spatial transcriptional profile which is concomitant with key nutritional and developmental processes. Its expression profile coincides with early starvation response and parallels with genes involved in nutrient mobilization and autophagy. Using fluorescence- and luciferase reporter strains we demonstrated that the anafp promoter is active in highly vacuolated compartments and foraging hyphal cells during carbon starvation with CreA and FlbA, but not BrlA, as most likely regulators of anafp. A co-expression network analysis supported by luciferase-based reporter assays uncovered that anafp expression is embedded in several cellular processes including allorecognition, osmotic and oxidative stress survival, development, secondary metabolism and autophagy, and predicted StuA and VelC as additional regulators. The transcriptomic resources available for A. niger provide unparalleled resources to investigate the function of proteins. Our work illustrates how transcriptomic meta-analyses can lead to hypotheses regarding protein function and predict a role for AnAFP during slow growth, allorecognition, asexual development and nutrient recycling of A. niger and propose that it interacts with the autophagic machinery to enable these processes.
A New Omics Data Resource of Pleurocybella porrigens for Gene Discovery
Dohra, Hideo; Someya, Takumi; Takano, Tomoyuki; Harada, Kiyonori; Omae, Saori; Hirai, Hirofumi; Yano, Kentaro; Kawagishi, Hirokazu
2013-01-01
Background: Pleurocybella porrigens is a mushroom-forming fungus, which has been consumed as a traditional food in Japan. In 2004, 55 people were poisoned by eating the mushroom and 17 people among them died of acute encephalopathy. Since then, the Japanese government has been alerting Japanese people to take precautions against eating the P. porrigens mushroom. Unfortunately, despite efforts, the molecular mechanism of the encephalopathy remains elusive. The genome and transcriptome sequence data of P. porrigens and the related species, however, are not stored in the public database. To gain the omics data in P. porrigens, we sequenced genome and transcriptome of its fruiting bodies and mycelia by next generation sequencing. Methodology/Principal Findings: Short read sequences of genomic DNAs and mRNAs in P. porrigens were generated by Illumina Genome Analyzer. Genome short reads were de novo assembled into scaffolds using Velvet. Comparisons of genome signatures among Agaricales showed that P. porrigens has a unique genome signature. Transcriptome sequences were assembled into contigs (unigenes). Biological functions of unigenes were predicted by Gene Ontology and KEGG pathway analyses. The majority of unigenes would be novel genes without significant counterparts in the public omics databases. Conclusions: Functional analyses of unigenes present the existence of numerous novel genes in the basidiomycetes division. The results mean that the omics information such as genome, transcriptome and metabolome in basidiomycetes is short in the current databases. The large-scale omics information on P. porrigens, provided from this research, will give a new data resource for gene discovery in basidiomycetes. PMID:23936076
Workflow and web application for annotating NCBI BioProject transcriptome data.
Vera Alvarez, Roberto; Medeiros Vidal, Newton; Garzón-Martínez, Gina A; Barrero, Luz S; Landsman, David; Mariño-Ramírez, Leonardo
2017-01-01
The volume of transcriptome data is growing exponentially due to rapid improvement of experimental technologies. In response, large central resources such as those of the National Center for Biotechnology Information (NCBI) are continually adapting their computational infrastructure to accommodate this large influx of data. New and specialized databases, such as Transcriptome Shotgun Assembly Sequence Database (TSA) and Sequence Read Archive (SRA), have been created to aid the development and expansion of centralized repositories. Although the central resource databases are under continual development, they do not include automatic pipelines to increase annotation of newly deposited data. Therefore, third-party applications are required to achieve that aim. Here, we present an automatic workflow and web application for the annotation of transcriptome data. The workflow creates secondary data such as sequencing reads and BLAST alignments, which are available through the web application. They are based on freely available bioinformatics tools and scripts developed in-house. The interactive web application provides a search engine and several browser utilities. Graphical views of transcript alignments are available through SeqViewer, an embedded tool developed by NCBI for viewing biological sequence data. The web application is tightly integrated with other NCBI web applications and tools to extend the functionality of data processing and interconnectivity. We present a case study for the species Physalis peruviana with data generated from BioProject ID 67621. URL: http://www.ncbi.nlm.nih.gov/projects/physalis/. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.
The pathway not taken: understanding 'omics data in the perinatal context.
Edlow, Andrea G; Slonim, Donna K; Wick, Heather C; Hui, Lisa; Bianchi, Diana W
2015-07-01
'Omics analysis of large datasets has an increasingly important role in perinatal research, but understanding gene expression analyses in the fetal context remains a challenge. We compared the interpretation provided by a widely used systems biology resource (ingenuity pathway analysis [IPA]) with that from gene set enrichment analysis (GSEA) with functional annotation curated specifically for the fetus (Developmental FunctionaL Annotation at Tufts [DFLAT]). Using amniotic fluid supernatant transcriptome datasets previously produced by our group, we analyzed 3 different developmental perturbations: aneuploidy (Trisomy 21 [T21]), hemodynamic (twin-twin transfusion syndrome [TTTS]), and metabolic (maternal obesity) vs sex- and gestational age-matched control subjects. Differentially expressed probe sets were identified with the use of paired t-tests with the Benjamini-Hochberg correction for multiple testing (P < .05). Functional analyses were performed with IPA and GSEA/DFLAT. Outputs were compared for biologic relevance to the fetus. Compared with control subjects, there were 414 significantly dysregulated probe sets in T21 fetuses, 2226 in TTTS recipient twins, and 470 in fetuses of obese women. Each analytic output was unique but complementary. For T21, both IPA and GSEA/DFLAT identified dysregulation of brain, cardiovascular, and integumentary system development. For TTTS, both analytic tools identified dysregulation of cell growth/proliferation, immune and inflammatory signaling, brain, and cardiovascular development. For maternal obesity, both tools identified dysregulation of immune and inflammatory signaling, brain and musculoskeletal development, and cell death. GSEA/DFLAT identified substantially more dysregulated biologic functions in fetuses of obese women (1203 vs 151). For all 3 datasets, GSEA/DFLAT provided more comprehensive information about brain development. IPA consistently provided more detailed annotation about cell death. 
IPA produced many dysregulated terms that pertained to cancer (14 in T21, 109 in TTTS, 26 in maternal obesity); GSEA/DFLAT did not. Interpretation of the fetal amniotic fluid supernatant transcriptome depends on the analytic program, which suggests that >1 resource should be used. Within IPA, physiologic cellular proliferation in the fetus produced many "false positive" annotations that pertained to cancer, which reflects its bias toward adult diseases. This study supports the use of gene annotation resources with a developmental focus, such as DFLAT, for 'omics studies in perinatal medicine. Copyright © 2015 Elsevier Inc. All rights reserved.
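The Benjamini-Hochberg correction mentioned in the abstract above controls the false discovery rate across many simultaneous tests. As a minimal sketch (not the authors' actual software, which the abstract does not name), the adjusted P values can be computed by scaling each raw P value by m/rank and then enforcing monotonicity from the largest P value downward. The function name and the example P values are illustrative assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value (rank m) down to the smallest (rank 1),
    # keeping a running minimum so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four probe sets
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

A probe set is then called differentially expressed when its adjusted value falls below the chosen threshold, which in the study above was .05.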
Zhou, Xiaoxu; Wang, Hongdi; Cui, Jun; Qiu, Xuemei; Chang, Yaqing; Wang, Xiuli
2016-12-01
Tube feet, one of the ambulacral appendage types in Aspidochirote holothurioids, are known for their functions in locomotion, feeding, chemoreception, light sensitivity and respiration. In this study, we explored the characteristics of the transcriptome in the tube foot of the sea cucumber (Apostichopus japonicus). Our results showed that among 390 unigenes specifically expressed in the tube foot, 190 were annotated. Based on the assembled transcriptome, we found 219,860 SNPs from 34,749 unigenes, of which 97,683, 53,624, 27,767 and 40,786 were located in CDSs, 5'-UTRs, 3'-UTRs and non-CDS regions, respectively. Furthermore, 12,114 SSRs were detected from 7394 unigenes. Target genes of four miRNAs specifically expressed in the tube foot (miR-29a, miR-29b, miR-278-3p and miR-2005) were also predicted based on the transcriptome; these include immune-related factors (MBL, VLRA, AjC3, MyD88, CFB), a skin pigmentation factor (MITF), a candidate regeneration factor (TRP) and a holothurian autolysis-related factor (CL). These results provide a relatively large number of molecular markers and transcriptome resources, and will provide a foundation for further analyses of the function and molecular mechanisms underlying the A. japonicus tube foot. Copyright © 2016 Elsevier Inc. All rights reserved.
Nakayama, Hokuto; Sakamoto, Tomoaki; Okegawa, Yuki; Kaminoyama, Kaori; Fujie, Manabu; Ichihashi, Yasunori; Kurata, Tetsuya; Motohashi, Ken; Al-Shehbaz, Ihsan; Sinha, Neelima; Kimura, Seisuke
2018-02-19
Because natural variation in wild species is likely the result of local adaptation, it provides a valuable resource for understanding plant-environmental interactions. Rorippa aquatica (Brassicaceae) is a semi-aquatic North American plant with morphological differences between several accessions, but little information available on any physiological differences. Here, we surveyed the transcriptomes of two R. aquatica accessions and identified cryptic physiological differences between them. We first reconstructed a Rorippa phylogeny to confirm relationships between the accessions. We performed large-scale RNA-seq and de novo assembly; the resulting 87,754 unigenes were then annotated via comparisons to different databases. Between-accession physiological variation was identified with transcriptomes from both accessions. Transcriptome data were analyzed with principal component analysis and self-organizing map. Results of analyses suggested that photosynthetic capability differs between the accessions. Indeed, physiological experiments revealed between-accession variation in electron transport rate and the redox state of the plastoquinone pool. These results indicated that one accession may have adapted to differences in temperature or length of the growing season.
Peters, Linda M.; Belyantseva, Inna A.; Lagziel, Ayala; Battey, James F.; Friedman, Thomas B.; Morell, Robert J.
2007-01-01
Specialization in cell function and morphology is influenced by the differential expression of mRNAs, many of which are expressed at low abundance and restricted to certain cell types. Detecting such transcripts in cDNA libraries may require sequencing millions of clones. Massively parallel signature sequencing (MPSS) is well-suited for identifying transcripts that are expressed in discrete cell types and in low abundance. We have made MPSS libraries from microdissections of three inner ear tissues. By comparing these MPSS libraries to those of 87 other tissues included in the Mouse Reference Transcriptome (MRT) online resource, we have identified genes that are highly enriched in, or specific to, the inner ear. We show by RT-PCR and in situ hybridization that signatures unique to the inner ear libraries identify transcripts with highly specific cell-type localizations. These transcripts serve to illustrate the utility of a resource that is available to the research community. Utilization of these resources will increase the number of known transcription units and expand our knowledge of the tissue-specific regulation of the transcriptome. PMID:17049805
Vatanparast, Mohammad; Shetty, Prateek; Chopra, Ratan; Doyle, Jeff J.; Sathyanarayana, N.; Egan, Ashley N.
2016-01-01
Winged bean, Psophocarpus tetragonolobus (L.) DC., is similar to soybean in yield and nutritional value but more viable in tropical conditions. Here, we strengthen genetic resources for this orphan crop by producing a de novo transcriptome assembly and annotation of two Sri Lankan accessions (denoted herein as CPP34 [PI 491423] and CPP37 [PI 639033]), developing simple sequence repeat (SSR) markers, and identifying single nucleotide polymorphisms (SNPs) between geographically separated genotypes. A combined assembly based on 804,757 reads from two accessions produced 16,115 contigs with an N50 of 889 bp, over 90% of which has significant sequence similarity to other legumes. Combining contigs with singletons produced 97,241 transcripts. We identified 12,956 SSRs, including 2,594 repeats for which primers were designed and 5,190 high-confidence SNPs between Sri Lankan and Nigerian genotypes. The transcriptomic data sets generated here provide new resources for gene discovery and marker development in this orphan crop, and will be vital for future plant breeding efforts. We also analyzed the soybean trypsin inhibitor (STI) gene family, important plant defense genes, in the context of related legumes and found evidence for radiation of the Kunitz trypsin inhibitor (KTI) gene family within winged bean. PMID:27356763
Zhang, Xiaodong; Allan, Andrew C.; Li, Caixia; Wang, Yuanzhong; Yao, Qiuyang
2015-01-01
Gentiana rigescens is an important medicinal herb in China. The main validated medicinal component gentiopicroside is synthesized in shoots, but is mainly found in the plant’s roots. The gentiopicroside biosynthetic pathway and its regulatory control remain to be elucidated. Genome resources of gentian are limited. Next-generation sequencing (NGS) technologies can aid in supplying global gene expression profiles. In this study we present sequence and transcript abundance data for the root and leaf transcriptome of G. rigescens, obtained using the Illumina Hiseq2000. Over fifty million clean reads were obtained from leaf and root libraries. This yields 76,717 unigenes with an average length of 753 bp. Among these, 33,855 unigenes were identified as putative homologs of annotated sequences in public protein and nucleotide databases. Digital abundance analysis identified 3306 unigenes differentially enriched between leaf and root. Unigenes found in both tissues were categorized according to their putative functional categories. Of the differentially expressed genes, over 130 were annotated as related to terpenoid biosynthesis. This work is the first study of global transcriptome analyses in gentian. These sequences and putative functional data comprise a resource for future investigation of terpenoid biosynthesis in Gentianaceae species and annotation of the gentiopicroside biosynthetic pathway and its regulatory mechanisms. PMID:26006235
Seifert, Sarah; Lübbe, Torben; Leuschner, Christoph; Finkeldey, Reiner
2017-01-01
Despite the ecological and economic importance of European beech (Fagus sylvatica L.) genomic resources of this species are still limited. This hampers an understanding of the molecular basis of adaptation to stress. Since beech will most likely be threatened by the consequences of climate change, an understanding of adaptive processes to climate change-related drought stress is of major importance. Here, we used RNA-seq to provide the first drought stress-related transcriptome of beech. In a drought stress trial with beech saplings, 50 samples were taken for RNA extraction at five points in time during a soil desiccation experiment. De novo transcriptome assembly and analysis of differential gene expression revealed 44,335 contigs, and 662 differentially expressed genes between the stress and normally watered control group. Gene expression was specific to the different time points, and only five genes were significantly differentially expressed between the stress and control group on all five sampling days. GO term enrichment showed that mostly genes involved in lipid- and homeostasis-related processes were upregulated, whereas genes involved in oxidative stress response were downregulated in the stressed seedlings. This study gives first insights into the genomic drought stress response of European beech, and provides new genetic resources for adaptation research in this species. PMID:28873454
SolEST database: a "one-stop shop" approach to the study of Solanaceae transcriptomes.
D'Agostino, Nunzio; Traini, Alessandra; Frusciante, Luigi; Chiusano, Maria Luisa
2009-11-30
Since no genome sequences of solanaceous plants have yet been completed, expressed sequence tag (EST) collections represent a reliable tool for broad sampling of Solanaceae transcriptomes, an attractive route for understanding Solanaceae genome functionality and a powerful reference for the structural annotation of emerging Solanaceae genome sequences. We describe the SolEST database http://biosrv.cab.unina.it/solestdb which integrates different EST datasets from both cultivated and wild Solanaceae species and from two species of the genus Coffea. Background as well as processed data contained in the database, extensively linked to external related resources, represent an invaluable source of information for these plant families. Two novel features differentiate SolEST from other resources: i) the option of accessing and then visualizing Solanaceae EST/TC alignments along the emerging tomato and potato genome sequences; ii) the opportunity to compare different Solanaceae assemblies generated by diverse research groups in the attempt to address a common complaint in the SOL community. Different databases have been established worldwide for collecting Solanaceae ESTs and are related in concept, content and utility to the one presented herein. However, the SolEST database has several distinguishing features that make it appealing for the research community and facilitates a "one-stop shop" for the study of Solanaceae transcriptomes.
Fang, Lu; Yang, Yuchen; Guo, Wuxia; Li, Jianfang; Zhong, Cairong; Huang, Yelin; Zhou, Renchao; Shi, Suhua
2016-08-01
Aegiceras corniculatum (L.) Blanco is one of the most salt tolerant mangrove species and can thrive in 3% salinity at the seaward edge of mangrove forests. Here we sequenced the transcriptome of A. corniculatum using the Illumina GA platform to develop genomic resources for ecological and evolutionary studies. We obtained about 50 million high-quality paired-end reads 75 bp in length. Using the short read assembler Velvet, we obtained 49,437 contigs with an average length of 625 bp. A total of 32,744 (66.23%) contigs showed significant similarity to sequences in the GenBank non-redundant (NR) protein database. Of these, 30,911 and 18,004 sequences were assigned to Gene Ontology and eukaryotic orthologous groups of proteins (KOG), respectively. A total of 4942 transcripts from our assemblies had significant similarity with KEGG Orthologs and were involved in 144 KEGG pathways, while 9899 unigenes had enzyme commission (EC) numbers. In addition, 9792 transcriptome-derived SSRs were identified from 7342 sequences. With our strict criteria, 4165 candidate SNPs were also identified from 2058 contigs. Some of these SNPs were further validated by Sanger sequencing. Genomic resources generated in this study should be valuable in ecological, evolutionary, and functional genomics studies for this mangrove species. Copyright © 2016 Elsevier B.V. All rights reserved.
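The abstract above reports transcriptome-derived SSRs but does not state the detection criteria. As an illustration only, the following Python sketch finds perfect simple sequence repeats with a regular expression. The motif length range (2 to 6 bp), the minimum of five repeat units, and the function name `find_ssrs` are assumptions chosen for the example, not the authors' settings or tool.

```python
import re
from typing import List, Tuple


def find_ssrs(sequence: str, min_repeats: int = 5) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect SSRs in a sequence.

    Assumed criteria (not from the abstract): motifs of 2-6 bp repeated
    at least `min_repeats` times in tandem, homopolymer runs excluded.
    """
    # The lazy quantifier makes the regex try the shortest motif first,
    # so "AGAGAGAGAG" is reported as (AG)5 rather than (AGAG)2.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    results = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip runs of a single base such as "AAAAAAAAAA".
            continue
        repeats = len(match.group(0)) // len(motif)
        results.append((match.start(), motif, repeats))
    return results


ssrs = find_ssrs("TTAGAGAGAGAGAGCC")
```

Dedicated SSR tools typically apply different minimum repeat counts per motif length; this sketch uses a single threshold to stay short.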
Transcriptome sequencing for high throughput SNP development and genetic mapping in Pea
2014-01-01
Background Pea has a complex genome of 4.3 Gb for which only limited genomic resources are available to date. Although SNP markers are now highly valuable for research and modern breeding, only a few are described and used in pea for genetic diversity and linkage analysis. Results We developed a large resource by cDNA sequencing of 8 genotypes representative of modern breeding material using the Roche 454 technology, combining both long reads (400 bp) and high coverage (3.8 million reads, reaching a total of 1,369 megabases). Sequencing data were assembled and generated a 68 K unigene set, from which 41 K were annotated from their best blast hit against the model species Medicago truncatula. Annotated contigs showed an even distribution along M. truncatula pseudochromosomes, suggesting a good representation of the pea genome. 10 K pea contigs were found to be polymorphic among the genetic material surveyed, corresponding to 35 K SNPs. We validated a subset of 1538 SNPs through the GoldenGate assay, proving their ability to structure a diversity panel of breeding germplasm. Among them, 1340 were genetically mapped and used to build a new consensus map comprising a total of 2070 markers. Based on blast analysis, we could establish 1252 bridges between our pea consensus map and the pseudochromosomes of M. truncatula, which provides new insight on synteny between the two species. Conclusions Our approach created significant new resources in pea, i.e. the most comprehensive genetic map to date tightly linked to the model species M. truncatula and a large SNP resource for both academic research and breeding. PMID:24521263
Pardo, Ivanesa; Lillemoe, Heather A; Blosser, Rachel J; Choi, MiRan; Sauder, Candice A M; Doxey, Diane K; Mathieson, Theresa; Hancock, Bradley A; Baptiste, Dadrie; Atale, Rutuja; Hickenbotham, Matthew; Zhu, Jin; Glasscock, Jarret; Storniolo, Anna Maria V; Zheng, Faye; Doerge, R W; Liu, Yunlong; Badve, Sunil; Radovich, Milan; Clare, Susan E
2014-03-17
Our efforts to prevent and treat breast cancer are significantly impeded by a lack of knowledge of the biology and developmental genetics of the normal mammary gland. In order to provide the specimens that will facilitate such an understanding, The Susan G. Komen for the Cure Tissue Bank at the IU Simon Cancer Center (KTB) was established. The KTB is, to our knowledge, the only biorepository in the world prospectively established to collect normal, healthy breast tissue from volunteer donors. As a first initiative toward a molecular understanding of the biology and developmental genetics of the normal mammary gland, the effect of the menstrual cycle and hormonal contraceptives on DNA expression in the normal breast epithelium was examined. Using normal breast tissue from 20 premenopausal donors to KTB, the changes in the mRNA of the normal breast epithelium as a function of phase of the menstrual cycle and hormonal contraception were assayed using next-generation whole transcriptome sequencing (RNA-Seq). In total, 255 genes representing 1.4% of all genes were deemed to have statistically significant differential expression between the two phases of the menstrual cycle. The overwhelming majority (221; 87%) of the genes have higher expression during the luteal phase. These data provide important insights into the processes occurring during each phase of the menstrual cycle. There was only a single gene significantly differentially expressed when comparing the epithelium of women using hormonal contraception to those in the luteal phase. We have taken advantage of a unique research resource, the KTB, to complete the first-ever next-generation transcriptome sequencing of the epithelial compartment of 20 normal human breast specimens. This work has produced a comprehensive catalog of the differences in the expression of protein-coding genes as a function of the phase of the menstrual cycle. 
These data constitute the beginning of a reference data set of the normal mammary gland, which can be consulted for comparison with data developed from malignant specimens, or to mine the effects of the hormonal flux that occurs during the menstrual cycle.
PMID:24636070
Zhou, Fan; Wang, Guirong; An, Chunju
2014-01-01
Background The Asian corn borer (Ostrinia furnacalis (Guenée)) is one of the most serious corn pests in Asia. Control of this pest with the entomopathogenic fungus Beauveria bassiana has been proposed. However, the molecular mechanisms involved in the interactions between O. furnacalis and B. bassiana are unclear, especially because genomic information for O. furnacalis is currently unavailable. We therefore sequenced and characterized the transcriptome of O. furnacalis larvae infected by B. bassiana, with special emphasis on immunity-related genes. Methodology/Principal Findings Illumina Hiseq2000 was used to sequence 4.64 and 4.72 Gb of the transcriptome from water-injected and B. bassiana-injected O. furnacalis larvae, respectively. De novo assembly generated 62,382 unigenes with a mean length of 729 nt. All unigenes were searched against the Nt, Nr, Swiss-Prot, COG, and KEGG databases for annotations using the BLASTN or BLASTX algorithm with an E-value cut-off of 10−5. A total of 35,700 (57.2%) unigenes were annotated in at least one database. Pairwise comparisons resulted in 13,890 differentially expressed genes, with 5,843 up-regulated and 8,047 down-regulated. Based on sequence similarity to homologs known to participate in immune responses, we identified a total of 190 potential immunity-related unigenes. They encode 45 pattern recognition proteins, 33 modulation proteins involved in the prophenoloxidase activation cascade, 46 signal transduction molecules, and 66 immune responsive effectors, respectively. The obtained transcriptome contains putative orthologs for nearly all components of the Toll, Imd, and JAK/STAT pathways. We randomly selected 24 immunity-related unigenes and investigated their expression profiles using a quantitative RT-PCR assay. The results revealed varied expression patterns in response to infection by B. bassiana. 
Conclusions/Significance This study provides the comprehensive sequence resource and expression profiles of the immunity-related genes of O. furnacalis. The obtained data gives an insight into better understanding the molecular mechanisms of innate immune processes in O. furnacalis larvae against B. bassiana. PMID:24466095
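The annotation step described above (similarity searches with an E-value cut-off of 10−5, then counting unigenes annotated in at least one database) can be sketched in Python as follows. This is a minimal illustration that assumes BLAST tabular output (`-outfmt 6`, where the E-value is the 11th column). The function name, the example hit lines, and the choice of "less than or equal to" for the cut-off are assumptions, not details from the study.

```python
from typing import Iterable, Set


def annotated_queries(blast_lines: Iterable[str], max_evalue: float = 1e-5) -> Set[str]:
    """Return query IDs having at least one hit with E-value <= max_evalue.

    Expects tab-separated BLAST tabular lines (outfmt 6): column 1 is the
    query ID and column 11 is the E-value.
    """
    annotated = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            annotated.add(fields[0])
    return annotated


# Hypothetical hits for two unigenes; only unigene1 passes the cut-off.
example_hits = [
    "unigene1\tsp|P1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
    "unigene2\tsp|P2\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t25",
    "unigene1\tsp|P3\t60.0\t150\t60\t0\t1\t150\t1\t150\t1e-10\t120",
]
annotated = annotated_queries(example_hits)

# The percentage reported in the abstract follows from its own counts.
reported_percent = round(100 * 35700 / 62382, 1)
```

To combine several databases, the sets returned for each search can be merged with a set union before counting.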
Wu, Chen; Crowhurst, Ross N; Dennis, Alice B; Twort, Victoria G; Liu, Shanlin; Newcomb, Richard D; Ross, Howard A; Buckley, Thomas R
2016-01-01
Phasmatodea, more commonly known as stick insects, have been poorly studied at the molecular level for several key traits, such as components of the sensory system and regulators of reproduction and development, impeding a deeper understanding of their functional biology. Here, we employ de novo transcriptome analysis to identify genes with primary functions related to female odour reception, digestion, and male sexual traits in the New Zealand common stick insect Clitarchus hookeri (White). The female olfactory gene repertoire revealed ten odorant binding proteins with three recently duplicated, 12 chemosensory proteins, 16 odorant receptors, and 17 ionotropic receptors. The majority of these olfactory genes were over-expressed in female antennae and have the inferred function of odorant reception. Others that were predominantly expressed in male terminalia (n = 3) and female midgut (n = 1) suggest they have a role in sexual reproduction and digestion, respectively. Over-represented transcripts in the midgut were enriched with digestive enzyme gene families. Clitarchus hookeri is likely to harbour nine members of an endogenous cellulase family (glycoside hydrolase family 9), two of which appear to be specific to the C. hookeri lineage. All of these cellulase sequences fall into four main phasmid clades and show gene duplication events occurred early in the diversification of Phasmatodea. In addition, C. hookeri genome is likely to express γ-proteobacteria pectinase transcripts that have recently been shown to be the result of horizontal transfer. We also predicted 711 male terminalia-enriched transcripts that are candidate accessory gland proteins, 28 of which were annotated to have molecular functions of peptidase activity and peptidase inhibitor activity, two groups being widely reported to regulate female reproduction through proteolytic cascades. 
Our study has yielded new insights into the genetic basis of odour detection, nutrient digestion, and male sexual traits in stick insects. The C. hookeri reference transcriptome, together with identified gene families, provides a comprehensive resource for studying the evolution of sensory perception, digestive systems, and reproductive success in phasmids.
Hao, Xiaolong; Zhong, Yijun; Fu, Xueqing; Lv, Zongyou; Shen, Qian; Yan, Tingxiang; Shi, Pu; Ma, Yanan; Chen, Minghui; Lv, Xueying; Wu, Zhangkuanyu; Zhao, Jingya; Sun, Xiaofen; Li, Ling; Tang, Kexuan
2017-01-01
Artemisinin is a sesquiterpene lactone endoperoxide extracted from a traditional Chinese medicinal plant Artemisia annua. Artemisinin-based combination therapies (ACTs) are recommended as the best treatment of malaria by the World Health Organization (WHO). Both the phytohormone jasmonic acid (JA) and light promote artemisinin biosynthesis in A. annua. Interestingly, we found that the increase of artemisinin biosynthesis by JA was dependent on light. However, the relationship between the two signal pathways mediated by JA and light remains unclear. Here, we collected the A. annua seedlings of 24 h continuous light (Light), 24 h dark treatment (Dark), 4 h MeJA treatment under the continuous light conditions (Light-MeJA-4h) and 4 h MeJA treatment under the dark conditions (Dark-MeJA-4h) and performed the transcriptome sequencing using Illumina HiSeq 4000 System. A total of 266.7 million clean data were produced and assembled into 185,653 unigenes, with an average length of 537 bp. Among them, 59,490 unigenes were annotated and classified based on the public information. Differential expression analyses were performed between Light and Dark, Light and Light-MeJA-4h, Dark and Dark-MeJA-4h, Light-MeJA-4h, and Dark-MeJA-4h, respectively. Furthermore, transcription factor (TF) analysis revealed that 1588 TFs were identified and divided into 55 TF families, with 284 TFs down-regulated in the Dark relative to Light and 96 TFs up-regulated in the Light-MeJA-4h relative to Light. 8 TFs were selected as candidates for regulating the artemisinin biosynthesis and one of them was validated to be involved in artemisinin transcriptional regulation by Dual-Luciferase (Dual-LUC) assay. The transcriptome data shown in our study offered a comprehensive transcriptional expression pattern influenced by the MeJA and light in A. annua seedling, which will serve as a valuable resource for further studies on transcriptional regulation mechanisms underlying artemisinin biosynthesis. 
PMID:28642777
Lan, Daoliang; Xiong, Xianrong; Huang, Cai; Mipam, Tserang Donko; Li, Jian
2016-01-01
Yaks (Bos grunniens) are an endemic species that can adapt well to thin air, cold temperatures, and high altitude. They can survive in harsh plateau environments and are a major source of animal production for local residents, being an important breed in the Qinghai-Tibet Plateau. However, compared with ordinary cattle that live in the plains, yaks generally have lower fertility. Investigating the basic physiological molecular features of the yak ovary and identifying the biological events underlying the differences between the ovaries of yak and plain cattle is necessary to understand the specificity of yak reproduction. Therefore, RNA-seq technology was applied to comparatively analyze transcriptome data from yak and plain cattle estrous ovaries. After deep sequencing, 3,653,032 clean reads with a total of 4,828,772,880 base pairs were obtained from the yak ovary library. Alignment analysis showed that 16,992 yak genes mapped to the yak genome, among which 12,731 and 14,631 genes were assigned to Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, respectively. Furthermore, comparison of yak and cattle ovary transcriptome data revealed that 1,307 genes were significantly differentially expressed between the two libraries, of which 661 genes were upregulated and 646 genes were downregulated in yak ovary. Functional analysis showed that the differentially expressed genes were involved in various GO categories and KEGG pathways. GO annotations indicated that the genes related to "cell adhesion," "hormonal" biological processes, and "calcium ion binding," "cation transmembrane transport" molecular events were significantly active. KEGG pathway analysis showed that the "complement and coagulation cascade" pathway was the most enriched in yak ovary transcriptome data, followed by the "cytochrome P450" related and "ECM-receptor interaction" pathways. 
Moreover, several novel pathways, such as "circadian rhythm," were significantly enriched despite having no evident associations with the reproductive function. Our findings provide a molecular resource for further investigation of the general molecular mechanism of yak ovary and offer new insights to understand comprehensively the specificity of yak reproduction.
2011-01-01
Background Reptiles are largely under-represented in comparative genomics despite the fact that they are substantially more diverse in many respects than mammals. Given the high divergence of reptiles from classical model species, next-generation sequencing of their transcriptomes is an approach of choice for gene identification and annotation. Results Here, we use 454 technology to sequence the brain transcriptome of four divergent reptilian and one reference avian species: the Nile crocodile, the corn snake, the bearded dragon, the red-eared turtle, and the chicken. Using an in-house pipeline for recursive similarity searches of >3,000,000 reads against multiple databases from 7 reference vertebrates, we compile a reptilian comparative transcriptomics dataset, with homology assignment for 20,000 to 31,000 transcripts per species and a cumulated non-redundant sequence length of 248.6 Mbases. Our approach identifies the majority (87%) of chicken brain transcripts and about 50% of de novo assembled reptilian transcripts. In addition to 57,502 microsatellite loci, we identify thousands of SNP and indel polymorphisms for population genetic and linkage analyses. We also build very large multiple alignments for Sauropsida and mammals (two million residues per species) and perform extensive phylogenetic analyses suggesting that turtles are not basal living reptiles but are rather associated with Archosaurians, hence, potentially answering a long-standing question in the phylogeny of Amniotes. Conclusions The reptilian transcriptome (freely available at http://www.reptilian-transcriptomes.org) should prove a useful new resource as reptiles are becoming important new models for comparative genomics, ecology, and evolutionary developmental genetics. PMID:21943375
Wall, Christopher E; Cozza, Steven; Riquelme, Cecilia A; McCombie, W Richard; Heimiller, Joseph K; Marr, Thomas G; Leinwand, Leslie A
2011-01-01
The infrequently feeding Burmese python (Python molurus) experiences significant and rapid postprandial cardiac hypertrophy followed by regression as digestion is completed. To begin to explore the molecular mechanisms of this response, we have sequenced and assembled the fasted and postfed Burmese python heart transcriptomes with Illumina technology using the chicken (Gallus gallus) genome as a reference. In addition, we have used RNA-seq analysis to identify differences in the expression of biological processes and signaling pathways between fasted, 1 day postfed (DPF), and 3 DPF hearts. Out of a combined transcriptome of ∼2,800 mRNAs, 464 genes were differentially expressed. Genes showing differential expression at 1 DPF compared with fasted were enriched for biological processes involved in metabolism and energetics, while genes showing differential expression at 3 DPF compared with fasted were enriched for processes involved in biogenesis, structural remodeling, and organization. Moreover, we present evidence for the activation of physiological and not pathological signaling pathways in this rapid, novel model of cardiac growth in pythons. Together, our data provide the first comprehensive gene expression profile for a reptile heart.
Lo, Kinyui Alice; Huang, Shiqi; Walet, Arcinas Camille Esther; Zhang, Zhi-Chun; Leow, Melvin Khee-Shing; Liu, Meihui; Sun, Lei
2018-06-01
Obesity induces profound transcriptome changes in adipocytes, and recent evidence suggests that long-noncoding RNAs (lncRNAs) play key roles in this process. We performed a comprehensive transcriptome study by RNA sequencing in adipocytes isolated from interscapular brown, inguinal, and epididymal white adipose tissue in diet-induced obese mice. The analysis revealed a set of obesity-dysregulated lncRNAs, many of which exhibit dynamic changes in the fed versus fasted state, potentially serving as novel molecular markers of adipose energy status. Among the most prominent lncRNAs is Lnc-leptin , which is transcribed from an enhancer region upstream of leptin ( Lep ). Expression of Lnc-leptin is sensitive to insulin and closely correlates to Lep expression across diverse pathophysiological conditions. Functionally, induction of Lnc-leptin is essential for adipogenesis, and its presence is required for the maintenance of Lep expression in vitro and in vivo. Direct interaction was detected between DNA loci of Lnc-leptin and Lep in mature adipocytes, which diminished upon Lnc-leptin knockdown. Our study establishes Lnc-leptin as a new regulator of Lep . © 2018 by the American Diabetes Association.
Hsiang, Chien-Yun; Chen, Yueh-Sheng; Ho, Tin-Yun
2009-06-01
Establishment of a comprehensive platform for the assessment of host-biomaterial interaction in vivo is an important issue. Nuclear factor-kappaB (NF-kappaB) is an inducible transcription factor that is activated by numerous stimuli. Therefore, NF-kappaB-dependent luminescent signal in transgenic mice carrying the luciferase genes was used as the guide to monitor the biomaterials-affected organs, and transcriptomic analysis was further applied to evaluate the complex host responses in affected organs in this study. In vivo imaging showed that genipin-cross-linked gelatin conduit (GGC) implantation evoked the strong NF-kappaB activity at 6h in the implanted region, and transcriptomic analysis showed that the expressions of interleukin-6 (IL-6), IL-24, and IL-1 family were up-regulated. A strong luminescent signal was observed in spleen on 14 d, suggesting that GGC implantation might elicit the biological events in spleen. Transcriptomic analysis of spleen showed that 13 Kyoto Encyclopedia of Genes and Genomes pathways belonging to cell cycles, immune responses, and metabolism were significantly altered by GGC implants. Connectivity Map analysis suggested that the gene signatures of GGC were similar to those of compounds that affect lipid or glucose metabolism. GeneSetTest analysis further showed that host responses to GGC implants might be related to diseases states, especially the metabolic and cardiovascular diseases. In conclusion, our data provided a concept of molecular imaging-guided transcriptomic platform for the evaluation and the prediction of host-biomaterial interaction in vivo.
USDA-ARS's Scientific Manuscript database
The Chinese chinquapin (Castanea henryi) nut provides a rich source of starch and nutrient elements as food and feed, but its yield is restricted by a low ratio of female to male flowers (1/2000-1/3000). Little is known about the developmental programs underlying the sex differentiation of the flowe...
Chery, Joyce G; Sass, Chodon; Specht, Chelsea D
2017-09-01
We developed a bioinformatic pipeline that leverages a publicly available genome and published transcriptomes to design primers in conserved coding sequences flanking targeted introns of single-copy nuclear loci. Paullinieae (Sapindaceae) is used to demonstrate the pipeline. Transcriptome reads phylogenetically closer to the lineage of interest are aligned to the closest genome. Single-nucleotide polymorphisms are called, generating a "pseudoreference" closer to the lineage of interest. Several filters are applied to meet the criteria of single-copy nuclear loci with introns of a desired size. Primers are designed in conserved coding sequences flanking introns. Using this pipeline, we developed nine single-copy nuclear intron markers for Paullinieae. This pipeline is highly flexible and can be used for any group with available genomic and transcriptomic resources. This pipeline led to the development of nine variable markers for phylogenetic study without generating sequence data de novo.
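One step of the pipeline above, applying called single-nucleotide polymorphisms to the closest genome to obtain a "pseudoreference", can be illustrated with a short Python sketch. The function name, the 0-based coordinate convention, and the toy sequence are assumptions made for the example. The published pipeline may implement this step with dedicated variant-handling tools.

```python
from typing import Dict


def build_pseudoreference(reference: str, snps: Dict[int, str]) -> str:
    """Substitute SNP alleles into a reference sequence.

    `snps` maps a 0-based position (an assumption of this sketch) to the
    single-base alternate allele called from the transcriptome reads.
    """
    bases = list(reference)
    for position, allele in snps.items():
        if not 0 <= position < len(bases):
            raise ValueError(f"SNP position {position} is outside the reference")
        if len(allele) != 1:
            raise ValueError("only single-nucleotide substitutions are supported")
        bases[position] = allele
    return "".join(bases)


# Toy example: two substitutions move the reference toward the target lineage.
pseudoreference = build_pseudoreference("ACGTACGT", {1: "T", 6: "A"})
```

Because only substitutions are applied, the pseudoreference keeps the coordinate system of the original genome, so intron positions annotated on the genome remain valid.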
NGS Catalog: A Database of Next Generation Sequencing Studies in Humans
Xia, Junfeng; Wang, Qingguo; Jia, Peilin; Wang, Bing; Pao, William; Zhao, Zhongming
2015-01-01
Next generation sequencing (NGS) technologies have been rapidly applied in biomedical and biological research since its advent only a few years ago, and they are expected to advance at an unprecedented pace in the following years. To provide the research community with a comprehensive NGS resource, we have developed the database Next Generation Sequencing Catalog (NGS Catalog, http://bioinfo.mc.vanderbilt.edu/NGS/index.html), a continually updated database that collects, curates and manages available human NGS data obtained from published literature. NGS Catalog deposits publication information of NGS studies and their mutation characteristics (SNVs, small insertions/deletions, copy number variations, and structural variants), as well as mutated genes and gene fusions detected by NGS. Other functions include user data upload, NGS general analysis pipelines, and NGS software. NGS Catalog is particularly useful for investigators who are new to NGS but would like to take advantage of these powerful technologies for their own research. Finally, based on the data deposited in NGS Catalog, we summarized features and findings from whole exome sequencing, whole genome sequencing, and transcriptome sequencing studies for human diseases or traits. PMID:22517761
The non-coding RNA landscape of human hematopoiesis and leukemia.
Schwarzer, Adrian; Emmrich, Stephan; Schmidt, Franziska; Beck, Dominik; Ng, Michelle; Reimer, Christina; Adams, Felix Ferdinand; Grasedieck, Sarah; Witte, Damian; Käbler, Sebastian; Wong, Jason W H; Shah, Anushi; Huang, Yizhou; Jammal, Razan; Maroz, Aliaksandra; Jongen-Lavrencic, Mojca; Schambach, Axel; Kuchenbauer, Florian; Pimanda, John E; Reinhardt, Dirk; Heckl, Dirk; Klusmann, Jan-Henning
2017-08-09
Non-coding RNAs have emerged as crucial regulators of gene expression and cell fate decisions. However, their expression patterns and regulatory functions during normal and malignant human hematopoiesis are incompletely understood. Here we present a comprehensive resource defining the non-coding RNA landscape of the human hematopoietic system. Based on highly specific non-coding RNA expression portraits per blood cell population, we identify unique fingerprint non-coding RNAs, such as LINC00173 in granulocytes, and assign these to critical regulatory circuits involved in blood homeostasis. Following the incorporation of acute myeloid leukemia samples into the landscape, we further uncover prognostically relevant non-coding RNA stem cell signatures shared between acute myeloid leukemia blasts and healthy hematopoietic stem cells. Our findings highlight the importance of the non-coding transcriptome in the formation and maintenance of the human blood hierarchy. While micro-RNAs are known regulators of haematopoiesis and leukemogenesis, the role of long non-coding RNAs is less clear. Here the authors provide a non-coding RNA expression landscape of the human hematopoietic system, highlighting their role in the formation and maintenance of the human blood hierarchy.
Cell wall evolution and diversity
Fangel, Jonatan U.; Ulvskov, Peter; Knox, J. P.; Mikkelsen, Maria D.; Harholt, Jesper; Popper, Zoë A.; Willats, William G.T.
2012-01-01
Plant cell walls display a considerable degree of diversity in their compositions and molecular architectures. In some cases the functional significance of a particular cell wall type appears to be easy to discern: secondary cell walls are often reinforced with lignin that provides durability; the thin cell walls of pollen tubes have particular compositions that enable their tip growth; lupin seed cell walls are characteristically thickened with galactan used as a storage polysaccharide. However, more frequently the evolutionary mechanisms and selection pressures that underpin cell wall diversity and evolution are unclear. For diverse green plants (chlorophytes and streptophytes) the rapidly increasing availability of transcriptome and genome data sets, the development of methods for cell wall analyses which require less material for analysis, and expansion of molecular probe sets, are providing new insights into the diversity and occurrence of cell wall polysaccharides and associated biosynthetic genes. Such research is important for refining our understanding of some of the fundamental processes that enabled plants to colonize land and to subsequently radiate so comprehensively. The study of cell wall structural diversity is also an important aspect of the industrial utilization of global polysaccharide bio-resources. PMID:22783271
Cui, Kai; Wang, Haiying; Liao, Shengxi; Tang, Qi; Li, Li; Cui, Yongzhong; He, Yuan
2016-01-01
Dendrocalamus sinicus is the world’s largest bamboo species, with strong woody culms, and is known for its fast-growing culms. As an economic bamboo species, it was popularized for multi-functional applications including furniture, construction, and industrial paper pulp. To comprehensively elucidate the molecular processes involved in its culm elongation, Illumina paired-end sequencing was conducted. About 65.08 million high-quality reads were produced and assembled into 81,744 unigenes with an average length of 723 bp. A total of 64,338 (79%) unigenes were annotated for their functions, of which 56,587 were annotated in the NCBI non-redundant protein database and 35,262 were annotated in the Swiss-Prot database. Also, 42,508 and 21,009 annotated unigenes were allocated to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. By searching against the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG), 33,920 unigenes were assigned to 128 KEGG pathways. Meanwhile, 8,553 simple sequence repeats (SSRs) and 81,534 single-nucleotide polymorphisms (SNPs) were identified. Additionally, 388 transcripts involved in lignin biosynthesis were detected, among which 27 transcripts encoding shikimate O-hydroxycinnamoyltransferase (HCT) were specifically expressed in D. sinicus when compared to other bamboo species and rice. The phylogenetic relationship between D. sinicus and other plants was analyzed, suggesting functional diversity of HCT unigenes in D. sinicus. We conjectured that HCT might lead to the high lignin content and giant culm. Given that the leaves are not yet formed and the culm is covered with sheaths during culm elongation, the existence of photosynthesis in the bamboo culm is usually neglected. Surprisingly, 109 transcripts involved in photosynthesis were identified, including photosystem I and II, cytochrome b6/f complex, photosynthetic electron transport and F-type ATPase, and 24 transcripts were characterized as antenna proteins, which are regarded as the main light-capturing apparatus of plants, implying that stem photosynthesis plays a key role during culm elongation because leaves are not yet available. The expression levels of 6 unigenes were measured by real-time quantitative PCR. The results showed that the expression levels of all genes accorded with the transcriptome data, which confirms the reliability of the transcriptome data. To our knowledge, this is the first study to characterize the D. sinicus transcriptome, which will deepen the understanding of the molecular mechanisms of culm development. The results may help variety improvement and resource utilization of bamboos. PMID:27304219
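The annotation figures quoted in the D. sinicus abstract above can be cross-checked with simple arithmetic. The sketch below is only an illustration: the helper name `percent_of_total` and the dictionary layout are assumptions, and the counts are copied from the abstract rather than recomputed from any data.

```python
# Minimal sketch: express each annotated-unigene count from the abstract
# as a percentage of the 81,744 assembled unigenes.
# The helper name and data layout are hypothetical, for illustration only.

TOTAL_UNIGENES = 81_744

ANNOTATED_COUNTS = {
    "any database": 64_338,
    "NCBI nr": 56_587,
    "Swiss-Prot": 35_262,
    "GO": 42_508,
    "COG": 21_009,
    "KEGG": 33_920,
}


def percent_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


if __name__ == "__main__":
    for database, count in ANNOTATED_COUNTS.items():
        print(f"{database}: {percent_of_total(count, TOTAL_UNIGENES)}%")
```

The overall figure works out to 78.7%, which is consistent with the 79% reported in the abstract after rounding.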
Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie; Zhang, Gong
2018-01-04
Translation is a key regulatory step, linking transcriptome and proteome. Two major methods of translatome investigation are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database, TranslatomeDB (http://www.translatomedb.net/), which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes the analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of the same species and type, on both the transcriptome and translatome levels. The translation indices (translation ratio, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity. All datasets were analyzed using a unified, robust, accurate and experimentally verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and utilize the identical unified pipeline to analyze their data. We believe that TranslatomeDB is a comprehensive platform and knowledgebase for translatome and proteome research, freeing biologists from the complex searching, analysis and comparison of huge sequencing datasets without needing local computational power. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
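To make the idea of a translation index concrete, the sketch below computes a per-gene ratio of translatome abundance to mRNA abundance. This is an assumed, commonly used form of such a ratio and not necessarily the formula TranslatomeDB applies; the abstract does not give the definitions. The function name, gene names and abundance values are all hypothetical.

```python
# Minimal sketch, assuming a translation index defined as
# (abundance in the translatome) / (abundance in total mRNA) per gene.
# This is NOT taken from TranslatomeDB; names and numbers are hypothetical.

from typing import Dict


def translation_ratio(
    translatome_abundance: Dict[str, float],
    mrna_abundance: Dict[str, float],
) -> Dict[str, float]:
    """Return translatome/mRNA abundance ratios for genes present in both inputs.

    Genes with zero mRNA abundance are skipped to avoid division by zero.
    """
    ratios = {}
    for gene, mrna in mrna_abundance.items():
        if mrna > 0 and gene in translatome_abundance:
            ratios[gene] = translatome_abundance[gene] / mrna
    return ratios


if __name__ == "__main__":
    translatome = {"geneA": 20.0, "geneB": 5.0}
    mrna = {"geneA": 10.0, "geneB": 0.0, "geneC": 4.0}
    # geneB is skipped (zero mRNA); geneC is skipped (no translatome value).
    print(translation_ratio(translatome, mrna))
```

Skipping genes with zero mRNA abundance is one possible design choice; a real pipeline might instead apply a minimum-expression filter or a pseudocount.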
Meyer, B; Martini, P; Biscontin, A; De Pittà, C; Romualdi, C; Teschke, M; Frickenhaus, S; Harms, L; Freier, U; Jarman, S; Kawaguchi, S
2015-11-01
The Antarctic krill, Euphausia superba, has a key position in the Southern Ocean food web by serving as a direct link between primary producers and apex predators. The south-west Atlantic sector of the Southern Ocean, where the majority of the krill population is located, is experiencing one of the most profound environmental changes worldwide. Up to now, we have only cursory information about krill's genomic plasticity to cope with the ongoing environmental changes induced by anthropogenic CO2 emissions. The genome of krill is not yet available due to its large size (about 48 Gbp). Here, we present two normalized cDNA libraries from whole krill and krill heads sampled in different seasons that were combined with two already published data sets from krill transcriptome projects to produce the first knowledgebase krill 'master' transcriptome. The new library produced 25% more E. superba transcripts and now includes nearly all the enzymes involved in the primary oxidative metabolism (glycolysis, Krebs cycle and oxidative phosphorylation) as well as all genes involved in glycogenesis, glycogen breakdown, gluconeogenesis, fatty acid synthesis and fatty acid β-oxidation. With these features, the 'master' transcriptome provides the most complete picture of metabolic pathways in Antarctic krill and will provide a major resource for future physiological and molecular studies. This will be particularly valuable for characterizing the molecular networks that respond to stressors caused by anthropogenic CO2 emissions and krill's capacity to cope with the ongoing environmental changes in the Atlantic sector of the Southern Ocean. © 2015 The Authors. Molecular Ecology Resources published by John Wiley & Sons Ltd.
The De Novo Transcriptome and Its Functional Annotation in the Seed Beetle Callosobruchus maculatus.
Sayadi, Ahmed; Immonen, Elina; Bayram, Helen; Arnqvist, Göran
2016-01-01
Despite their unparalleled biodiversity, the genomic resources available for beetles (Coleoptera) remain relatively scarce. We present an integrative and high quality annotated transcriptome of the beetle Callosobruchus maculatus, an important and cosmopolitan agricultural pest as well as an emerging model species in ecology and evolutionary biology. Using Illumina sequencing technology, we sequenced 492 million read pairs generated from 51 samples of different developmental stages (larvae, pupae and adults) of C. maculatus. Reads were de novo assembled using the Trinity software, into a single combined assembly as well as into three separate assemblies based on data from the different developmental stages. The combined assembly generated 218,192 transcripts and 145,883 putative genes. Putative genes were annotated with the Blast2GO software and the Trinotate pipeline. In total, 33,216 putative genes were successfully annotated using Blastx against the Nr (non-redundant) database and 13,382 were assigned to 34,100 Gene Ontology (GO) terms. We classified 5,475 putative genes into Clusters of Orthologous Groups (COG), and 116 metabolic pathway maps were predicted based on the annotation. Our analyses suggested that transcriptional specificity increases with ontogeny. For example, out of 33,216 annotated putative genes, 51 were only expressed in larvae, 63 only in pupae and 171 only in adults. Our study illustrates the importance of including samples from several developmental stages when the aim is to provide an integrative and high quality annotated transcriptome. Our results will represent an invaluable resource for those working with the ecology, evolution and pest control of C. maculatus, as well as for comparative studies of the transcriptomics and genomics of beetles more generally. PMID:27442123
Jo, Jihoon; Park, Jongsun; Lee, Hyun-Gwan; Kern, Elizabeth M A; Cheon, Seongmin; Jin, Soyeong; Park, Joong-Ki; Cho, Sung-Jin; Park, Chungoo
2016-08-01
The sea cucumber Apostichopus japonicus Selenka 1867 represents an important resource in biomedical research, traditional medicine, and the seafood industry. Much of the commercial value of A. japonicus is determined by dorsal/ventral color variation (red, green, and black), yet the taxonomic relationships between these color variants are not clearly understood. We performed the first comparative analysis of de novo assembled transcriptome data from three color variants of A. japonicus. Using the Illumina platform, we sequenced 177,596,774 clean reads representing a total of 18.2 Gbp of sea cucumber transcriptome. A comparison of over 0.3 million transcript scaffolds against the UniProt/Swiss-Prot database yielded 8513, 8602, and 8588 positive matches for green, red, and black body color transcriptomes, respectively. Using the Panther gene classification system, we assessed an extensive and diverse set of expressed genes in three color variants and found that (1) among the three color variants of A. japonicus, genes associated with RNA binding protein, oxidoreductase, nucleic acid binding, transferase, and KRAB box transcription factor were most commonly expressed; and (2) the main protein functional classes are differently regulated in all three color variants (extracellular matrix protein and phosphatase for green color, transporter and potassium channel for red color, and G-protein modulator and enzyme modulator for black color). This work will assist in the discovery and annotation of novel genes that play significant morphological and physiological roles in color variants of A. japonicus, and these sequence data will provide a useful set of resources for the rapidly growing sea cucumber aquaculture industry. Copyright © 2016 Elsevier B.V. All rights reserved.
Kang, Se Won; Patnaik, Bharat Bhusan; Hwang, Hee-Ju; Park, So Young; Chung, Jong Min; Song, Dae Kwon; Patnaik, Hongray Howrelia; Lee, Jae Bong; Kim, Changmu; Kim, Soonok; Park, Hong Seog; Park, Seung-Hwan; Park, Young-Su; Han, Yeon Soo; Lee, Jun Sang; Lee, Yong Seok
2017-03-01
Satsuma myomphala is critically endangered through loss of natural habitats, predation by natural enemies, and indiscriminate collection. It is a protected species in Korea but lacks genomic resources for an understanding of varied functional processes attributable to evolutionary success under natural habitats. To assess the genetic information of S. myomphala, we performed, for the first time, de novo transcriptome sequencing and functional annotation of expressed sequences using the Illumina Next-Generation Sequencing (NGS) platform and bioinformatics analysis. We identified 103,774 unigenes, of which 37,959, 12,890, and 17,699 were annotated in the PANM (Protostome DB), Unigene, and COG (Clusters of Orthologous Groups) databases, respectively. In addition, 14,451 unigenes were predicted under Gene Ontology functional categories, with 4581 assigned to a single category. Furthermore, 3369 sequences, of which 646 had Enzyme Commission (EC) numbers, were mapped to 122 pathways in the Kyoto Encyclopedia of Genes and Genomes Pathway database. The prominent protein domains included the Zinc finger (C2H2-like), Reverse Transcriptase, Thioredoxin-like fold, and RNA recognition motif domain. Many unigenes with homology to immunity, defense, and reproduction-related genes were screened in the transcriptome. We also detected 3120 putative simple sequence repeats (SSRs) encompassing dinucleotide to hexanucleotide repeat motifs from >1kb unigene sequences. A list of PCR primers for SSR loci has been identified to study the genetic polymorphisms. The transcriptome data represent a valuable resource for further investigations on the species' genome structure and biology. The unigene information and microsatellites would provide an indispensable tool for conservation of the species in natural and adaptive environments. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Transcriptomic Profiling of Fruit Development in Black Raspberry Rubus coreanus
Hu, Yaodong
2018-01-01
The wild Rubus species R. coreanus, which is widely distributed in southwest China, shows great promise as a genetic resource for breeding. One of its outstanding properties is adaptation to high temperature and humidity. To facilitate its use in selection and breeding programs, we assembled de novo 179,738,287 R. coreanus reads (125 bp in length) generated by RNA sequencing from fruits at three representative developmental stages. We also used the recently released draft genome of R. occidentalis to perform reference-guided assembly. We inferred a final 95,845-transcript reference for R. coreanus. Of these genetic resources, 66,597 (69.5%) were annotated. Based on these results, we carried out a comprehensive analysis of differentially expressed genes. Flavonoid biosynthesis, phenylpropanoid biosynthesis, plant hormone signal transduction, and cutin, suberin, and wax biosynthesis pathways were significantly enriched throughout the ripening process. We identified 23 transcripts involved in the flavonoid biosynthesis pathway whose expression perfectly paralleled changes in the metabolites. Additionally, we identified 119 nucleotide-binding site leucine-rich repeat (NBS-LRR) protein-coding genes, involved in pathogen resistance, of which 74 were in the completely conserved domain. These results provide, for the first time, genome-wide genetic information for understanding developmental regulation of R. coreanus fruits. They have the potential for use in breeding through functional genetic approaches in the near future. PMID:29805970
SZGR 2.0: a one-stop shop of schizophrenia candidate genes.
Jia, Peilin; Han, Guangchun; Zhao, Junfei; Lu, Pinyi; Zhao, Zhongming
2017-01-04
SZGR 2.0 is a comprehensive resource of candidate variants and genes for schizophrenia, covering genetic, epigenetic, transcriptomic, translational and many other types of evidence. By systematic review and curation of multiple lines of evidence, we included almost all variants and genes that have ever been reported to be associated with schizophrenia. In particular, we collected ∼4200 common variants reported in genome-wide association studies, ∼1000 de novo mutations discovered by large-scale sequencing of family samples, 215 genes spanning rare and replication copy number variations, 99 genes overlapping with linkage regions, 240 differentially expressed genes, 4651 differentially methylated genes and 49 genes as antipsychotic drug targets. To facilitate interpretation, we included various functional annotation data, especially brain eQTL, methylation QTL, brain expression featured in deep categorization of brain areas and developmental stages and brain-specific promoter and enhancer annotations. Furthermore, we conducted cross-study, cross-data type and integrative analyses of the multidimensional data deposited in SZGR 2.0, and made the data and results available through a user-friendly interface. In summary, SZGR 2.0 provides a one-stop shop of schizophrenia variants and genes and their function and regulation, providing an important resource in the schizophrenia and other mental disease community. SZGR 2.0 is available at https://bioinfo.uth.edu/SZGR/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Transcriptomic analysis of flower development in tea (Camellia sinensis (L.)).
Liu, Feng; Wang, Yu; Ding, Zhaotang; Zhao, Lei; Xiao, Jun; Wang, Linjun; Ding, Shibo
2017-10-05
Flowering is a critical and complicated process in plant development, involving interactions of numerous endogenous and environmental factors, but little is known about the complex network regulating flower development in tea plants. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptomic analysis assembles gene-related information involved in reproductive growth of C. sinensis. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicated that metabolic pathways, biosynthesis of secondary metabolites, and plant hormone signal transduction were enriched among the DEGs. Furthermore, 207 flowering-associated unigenes were identified from our database. Some transcription factors, such as WRKY, ERF, bHLH, MYB and MADS-box were shown to be up-regulated in floral transition, which might play the role of progression of flowering. Furthermore, 14 genes were selected for confirmation of expression levels using quantitative real-time PCR (qRT-PCR). The comprehensive transcriptomic analysis presents fundamental information on the genes and pathways which are involved in flower development in C. sinensis. Our data also provided a useful database for further research of tea and other species of plants. Copyright © 2017 Elsevier B.V. All rights reserved.
Plouhinec, Jean-Louis; Medina-Ruiz, Sofía; Borday, Caroline; Bernard, Elsa; Vert, Jean-Philippe; Eisen, Michael B; Harland, Richard M; Monsoro-Burq, Anne H
2017-10-01
During vertebrate neurulation, the embryonic ectoderm is patterned into lineage progenitors for neural plate, neural crest, placodes and epidermis. Here, we use Xenopus laevis embryos to analyze the spatial and temporal transcriptome of distinct ectodermal domains in the course of neurulation, during the establishment of cell lineages. In order to define the transcriptome of small groups of cells from a single germ layer and to retain spatial information, dorsal and ventral ectoderm was subdivided along the anterior-posterior and medial-lateral axes by microdissections. Principal component analysis on the transcriptomes of these ectoderm fragments primarily identifies embryonic axes and temporal dynamics. This provides a genetic code to define positional information of any ectoderm sample along the anterior-posterior and dorsal-ventral axes directly from its transcriptome. In parallel, we use nonnegative matrix factorization to predict enhanced gene expression maps onto early and mid-neurula embryos, and specific signatures for each ectoderm area. The clustering of spatial and temporal datasets allowed detection of multiple biologically relevant groups (e.g., Wnt signaling, neural crest development, sensory placode specification, ciliogenesis, germ layer specification). We provide an interactive network interface, EctoMap, for exploring synexpression relationships among genes expressed in the neurula, and suggest several strategies to use this comprehensive dataset to address questions in developmental biology as well as stem cell or cancer research.
De Novo Transcriptome Analysis of Allium cepa L. (Onion) Bulb to Identify Allergens and Epitopes.
Rajkumar, Hemalatha; Ramagoni, Ramesh Kumar; Anchoju, Vijayendra Chary; Vankudavath, Raju Naik; Syed, Arshi Uz Zaman
2015-01-01
Allium cepa (onion) is a diploid plant with one of the largest nuclear genomes among all diploids. Onion is an example of an under-researched crop which has a complex heterozygous genome. There are no allergenic proteins and genomic data available for onions. This study was conducted to establish a transcriptome catalogue of onion bulb that will enable us to study onion-related genes involved in medicinal use and allergies. The transcriptome dataset generated from onion bulb using the Illumina HiSeq 2000 technology showed a total of 99,074,309 high quality raw reads (~20 Gb). Based on sequence homology, onion genes were categorized into 49 different functional groups. Most of the genes, however, were classified under 'unknown' in all three gene ontology categories. Of the categorized genes, 61.2% showed metabolic functions followed by cellular components such as binding, cellular processes; catalytic activity and cell part. With BLASTx top hit analysis, a total of 2,511 homologous allergenic sequences were found, which had 37-100% similarity with 46 different types of allergens existing in the database. From the 46 contigs or allergens, 521 B-cell linear epitopes were identified using the BepiPred linear epitope prediction tool. This is the first comprehensive insight into the transcriptome of onion bulb tissue using NGS technology, which can be used to map IgE epitopes and to predict the structures and functions of various proteins.
Habuka, Masato; Fagerberg, Linn; Hallström, Björn M.; Pontén, Fredrik; Yamamoto, Tadashi; Uhlen, Mathias
2015-01-01
To understand functions and diseases of urinary bladder, it is important to define its molecular constituents and their roles in urinary bladder biology. Here, we performed genome-wide deep RNA sequencing analysis of human urinary bladder samples and identified genes up-regulated in the urinary bladder by comparing the transcriptome data to those of all other major human tissue types. 90 protein-coding genes were elevated in the urinary bladder, either with enhanced expression uniquely in the urinary bladder or elevated expression together with at least one other tissue (group enriched). We further examined the localization of these proteins by immunohistochemistry and tissue microarrays and 20 of these 90 proteins were localized to the whole urothelium with a majority not yet described in the context of the urinary bladder. Four additional proteins were found specifically in the umbrella cells (Uroplakin 1a, 2, 3a, and 3b), and three in the intermediate/basal cells (KRT17, PCP4L1 and ATP1A4). 61 of the 90 elevated genes have not been previously described in the context of urinary bladder and the corresponding proteins are interesting targets for more in-depth studies. In summary, an integrated omics approach using transcriptomics and antibody-based profiling has been used to define a comprehensive list of proteins elevated in the urinary bladder. PMID:26694548
Magistri, Marco; Velmeshev, Dmitry; Makhmutova, Madina; Faghihi, Mohammad Ali
2015-01-01
The underlying genetic variations of late-onset Alzheimer’s disease (LOAD) cases remain largely unknown. A combination of genetic variations with variable penetrance and lifetime epigenetic factors may converge on transcriptomic alterations that drive the LOAD pathological process. Transcriptome profiling using deep sequencing technology offers insight into common altered pathways regardless of underpinning genetic or epigenetic factors and thus represents an ideal tool to investigate molecular mechanisms related to the pathophysiology of LOAD. We performed directional RNA sequencing on high quality RNA samples extracted from hippocampi of LOAD and age-matched controls. We further validated our data using qRT-PCR on a larger set of postmortem brain tissues, confirming downregulation of the gene encoding substance P (TAC1) and upregulation of the gene encoding the plasminogen activator inhibitor-1 (SERPINE1). Pathway analysis indicates dysregulation in neural communication, cerebral vasculature, and amyloid-β clearance. Besides protein-coding genes, we identified several annotated and non-annotated long noncoding RNAs that are differentially expressed in LOAD brain tissues, three of which are activity-dependent regulated and one of which is induced by Aβ1-42 exposure of human neural cells. Our data provide a comprehensive list of transcriptomic alterations in LOAD hippocampi and warrant a holistic approach including both coding and non-coding RNAs in functional studies aimed to understand the pathophysiology of LOAD. PMID:26402107
Borday, Caroline; Bernard, Elsa; Vert, Jean-Philippe; Eisen, Michael B.; Harland, Richard M.
2017-01-01
During vertebrate neurulation, the embryonic ectoderm is patterned into lineage progenitors for neural plate, neural crest, placodes and epidermis. Here, we use Xenopus laevis embryos to analyze the spatial and temporal transcriptome of distinct ectodermal domains in the course of neurulation, during the establishment of cell lineages. In order to define the transcriptome of small groups of cells from a single germ layer and to retain spatial information, dorsal and ventral ectoderm was subdivided along the anterior-posterior and medial-lateral axes by microdissections. Principal component analysis on the transcriptomes of these ectoderm fragments primarily identifies embryonic axes and temporal dynamics. This provides a genetic code to define positional information of any ectoderm sample along the anterior-posterior and dorsal-ventral axes directly from its transcriptome. In parallel, we use nonnegative matrix factorization to predict enhanced gene expression maps onto early and mid-neurula embryos, and specific signatures for each ectoderm area. The clustering of spatial and temporal datasets allowed detection of multiple biologically relevant groups (e.g., Wnt signaling, neural crest development, sensory placode specification, ciliogenesis, germ layer specification). We provide an interactive network interface, EctoMap, for exploring synexpression relationships among genes expressed in the neurula, and suggest several strategies to use this comprehensive dataset to address questions in developmental biology as well as stem cell or cancer research. PMID:29049289
Investigating Extreme Lifestyles through Mangrove Transcriptomics
ERIC Educational Resources Information Center
Dassanayake, Maheshi
2009-01-01
Mangroves represent phylogenetically diverse taxa in tropical coastal terrestrial habitats. They are extremophiles, evolutionarily adapted to tolerate flooding, anoxia, high temperatures, wind, and high and extremely variable salt conditions in typically resource-poor environments. The genetic basis for these adaptations is, however, virtually…
Fatima, Tahira; Snyder, Crystal L.; Schroeder, William R.; Cram, Dustin; Datla, Raju; Wishart, David; Weselake, Randall J.; Krishna, Priti
2012-01-01
Background Sea buckthorn (Hippophae rhamnoides L.) is a hardy, fruit-producing plant known historically for its medicinal and nutraceutical properties. The most recognized product of sea buckthorn is its fruit oil, composed of seed oil that is rich in essential fatty acids, linoleic (18:2 ω-6) and α-linolenic (18:3 ω-3) acids, and pulp oil that contains high levels of monounsaturated palmitoleic acid (16:1 ω-7). Sea buckthorn is fast gaining popularity as a source of functional food and nutraceuticals, but currently has few genomic resources; therefore, we explored the fatty acid composition of Canadian-grown cultivars (ssp. mongolica) and the sea buckthorn seed transcriptome using the 454 GS FLX sequencing technology. Results GC-MS profiling of fatty acids in seeds and pulp of berries indicated that the seed oil contained linoleic and α-linolenic acids at 33–36% and 30–36%, respectively, while the pulp oil contained palmitoleic acid at 32–42%. 454 sequencing of sea buckthorn cDNA collections from mature seeds yielded 500,392 sequence reads, which identified 89,141 putative unigenes represented by 37,482 contigs and 51,659 singletons. Functional annotation by Gene Ontology and computational prediction of metabolic pathways indicated that primary metabolism (protein>nucleic acid>carbohydrate>lipid) and fatty acid and lipid biosynthesis pathways were highly represented categories. Sea buckthorn sequences related to fatty acid biosynthesis genes in Arabidopsis were identified, and a subset of these was examined for transcript expression at four developing stages of the berry. Conclusion This study provides the first comprehensive genomic resources represented by expressed sequences for sea buckthorn, and demonstrates that the seed oil of Canadian-grown sea buckthorn cultivars contains high levels of linoleic acid and α-linolenic acid in a close to 1:1 ratio, which is beneficial for human health.
These data provide the foundation for further studies on sea buckthorn oil, the enzymes involved in its biosynthesis, and the genes involved in the general hardiness of sea buckthorn against environmental conditions. PMID:22558083
The Emerging Oilseed Crop Sesamum indicum Enters the “Omics” Era
Dossa, Komivi; Diouf, Diaga; Wang, Linhai; Wei, Xin; Zhang, Yanxin; Niang, Mareme; Fonceka, Daniel; Yu, Jingyin; Mmadi, Marie A.; Yehouessi, Louis W.; Liao, Boshou; Zhang, Xiurong; Cisse, Ndiaga
2017-01-01
Sesame (Sesamum indicum L.) is one of the oldest oilseed crops widely grown in Africa and Asia for its high-quality nutritional seeds. It is well adapted to harsh environments and constitutes an alternative cash crop for smallholders in developing countries. Despite its economic and nutritional importance, sesame is considered an orphan crop because it has received very little attention from science. As a consequence, it lags behind the other major oil crops as far as genetic improvement is concerned. In recent years, the scenario has considerably changed with the decoding of the sesame nuclear genome leading to the development of various genomic resources including molecular markers, comprehensive genetic maps, high-quality transcriptome assemblies, web-based functional databases and diverse draft genome sequences. The availability of these tools, in association with the discovery of candidate genes and quantitative trait loci for key agronomic traits including high oil content and quality, waterlogging and drought tolerance, disease resistance, cytoplasmic male sterility and high yield, paves the way to the development of new strategies for sesame genetic improvement. As a result, sesame has graduated from an “orphan crop” to a “genomic resource-rich crop.” With the limited research teams working on sesame worldwide, more synergic efforts are needed to integrate these resources in sesame breeding for productivity upsurge, ensuring food security and improved livelihood in developing countries. This review retraces the evolution of sesame research by highlighting the recent advances in the “Omics” area and also critically discusses the future prospects for further genetic improvement and a better expansion of this crop. PMID:28713412
Hinkson, Izumi V.; Davidsen, Tanja M.; Klemm, Juli D.; Chandramouliswaran, Ishwar; Kerlavage, Anthony R.; Kibbe, Warren A.
2017-01-01
Advancements in next-generation sequencing and other -omics technologies are accelerating the detailed molecular characterization of individual patient tumors, and driving the evolution of precision medicine. Cancer is no longer considered a single disease, but rather, a diverse array of diseases wherein each patient has a unique collection of germline variants and somatic mutations. Molecular profiling of patient-derived samples has led to a data explosion that could help us understand the contributions of environment and germline to risk, therapeutic response, and outcome. To maximize the value of these data, an interdisciplinary approach is paramount. The National Cancer Institute (NCI) has initiated multiple projects to characterize tumor samples using multi-omic approaches. These projects harness the expertise of clinicians, biologists, computer scientists, and software engineers to investigate cancer biology and therapeutic response in multidisciplinary teams. Petabytes of cancer genomic, transcriptomic, epigenomic, proteomic, and imaging data have been generated by these projects. To address the data analysis challenges associated with these large datasets, the NCI has sponsored the development of the Genomic Data Commons (GDC) and three Cloud Resources. The GDC ensures data and metadata quality, ingests and harmonizes genomic data, and securely redistributes the data. During its pilot phase, the Cloud Resources tested multiple cloud-based approaches for enhancing data access, collaboration, computational scalability, resource democratization, and reproducibility. These NCI-led efforts are continuously being refined to better support open data practices and precision oncology, and to serve as building blocks of the NCI Cancer Research Data Commons. PMID:28983483
Zhou, Xiaofan; Rinker, David C.; Pitts, Ronald Jason; Rokas, Antonis; Zwiebel, Laurence J.
2014-01-01
Many mosquito species serve as vectors of diseases such as malaria and yellow fever, wherein pathogen transmission is tightly associated with the reproductive requirement of taking vertebrate blood meals. Toxorhynchites is one of only three known mosquito genera that does not host-seek and initiates egg development in the absence of a blood-derived protein bolus. These remarkable differences make Toxorhynchites an attractive comparative reference for understanding mosquito chemosensation as it pertains to host-seeking. We performed deep transcriptome profiling of adult female Toxorhynchites amboinensis bodies, antennae and maxillary palps, and identified 25,084 protein-coding “genes” in the de novo assembly. Phylogenomic analysis of 4,266 single-copy “genes” from T. amboinensis, Aedes aegypti, Anopheles gambiae, and Culex quinquefasciatus robustly supported Ae. aegypti as the closest relative of T. amboinensis, with the two species diverged approximately 40 Ma. We identified a large number of T. amboinensis chemosensory “genes,” the majority of which have orthologs in other mosquitoes. Finally, cross-species expression analyses indicated that patterns of chemoreceptor transcript abundance were very similar for chemoreceptors that are conserved between T. amboinensis and Ae. aegypti, whereas T. amboinensis appeared deficient in the variety of expressed, lineage-specific chemoreceptors. Our transcriptome assembly of T. amboinensis represents the first comprehensive genomic resource for a nonblood-feeding mosquito and establishes a foundation for future comparative studies of blood-feeding and nonblood-feeding mosquitoes. We hypothesize that chemosensory genes that display discrete patterns of evolution and abundance between T. amboinensis and blood-feeding mosquitoes are likely to play critical roles in host-seeking and hence the vectorial capacity. PMID:25326137
Xie, Chunliang; Gong, Wenbing; Zhu, Zuohua; Yan, Li; Hu, Zhenxiu; Peng, Yuande
2018-05-01
Blue light is an important environmental factor that can induce mushroom primordium differentiation and fruiting body development. However, the mechanisms by which blue light induces primordium differentiation and development in Pleurotus eryngii are still unclear. CAZymes (carbohydrate-active enzymes) play important roles in the degradation of renewable lignocelluloses to provide carbohydrates for fungal growth, development and reproduction. In the present research, gene expression profiles of Pleurotus eryngii during the differentiation of primordia into fruiting bodies were compared between blue light stimulation and darkness using a high-throughput sequencing approach. After assembly and comparison with the Pleurotus eryngii reference genome, 11,343 unigenes were identified. A total of 539 differentially expressed genes, including a white collar 2 type transcription factor gene, an A mating type protein gene, a MAP kinase gene, oxidative phosphorylation associated genes, CAZyme genes and other metabolism related genes, were identified during the differentiation of primordia into fruiting bodies after blue light stimulation. KEGG results showed that the carbon metabolism, glycolysis/gluconeogenesis and biosynthesis of amino acids pathways were affected during blue light induced primordium formation. Most importantly, 319 differentially expressed CAZymes participating in carbon metabolism were identified. The expression patterns of six representative CAZyme and laccase genes were further confirmed by qRT-PCR. Enzyme activity results indicated that the activities of CAZymes and laccase were affected during the differentiation of primordia into fruiting bodies under blue light stimulation. In conclusion, the comprehensive transcriptome and CAZyme profile of Pleurotus eryngii at the stage of primordium differentiation into fruiting bodies after blue light stimulation were obtained. The biological insights gained from this integrative system represent a valuable resource for future genomic studies on this commercially important mushroom. Copyright © 2017. Published by Elsevier Inc.
2013-01-01
Background Transcription factors (TFs) are vital elements that regulate transcription and the spatio-temporal expression of genes, thereby ensuring the accurate development and functioning of an organism. The identification of TF-encoding genes in a liverwort, Marchantia polymorpha, offers insights into TF organization in the members of the most basal lineages of land plants (embryophytes). Therefore, a comparison of Marchantia TF genes with other land plants (monocots, dicots, bryophytes) and algae (chlorophytes, rhodophytes) provides the most comprehensive view of the rates of expansion or contraction of TF genes in plant evolution. Results In this study, we report the identification of TF-encoding transcripts in M. polymorpha for the first time, as evidenced by deep RNA sequencing data. In total, 3,471 putative TF encoding transcripts, distributed in 80 families, were identified, representing 7.4% of the generated Marchantia gametophytic transcriptome dataset. Overall, TF basic functions and distribution across families appear to be conserved when compared to other plant species. However, it is of interest to observe the genesis of novel sequences in 24 TF families and the apparent termination of 2 TF families with the emergence of Marchantia. Out of 24 TF families, 6 are known to be associated with plant reproductive development processes. We also examined the expression pattern of these TF-encoding transcripts in six male and female developmental stages in vegetative and reproductive gametophytic tissues of Marchantia. Conclusions The analysis highlighted the importance of Marchantia, a model plant system, in an evolutionary context. The dataset generated here provides a scientific resource for TF gene discovery and other comparative evolutionary studies of land plants. PMID:24365221
Yang, Liandong; Wang, Ying; Zhang, Zhaolei; He, Shunping
2014-12-26
Elucidating the genetic mechanisms of organismal adaptation to the Tibetan Plateau at a genomic scale can provide insights into the process of adaptive evolution. Many highland species have been investigated and various candidate genes that may be responsible for highland adaptation have been identified. However, we know little about the genomic basis of adaptation to Tibet in fishes. Here, we performed transcriptome sequencing of a schizothoracine fish (Gymnodiptychus pachycheilus) and used it to identify potential genetic mechanisms of highland adaptation. We obtained a total of 66,105 assembled unigenes, of which 7,232 were assigned as putative one-to-one orthologs in zebrafish. Comparative gene annotations from several species indicated that at least 350 genes were lost and 41 gained since the divergence between G. pachycheilus and zebrafish. An analysis of 6,324 orthologs among zebrafish, fugu, medaka, and spotted gar identified consistent evidence for genome-wide accelerated evolution in G. pachycheilus, and only the terminal branch of G. pachycheilus had a higher Ka/Ks ratio than the ancestral branch. Many functional categories related to hypoxia and energy metabolism exhibited rapid evolution in G. pachycheilus relative to zebrafish. Genes showing signatures of rapid evolution and positive selection in the G. pachycheilus lineage were also enriched in functions associated with energy metabolism and hypoxia. The first genomic resources for fish in the Tibetan Plateau and evolutionary analyses provided some novel insights into highland adaptation in fishes and served as a foundation for future studies aiming to identify candidate genes underlying the genetic bases of adaptation to Tibet in fishes. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Transcriptome Profile Analysis from Different Sex Types of Ginkgo biloba L.
Du, Shuhui; Sang, Yalin; Liu, Xiaojing; Xing, Shiyan; Li, Jihong; Tang, Haixia; Sun, Limin
2016-01-01
In plants, sex determination is a comprehensive process of correlated events, which involves genes that are differentially and/or specifically expressed in distinct developmental phases. Exploring gene expression profiles from different sex types will contribute to fully understanding sex determination in plants. In this study, we conducted RNA-sequencing of female and male buds (FB and MB) as well as ovulate strobilus and staminate strobilus (OS and SS) of Ginkgo biloba to gain insights into the genes potentially related to sex determination in this species. Approximately 60 Gb of clean reads were obtained from eight cDNA libraries. De novo assembly of the clean reads generated 108,307 unigenes with an average length of 796 bp. Among these unigenes, 51,953 (47.97%) had at least one significant match with a gene sequence in the public databases searched. A total of 4709 and 9802 differentially expressed genes (DEGs) were identified in MB vs. FB and SS vs. OS, respectively. Genes involved in plant hormone signal transduction as well as those encoding DNA methyltransferase were found to be differentially expressed between different sex types. Their potential roles in sex determination of G. biloba were discussed. Pistil-related genes were expressed in male buds while anther-specific genes were identified in female buds, suggesting that dioecism in G. biloba resulted from the selective arrest of reproductive primordia. High correlation of expression level was found between the RNA-Seq and quantitative real-time PCR results. The transcriptome resources that we generated allowed us to characterize gene expression profiles and examine differential expression profiles, which provided foundations for identifying functional genes associated with sex determination in G. biloba. PMID:27379148
Xiu, Yu; Wu, Guodong; Tang, Wensi; Peng, Zhengfeng; Bu, Xiangpan; Chao, Longjun; Yin, Xue; Xiong, Jiannan; Zhang, Haiwu; Zhao, Xiaoqing; Ding, Jing; Ma, Lvyi; Wang, Huafang; van Staden, Johannes
2018-06-04
Paeonia ostii var. lishizhenii, a well-known medicinal and horticultural plant, is indigenous to China. Recent studies have shown that its seed has a high oil content, and it was approved as a novel resource of edible oil with a high level of α-linolenic acid by the Chinese Government. This study measured the seed oil contents and fatty acid components of P. ostii var. lishizhenii and six other peonies, P. suffruticosa, P. ludlowii, P. decomposita, P. rockii, and P. lactiflora Pall. 'Heze' and 'Gansu'. The results show that P. ostii var. lishizhenii exhibits the average oil characteristics of tested peonies, with an oil content of 21.3%, α-linolenic acid 43.8%, and unsaturated fatty acids around 92.1%. Hygiene indicators for the seven peony seed oils met the Chinese national food standards. P. ostii var. lishizhenii seeds were used to analyze transcriptome gene regulation networks on endosperm development and oil biosynthesis. In total, 124,117 transcripts were obtained from six endosperm developing stages (S0-S5). The significant changes in differential expression genes (DEGs) clarify three peony endosperm developmental phases: the endosperm cell mitotic phase (S0-S1), the TAG biosynthesis phase (S1-S4), and the mature phase (S5). The DEGs in plant hormone signal transduction, DNA replication, cell division, differentiation, transcription factors, and seed dormancy pathways regulate the endosperm development process. Another 199 functional DEGs participate in glycolysis, pentose phosphate pathway, citrate cycle, FA biosynthesis, TAG assembly, and other pathways. A key transcription factor (WRI1) and some important target genes (ACCase, FATA, LPCAT, FADs, and DGAT etc.) were found in the comprehensive genetic networks of oil biosynthesis. Copyright © 2018 Elsevier GmbH. All rights reserved.
Magistri, Marco; Khoury, Nathalie; Mazza, Emilia Maria Cristina; Velmeshev, Dmitry; Lee, Jae K; Bicciato, Silvio; Tsoulfas, Pantelis; Faghihi, Mohammad Ali
2016-11-01
Astrocytes are a morphologically and functionally heterogeneous population of cells that play critical roles in neurodevelopment and in the regulation of central nervous system homeostasis. Studies of human astrocytes have been hampered by the lack of specific molecular markers and by the difficulties associated with purifying and culturing astrocytes from adult human brains. Human neural progenitor cells (NPCs) with self-renewal and multipotent properties represent an appealing model system to gain insight into the developmental genetics and function of human astrocytes, but a comprehensive molecular characterization that confirms the validity of this cellular system is still missing. Here we used an unbiased transcriptomic analysis to characterize in vitro culture of human NPCs and to define the gene expression programs activated during the differentiation of these cells into astrocytes using FBS or the combination of CNTF and BMP4. Our results demonstrate that in vitro cultures of human NPCs isolated during the gliogenic phase of neurodevelopment mainly consist of radial glial cells (RGCs) and glia-restricted progenitor cells. In these cells the combination of CNTF and BMP4 activates the JAK/STAT and SMAD signaling cascades, leading to the inhibition of oligodendrocytes lineage commitment and activation of astrocytes differentiation. On the other hand, FBS-derived astrocytes have properties of reactive astrocytes. Our work suggests that in vitro culture of human NPCs represents a valuable cellular system to study human disorders characterized by impairment of astrocytes development and function. Our datasets represent an important resource for researchers studying human astrocytes development and might set the basis for the discovery of novel human-specific astrocyte markers. © 2016 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Wang, Yejun; MacKenzie, Keith D; White, Aaron P
2015-05-07
As sequencing costs are being lowered continuously, RNA-seq has gradually been adopted as the first choice for comparative transcriptome studies with bacteria. Unlike microarrays, RNA-seq can directly detect cDNA derived from mRNA transcripts at a single nucleotide resolution. Not only does this allow researchers to determine the absolute expression level of genes, but it also conveys information about transcript structure. Few automatic software tools have yet been established to investigate large-scale RNA-seq data for bacterial transcript structure analysis. In this study, 54 directional RNA-seq libraries from Salmonella serovar Typhimurium (S. Typhimurium) 14028s were examined for potential relationships between read mapping patterns and transcript structure. We developed an empirical method, combined with statistical tests, to automatically detect key transcript features, including transcriptional start sites (TSSs), transcriptional termination sites (TTSs) and operon organization. Using our method, we obtained 2,764 TSSs and 1,467 TTSs for 1,331 and 844 different genes, respectively. Identification of TSSs facilitated further discrimination of 215 putative sigma 38 regulons and 863 potential sigma 70 regulons. Combining the TSSs and TTSs with intergenic distance and co-expression information, we comprehensively annotated the operon organization in S. Typhimurium 14028s. Our results show that directional RNA-seq can be used to detect transcriptional borders at an acceptable resolution of ±10-20 nucleotides. Technical limitations of the RNA-seq procedure may prevent single nucleotide resolution. The automatic transcript border detection methods, statistical models and operon organization pipeline that we have described could be widely applied to RNA-seq studies in other bacteria. Furthermore, the TSSs, TTSs, operons, promoters and untranslated regions that we have defined for S. Typhimurium 14028s may constitute valuable resources that can be used for comparative analyses with other Salmonella serotypes.
De novo transcriptomic analysis and development of EST-SSRs for Sorbus pohuashanensis (Hance) Hedl.
Guan, Xuelian; Fu, Qiang; Zhang, Ze; Hu, Zenghui; Zheng, Jian; Lu, Yizeng; Li, Wei
2017-01-01
Sorbus pohuashanensis is a native tree species of northern China that is used for a variety of ecological purposes. The species is often grown as an ornamental landscape tree because of its beautiful form, silver flowers in early summer, attractive pinnate leaves in summer, and red leaves and fruits in autumn. However, development and further utilization of the species are hindered by the lack of comprehensive genetic information, which impedes research into its genetics and molecular biology. Recent advances in de novo transcriptome sequencing (RNA-seq) technology have provided an effective means to obtain genomic information from non-model species. Here, we applied RNA-seq for sequencing S. pohuashanensis leaves and obtained a total of 137,506 clean reads. After assembly, 96,213 unigenes with an average length of 770 bp were obtained. We found that 64.5% of the unigenes could be annotated using bioinformatics tools to analyze gene function and alignment with the NCBI database. Overall, 59,089 unigenes were annotated using the Nr database(non-redundant protein database), 35,225 unigenes were annotated using the GO (Gene Ontology categories) database, and 33,168 unigenes were annotated using COG (Cluster of Orthologous Groups). Analysis of the unigenes using the KEGG (Kyoto Encyclopedia of Genes and Genomes) database indicated that 13,953 unigenes were involved in 322 metabolic pathways. Finally, simple sequence repeat (SSR) site detection identified 6,604 unigenes that included EST-SSRs and a total of 7,473 EST-SSRs in the unigene sequences. Fifteen polymorphic SSRs were screened and found to be of use for future genetic research. These unigene sequences will provide important genetic resources for genetic improvement and investigation of biochemical processes in S. pohuashanensis. PMID:28614366
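The SSR site detection step mentioned in this abstract can be illustrated with a short sketch. The abstract does not name the tool or thresholds used, so everything here is an assumption for illustration: the function name `find_ssrs`, the minimum repeat counts in `MIN_REPEATS`, and the restriction to perfect (uninterrupted) repeats of 2 to 6 bp motifs.

```python
import re

# Assumed thresholds (not from the study): minimum number of tandem
# repeats required for each motif length in base pairs.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AAA" or "ATAT"), so each locus is reported once.
            divisors = [k for k in range(1, motif_len) if motif_len % k == 0]
            if any(motif == motif[:k] * (motif_len // k) for k in divisors):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)
```

Running a function like this over every unigene and counting the sequences with at least one hit would give figures of the same kind as those reported (unigenes containing EST-SSRs and total EST-SSRs), although the actual counts depend on the thresholds chosen.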
Stelpflug, Scott C.; Sekhon, Rajandeep S.; Vaillancourt, Brieanne; ...
2015-12-30
Comprehensive and systematic transcriptome profiling provides valuable insight into biological and developmental processes that occur throughout the life cycle of a plant. We have enhanced our previously published microarray-based gene atlas of maize (Zea mays L.) inbred B73 to now include 79 distinct replicated samples that have been interrogated using RNA sequencing (RNA-seq). The current version of the atlas includes 50 original array-based gene atlas samples, a time-course of 12 stalk and leaf samples postflowering, and an additional set of 17 samples from the maize seedling and adult root system. The entire dataset contains 4.6 billion mapped reads, with an average of 20.5 million mapped reads per biological replicate, allowing for detection of genes with lower transcript abundance. As the new root samples represent key additions to the previously examined tissues, we highlight insights into the root transcriptome, which is represented by 28,894 (73.2%) annotated genes in maize. Additionally, we observed remarkable expression differences across both the longitudinal (four zones) and radial gradients (cortical parenchyma and stele) of the primary root supported by fourfold differential expression of 9353 and 4728 genes, respectively. Among the latter were 1110 genes that encode transcription factors, some of which are orthologs of previously characterized transcription factors known to regulate root development in Arabidopsis thaliana (L.) Heynh., while most are novel, and represent attractive targets for reverse genetics approaches to determine their roles in this important organ. As a result, this comprehensive transcriptome dataset is a powerful tool toward understanding maize development, physiology, and phenotypic diversity.
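As a rough illustration of what a fourfold differential expression cutoff means in practice, the sketch below flags genes whose expression differs at least fourfold between two samples. The abstract does not describe the statistical procedure used, so the function name `fourfold_genes`, the pseudocount, and the simple log2 ratio are assumptions; a real analysis would also account for replicates and significance testing.

```python
import math


def fourfold_genes(expr_a, expr_b, pseudocount=1.0, min_log2_fc=2.0):
    """Return sorted gene IDs whose |log2 fold change| between two samples is >= min_log2_fc.

    expr_a and expr_b map gene IDs to normalized expression values.
    The pseudocount (an assumed convention) avoids division by zero.
    """
    selected = []
    for gene in expr_a.keys() & expr_b.keys():
        log2_fc = math.log2((expr_a[gene] + pseudocount) / (expr_b[gene] + pseudocount))
        # A fourfold change in either direction corresponds to |log2 FC| >= 2.
        if abs(log2_fc) >= min_log2_fc:
            selected.append(gene)
    return sorted(selected)
```

With the default threshold of 2 on the log2 scale, both fourfold increases and fourfold decreases are retained.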
Amin, Shivam V.; Roberts, Justin T.; Patterson, Dillon G.; Coley, Alexander B.; Allred, Jonathan A.; Denner, Jason M.; Johnson, Justin P.; Mullen, Genevieve E.; O'Neal, Trenton K.; Smith, Jason T.; Cardin, Sara E.; Carr, Hank T.; Carr, Stacie L.; Cowart, Holly E.; DaCosta, David H.; Herring, Brendon R.; King, Valeria M.; Polska, Caroline J.; Ward, Erin E.; Wise, Alice A.; McAllister, Kathleen N.; Chevalier, David; Spector, Michael P.; Borchert, Glen M.
2016-01-01
Small RNAs (sRNAs) are short (∼50–200 nucleotides) noncoding RNAs that regulate cellular activities across bacteria. Salmonella enterica starved of a carbon-energy (C) source experience a host of genetic and physiological changes broadly referred to as the starvation-stress response (SSR). In an attempt to identify novel sRNAs contributing to SSR control, we grew log-phase, 5-h C-starved and 24-h C-starved cultures of the virulent Salmonella enterica subspecies enterica serovar Typhimurium strain SL1344 and comprehensively sequenced their small RNA transcriptomes. Strikingly, after employing a novel strategy for sRNA discovery based on identifying dynamic transcripts arising from “gene-empty” regions, we identify 58 wholly undescribed Salmonella sRNA genes potentially regulating SSR, averaging an ∼1,000-fold change in expression between log-phase and C-starved cells. Importantly, the expression of individual sRNA loci was confirmed by both comprehensive transcriptome analyses and northern blotting of select candidates. Of note, we find 43 candidate sRNAs share significant sequence identity to characterized sRNAs in other bacteria, and ∼70% of our sRNAs likely assume characteristic sRNA structural conformations. In addition, we find 53 of our 58 candidate sRNAs either overlap neighboring mRNA loci or share significant sequence complementarity to mRNAs transcribed elsewhere in the SL1344 genome, strongly suggesting they regulate the expression of transcripts via antisense base-pairing. Finally, in addition to this work resulting in the identification of 58 entirely novel Salmonella enterica genes likely participating in the SSR, we also find evidence suggesting that sRNAs are significantly more prevalent than currently appreciated and that Salmonella sRNAs may actually number in the thousands. PMID:26853797
Manteniotis, Stavros; Lehmann, Ramona; Flegel, Caroline; Vogel, Felix; Hofreuter, Adrian; Schreiner, Benjamin S. P.; Altmüller, Janine; Becker, Christian; Schöbel, Nicole; Hatt, Hanns; Gisselmann, Günter
2013-01-01
The specific functions of sensory systems depend on the tissue-specific expression of genes that code for molecular sensor proteins that are necessary for stimulus detection and membrane signaling. Using the Next Generation Sequencing technique (RNA-Seq), we analyzed the complete transcriptome of the trigeminal ganglia (TG) and dorsal root ganglia (DRG) of adult mice. Focusing on genes with an expression level higher than 1 FPKM (fragments per kilobase of transcript per million mapped reads), we detected the expression of 12984 genes in the TG and 13195 in the DRG. To analyze the specific gene expression patterns of the peripheral neuronal tissues, we compared their gene expression profiles with that of the liver, brain, olfactory epithelium, and skeletal muscle. The transcriptome data of the TG and DRG were scanned for virtually all known G-protein-coupled receptors (GPCRs) as well as for ion channels. The expression profile was ranked with regard to the level and specificity for the TG. In total, we detected 106 non-olfactory GPCRs and 33 ion channels that had not been previously described as expressed in the TG. To validate the RNA-Seq data, in situ hybridization experiments were performed for several of the newly detected transcripts. To identify differences in expression profiles between the sensory ganglia, the RNA-Seq data of the TG and DRG were compared. Among the differentially expressed genes (> 1 FPKM), 65 and 117 were expressed at least 10-fold higher in the TG and DRG, respectively. Our transcriptome analysis allows a comprehensive overview of all ion channels and G protein-coupled receptors that are expressed in trigeminal ganglia and provides additional approaches for the investigation of trigeminal sensing as well as for the physiological and pathophysiological mechanisms of pain. PMID:24260241
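As a hedged illustration of the expression threshold mentioned in the abstract above, the following sketch computes FPKM from a fragment count, a transcript length, and the total number of mapped fragments, and then keeps genes above 1 FPKM. The function names and the toy numbers are assumptions made for illustration; they are not taken from the study, which does not describe its software in the abstract.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


def expressed_genes(counts: dict, lengths: dict, threshold: float = 1.0) -> set:
    """Return the genes whose FPKM is strictly higher than the threshold."""
    total = sum(counts.values())
    return {g for g, c in counts.items() if fpkm(c, lengths[g], total) > threshold}


# Hypothetical toy data: gene "a" dominates the library, gene "b" is barely detected.
toy_counts = {"a": 999_990, "b": 10}
toy_lengths = {"a": 1_000, "b": 20_000}
print(expressed_genes(toy_counts, toy_lengths))
```

Normalizing by both transcript length and library size is what makes a fixed cutoff such as 1 FPKM comparable across genes and samples.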
Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa.
Shahin, Arwa; van Kaauwen, Martijn; Esselink, Danny; Bargsten, Joachim W; van Tuyl, Jaap M; Visser, Richard G F; Arens, Paul
2012-11-20
Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, there are hardly any genetic studies performed and genomic resources are lacking. To build genomic resources and develop tools to speed up the breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. Successfully, we developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of lily and 39% of tulip estimated transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice more abundant in UTRs (UnTranslated Regions) compared to coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2 to 3 fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs: lily and tulip (12,017 groups) and among the three monocot species: lily, tulip, and rice (6,900 groups) were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSR. Lily and tulip contigs generated were annotated and described according to Gene Ontology terminology. 
Two transcriptome sets were built that are valuable resources for marker development, comparative genomic studies and candidate gene approaches. Next generation sequencing of leaf transcriptome is very effective; however, deeper sequencing and using more tissues and stages is advisable for extended comparative studies.
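The two SNP marker sets described above differ only in how many neighbouring SNPs are tolerated within 50 bp of the target. A minimal sketch of such a filter follows, assuming SNPs are given as unique integer positions on a single contig and that "one SNP in the flanking regions" means one neighbour in total across both flanks. The function name `select_snps` and the example positions are hypothetical and are not from the study.

```python
from bisect import bisect_left, bisect_right


def select_snps(positions, flank=50, max_flanking=0):
    """Keep SNPs with at most `max_flanking` other SNPs within `flank` bp on either side."""
    pos = sorted(positions)
    selected = []
    for p in pos:
        lo = bisect_left(pos, p - flank)
        hi = bisect_right(pos, p + flank)
        neighbours = hi - lo - 1  # exclude the target SNP itself
        if neighbours <= max_flanking:
            selected.append(p)
    return selected


# Hypothetical SNP positions on one contig.
snps = [100, 120, 300, 500, 540, 560]
strict_set = select_snps(snps, flank=50, max_flanking=0)
relaxed_set = select_snps(snps, flank=50, max_flanking=1)
print(strict_set, relaxed_set)
```

Relaxing `max_flanking` from 0 to 1 enlarges the marker set, which matches the direction of the increase reported in the abstract, although the size of the increase depends entirely on the data.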
Li, Hang; Jiang, Weihua; Zhang, Zan; Xing, Yanru; Li, Fei
2013-01-01
The beet armyworm, Spodoptera exigua (Hübner), is a serious pest worldwide that causes significant losses in crops. Unfortunately, genetic resources for the beet armyworm are extremely scarce. To improve these resources we sequenced the transcriptome of S. exigua representing all stages including eggs, 1st to 5th instar larvae, pupae, male and female adults using the Illumina Solexa platform. We assembled the transcriptome with Trinity, which yielded 31,414 contigs. Of these contigs, 18,592 were annotated as protein coding genes by Blast searches against the NCBI nr database. It has been shown that knockdown of important insect genes by dsRNAs or siRNAs is a feasible mechanism to control insect pests. The first key step towards developing an efficient RNAi-mediated pest control technique is to find suitable target genes. To screen for effective target genes in the beet armyworm, we selected nine candidate genes. The sequences of these genes were amplified using the RACE strategy. Then, siRNAs were designed and chemically synthesized. We injected 2 µl siRNA (2 µg/µl) into the 4th instar larvae to knock down the respective target genes. The mRNA abundance of target genes decreased to different levels (∼20–94.3%) after injection of siRNAs. Knockdown of eight genes including chitinase7, PGCP, chitinase1, ATPase, tubulin1, arf2, tubulin2 and arf1 caused a significantly high level of mortality compared to the negative control (P<0.05). About 80% of the surviving insects in the siRNA-treated group of five genes (PGCP, chitinase1, tubulin1, tubulin2 and helicase) showed retarded development. In chitinase1-siRNA and chitinase7-siRNA administered groups, 12.5% of survivors exhibited “half-ecdysis”. In arf1-siRNA and arf2-siRNA groups, the body color of 15% became black 48 h after injections. In summary, the transcriptome could be a valuable genetic resource for identification of genes in S. exigua and this study provided putative targets for RNAi pest control. PMID:23823756
USDA-ARS's Scientific Manuscript database
Winter dormancy is an important biological feature for tea plant to survive cold winters, and it also affects the economic output of tea plant, one of the few woody plants in the world whose leaves are harvested and one of the few non-conifer evergreen species with characterized dormancies. To disco...
Gao, Fan-Xiang; Wang, Yang; Zhang, Qi-Ya; Mou, Cheng-Yan; Li, Zhi; Deng, Yuan-Sheng; Zhou, Li; Gui, Jian-Fang
2017-07-24
Gibel carp is an important aquaculture species in China, and a herpesvirus, called Carassius auratus herpesvirus (CaHV), has hampered the aquaculture development. Diverse gynogenetic clones of gibel carp have been identified or created, and some of them have been used as aquaculture varieties, but their resistances to herpesvirus and the underlying mechanism remain unknown. To reveal their susceptibility differences, we first performed herpesvirus challenge experiments in three gynogenetic clones of gibel carp, including the leading variety clone A+, candidate variety clone F and wild clone H. Three clones showed distinct resistances to CaHV. Moreover, 8772, 8679 and 10,982 differentially expressed unigenes (DEUs) were identified from comparative transcriptomes between diseased individuals and control individuals of clones A+, F and H, respectively. Comprehensive analysis of the shared DEUs in all three clones displayed common defense pathways to the herpesvirus infection, activating IFN system and suppressing complements. KEGG pathway analysis of specifically changed DEUs in respective clones revealed distinct immune responses to the herpesvirus infection. The DEU numbers identified from clone H in KEGG immune-related pathways, such as "chemokine signaling pathway", "Toll-like receptor signaling pathway" and others, were markedly higher than those from clones A+ and F. Several IFN-related genes, including Mx1, viperin, PKR and others, showed higher increases in the resistant clone H than in the others. IFNphi3, IFI44-like and Gig2 displayed the highest expression in clone F and IRF1 uniquely increased in susceptible clone A+. In contrast to strong immune defense in resistant clone H, susceptible clone A+ showed remarkable up-regulation of genes related to apoptosis or death, indicating that clone A+ failed to resist the viral attack and evidently induced apoptosis or death.
Our study is the first attempt to screen distinct resistances and immune responses of three gynogenetic gibel carp clones to herpesvirus infection by comprehensive transcriptomes. These differential DEUs, immune-related pathways and IFN system genes identified from susceptible and resistant clones will be beneficial to marker-assisted selection (MAS) breeding or molecular module-based resistance breeding in gibel carp.
The Co-regulation Data Harvester: Automating gene annotation starting from a transcriptome database
NASA Astrophysics Data System (ADS)
Tsypin, Lev M.; Turkewitz, Aaron P.
Identifying co-regulated genes provides a useful approach for defining pathway-specific machinery in an organism. To be efficient, this approach relies on thorough genome annotation, a process much slower than genome sequencing per se. Tetrahymena thermophila, a unicellular eukaryote, has been a useful model organism and has a fully sequenced but sparsely annotated genome. One important resource for studying this organism has been an online transcriptomic database. We have developed an automated approach to gene annotation in the context of transcriptome data in T. thermophila, called the Co-regulation Data Harvester (CDH). Beginning with a gene of interest, the CDH identifies co-regulated genes by accessing the Tetrahymena transcriptome database. It then identifies their closely related genes (orthologs) in other organisms by using reciprocal BLAST searches. Finally, it collates the annotations of those orthologs' functions, which provides the user with information to help predict the cellular role of the initial query. The CDH, which is freely available, represents a powerful new tool for analyzing cell biological pathways in Tetrahymena. Moreover, to the extent that genes and pathways are conserved between organisms, the inferences obtained via the CDH should be relevant, and can be explored, in many other systems.
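The reciprocal BLAST step described above can be sketched as a reciprocal-best-hit test. The example below is an assumption-laden illustration and not the CDH implementation: it takes already parsed hit rows of the form (query, subject, bit score), picks the best-scoring subject per query in each direction, and keeps only pairs that select each other. All identifiers (`best_hits`, `reciprocal_best_hits`, the gene names) are hypothetical.

```python
def best_hits(rows):
    """Map each query to its highest-scoring subject from (query, subject, bitscore) rows."""
    best = {}
    for query, subject, score in rows:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_rows, reverse_rows):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    a_to_b = best_hits(forward_rows)
    b_to_a = best_hits(reverse_rows)
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)


# Hypothetical hits: organism A genes searched against organism B, and the reverse.
forward = [("T1", "H1", 250.0), ("T1", "H2", 90.0), ("T2", "H2", 120.0), ("T3", "H2", 300.0)]
reverse = [("H1", "T1", 245.0), ("H2", "T3", 310.0), ("H2", "T2", 115.0)]
print(reciprocal_best_hits(forward, reverse))
```

Requiring the hit to hold in both directions is a common heuristic for reducing false ortholog calls caused by paralogs, at the cost of missing some true orthologs.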
Liu, Miaomiao; Zhu, Jinhang; Wu, Shengbing; Wang, Chenkai; Guo, Xingyi; Wu, Jiawen; Zhou, Meiqi
2018-04-11
Artemisia argyi Lev. et Vant. (A. argyi) is widely utilized for moxibustion in Chinese medicine, and the mechanism underlying terpenoid biosynthesis in its leaves is suggested to play an important role in its medicinal use. However, the A. argyi transcriptome has not been sequenced. Herein, we performed RNA sequencing for A. argyi leaf, root and stem tissues to identify as many as possible of the transcribed genes. In total, 99,807 unigenes were assembled by analysing the expression profiles generated from the three tissue types, and 67,446 of those unigenes were annotated in public databases. We further performed differential gene expression analysis to compare leaf tissue with the other two tissue types and identified numerous genes that were specifically expressed or up-regulated in leaf tissue. Specifically, we identified multiple genes encoding significant enzymes or transcription factors related to terpenoid synthesis. This study serves as a valuable resource for transcriptome information, as many transcribed genes related to terpenoid biosynthesis were identified in the A. argyi transcriptome, providing a functional genomic basis for additional studies on molecular mechanisms underlying the medicinal use of A. argyi.
Park, So Young; Patnaik, Bharat Bhusan; Kang, Se Won; Hwang, Hee-Ju; Chung, Jong Min; Song, Dae Kwon; Sang, Min Kyu; Patnaik, Hongray Howrelia; Lee, Jae Bong; Noh, Mi Young; Kim, Changmu; Kim, Soonok; Park, Hong Seog; Lee, Jun Sang; Han, Yeon Soo; Lee, Yong Seok
2016-01-01
An aquatic gastropod belonging to the family Neritidae, Clithon retropictus is listed as an endangered class II species in South Korea. The lack of information on its genomic background limits the ability to obtain functional data resources and inhibits informed conservation planning for this species. In the present study, the transcriptomic sequencing and de novo assembly of C. retropictus generated a total of 241,696,750 high-quality reads. These assembled to 282,838 unigenes with mean and N50 lengths of 736.9 and 1201 base pairs, respectively. Of these, 125,616 unigenes were subjected to annotation analysis with known proteins in Protostome DB, COG, GO, and KEGG protein databases (BLASTX; E ≤ 0.00001) and with known nucleotides in the Unigene database (BLASTN; E ≤ 0.00001). The GO analysis indicated that cellular process, cell, and catalytic activity are the predominant GO terms in the biological process, cellular component, and molecular function categories, respectively. In addition, 2093 unigenes were distributed in 107 different KEGG pathways. Furthermore, 49,280 simple sequence repeats were identified in the unigenes (>1 kilobase sequences). This is the first report on the identification of transcriptomic and microsatellite resources for C. retropictus, which opens up the possibility of exploring traits related to the adaptation and acclimatization of this species. PMID:27455329
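The mean and N50 lengths quoted above are standard assembly summary statistics. As a minimal sketch, assuming the assembly is available simply as a list of sequence lengths, N50 is the length of the sequence at which the running total of lengths, taken from longest to shortest, first reaches half of the total assembly size. The function name and the toy lengths are illustrative only.

```python
def n50(lengths):
    """Length L such that sequences of length >= L contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical unigene lengths in base pairs.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(sum(toy_lengths) / len(toy_lengths), n50(toy_lengths))
```

Because N50 weights long sequences more heavily than the mean does, it is typically larger than the mean length, as in the figures reported in the abstract.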
Schmid, Christoph; Bauer, Sibylle; Müller, Benedikt; Bartelheimer, Maik
2013-01-01
Root-root interactions are much more sophisticated than previously thought, yet the mechanisms of belowground neighbor perception remain largely obscure. Genome-wide transcriptome analyses allow detailed insight into plant reactions to environmental cues. A root interaction trial was set up to explore both morphological and whole genome transcriptional responses in roots of Arabidopsis thaliana in the presence or absence of an inferior competitor, Hieracium pilosella. Neighbor perception was indicated by Arabidopsis roots predominantly growing away from the neighbor (segregation), while solitary plants placed more roots toward the middle of the pot. Total biomass remained unaffected. Database comparisons in transcriptome analysis revealed considerable similarity between Arabidopsis root reactions to neighbors and reactions to pathogens. Detailed analyses of the functional category “biotic stress” using MapMan tools found the sub-category “pathogenesis-related proteins” highly significantly induced. A comparison to a study on intraspecific competition brought forward a core of genes consistently involved in reactions to neighbor roots. We conclude that beyond resource depletion roots perceive neighboring roots or their associated microorganisms by a relatively uniform mechanism that involves the strong induction of pathogenesis-related proteins. In an ecological context the findings reveal that belowground neighbor detection may occur independently of resource depletion, allowing for a time advantage for the root to prepare for potential interactions. PMID:23967000
Koning-Boucoiran, Carole F S; Esselink, G Danny; Vukosavljev, Mirjana; van 't Westende, Wendy P C; Gitonga, Virginia W; Krens, Frans A; Voorrips, Roeland E; van de Weg, W Eric; Schulz, Dietmar; Debener, Thomas; Maliepaard, Chris; Arens, Paul; Smulders, Marinus J M
2015-01-01
In order to develop a versatile and large SNP array for rose, we set out to mine ESTs from diverse sets of rose germplasm. For this, RNA-Seq libraries containing about 700 million reads were generated from tetraploid cut and garden roses using Illumina paired-end sequencing, and from diploid Rosa multiflora using 454 sequencing. Separate de novo assemblies were performed in order to identify single nucleotide polymorphisms (SNPs) within and between rose varieties. SNPs among tetraploid roses were selected for constructing a genotyping array that can be employed for genetic mapping and marker-trait association discovery in breeding programs based on tetraploid germplasm, both from cut roses and from garden roses. In total, 68,893 SNPs were included on the WagRhSNP Axiom array. Next, an orthology-guided assembly was performed for the construction of a non-redundant rose transcriptome database. A total of 21,740 transcripts had significant hits with orthologous genes in the strawberry (Fragaria vesca L.) genome. Of these, 13,390 appeared to contain the full-length coding regions. This newly established transcriptome resource adds considerably to the currently available sequence resources for the Rosaceae family in general and the genus Rosa in particular.
Auerbach, Scott S; Phadke, Dhiral P; Mav, Deepak; Holmgren, Stephanie; Gao, Yuan; Xie, Bin; Shin, Joo Heon; Shah, Ruchir R; Merrick, B Alex; Tice, Raymond R
2015-07-01
Formalin-fixed, paraffin-embedded (FFPE) pathology specimens represent a potentially vast resource for transcriptomic-based biomarker discovery. We present here a comparison of results from a whole transcriptome RNA-Seq analysis of RNA extracted from fresh frozen and FFPE livers. The samples were derived from rats exposed to aflatoxin B1 (AFB1) and a corresponding set of control animals. Principal components analysis indicated that samples were separated in the two groups representing presence or absence of chemical exposure, both in fresh frozen and FFPE sample types. Sixty-five percent of the differentially expressed transcripts (AFB1 vs. controls) in fresh frozen samples were also differentially expressed in FFPE samples (overlap significance: P < 0.0001). Genomic signature and gene set analysis of AFB1 differentially expressed transcript lists indicated highly similar results between fresh frozen and FFPE at the level of chemogenomic signatures (i.e., single chemical/dose/duration elicited transcriptomic signatures), mechanistic and pathology signatures, biological processes, canonical pathways and transcription factor networks. Overall, our results suggest that similar hypotheses about the biological mechanism of toxicity would be formulated from fresh frozen and FFPE samples. These results indicate that phenotypically anchored archival specimens represent a potentially informative resource for signature-based biomarker discovery and mechanistic characterization of toxicity. Copyright © 2014 John Wiley & Sons, Ltd.
Liu, Kaidong; Li, Haili; Li, Weijin; Zhong, Jundi; Chen, Yan; Shen, Chenjia; Yuan, Changchun
2017-10-23
Sugar apple (Annona squamosa L.), a popular fruit with high medicinal and nutritional properties, is widely cultivated in tropical South Asia and America. Flower malformation is a major cause of reduced sugar apple production. However, little information is available on the differences between normal and malformed flowers of sugar apple. To gain a comprehensive perspective on these differences, cDNA libraries from normal and malformed flowers were prepared independently for Illumina sequencing. The sequencing generated a total of 70,189,896 reads that were integrated and assembled into 55,097 unigenes with a mean length of 783 bp. A large number of differentially expressed genes (DEGs) were identified. Among these DEGs were 701 genes encoding flower development-associated transcription factors. Furthermore, a large number of flowering- and hormone-related DEGs were also identified, and most of these genes were down-regulated in the malformed flowers. The expression levels of 15 selected genes were validated using quantitative PCR. The contents of several endogenous hormones were measured. The malformed flowers displayed lower endogenous hormone levels compared to the normal flowers. The expression data as well as hormone levels in our study will serve as a comprehensive resource for investigating the regulation mechanism involved in floral organ development in sugar apple.
Systems biology: A tool for charting the antiviral landscape.
Bowen, James R; Ferris, Martin T; Suthar, Mehul S
2016-06-15
The host antiviral programs that are initiated following viral infection form a dynamic and complex web of responses that we have collectively termed as "the antiviral landscape". Conventional approaches to studying antiviral responses have primarily used reductionist systems to assess the function of a single or a limited subset of molecules. Systems biology is a holistic approach that considers the entire system as a whole, rather than individual components or molecules. Systems biology based approaches facilitate an unbiased and comprehensive analysis of the antiviral landscape, while allowing for the discovery of emergent properties that are missed by conventional approaches. The antiviral landscape can be viewed as a hierarchy of complexity, beginning at the whole organism level and progressing downward to isolated tissues, populations of cells, and single cells. In this review, we will discuss how systems biology has been applied to better understand the antiviral landscape at each of these layers. At the organismal level, the Collaborative Cross is an invaluable genetic resource for assessing how genetic diversity influences the antiviral response. Whole tissue and isolated bulk cell transcriptomics serves as a critical tool for the comprehensive analysis of antiviral responses at both the tissue and cellular levels of complexity. Finally, new techniques in single cell analysis are emerging tools that will revolutionize our understanding of how individual cells within a bulk infected cell population contribute to the overall antiviral landscape. Copyright © 2016 Elsevier B.V. All rights reserved.
Zhang, Shufang; Liu, Yanxuan; Liu, Zhenxiang; Zhang, Chong; Cao, Hui; Ye, Yongqing; Wang, Shunlan; Zhang, Ying'ai; Xiao, Sifang; Yang, Peng; Li, Jindong; Bai, Zhiming
2014-01-01
Urothelial carcinoma of the bladder (UCB) is one of the commonly diagnosed cancers in the world. The UCB has the highest rate of recurrence of any malignancy. A genome-wide screening of transcriptome dysregulation between cancer and normal tissue would provide insight into the molecular basis of UCB recurrence and is a key step to discovering biomarkers for diagnosis and therapeutic targets. Compared with microarray technology, which is commonly used to identify expression level changes, the recently developed RNA-seq technique has the ability to detect other abnormal regulations in the cancer transcriptome, such as alternative splicing. In this study, we performed high-throughput transcriptome sequencing at ∼50× coverage on a recurrent muscle-invasive cisplatin-resistant UCB tissue and the adjacent non-tumor tissue. The results revealed cancer-specific differentially expressed genes between the tumor and non-tumor tissue enriched in the cell adhesion molecules, focal adhesion and ECM-receptor interaction pathways. Five dysregulated genes, including CDH1, VEGFA, PTPRF, CLDN7, and MMP2, were confirmed by real-time qPCR in the sequencing samples and the additional eleven samples. Our data revealed that more than three hundred genes showed differential splicing patterns between tumor tissue and non-tumor tissue. Among these genes, we filtered 24 cancer-associated alternative splicing genes with differential exon usage. The findings from RNA-Seq were validated by real-time qPCR for CD44, PDGFA, NUMB, and LPHN2. This study provides a comprehensive survey of the UCB transcriptome, which provides better insight into the complexity of regulatory changes during recurrence and metastasis. PMID:24622401
Bhattarai, Sunil; Aly, Ahmed; Garcia, Kristy; Ruiz, Diandra; Pontarelli, Fabrizio; Dharap, Ashutosh
2018-06-03
Gene expression in cerebral ischemia has been a subject of intense investigations for several years. Studies utilizing probe-based high-throughput methodologies such as microarrays have contributed significantly to our existing knowledge but lacked the capacity to dissect the transcriptome in detail. Genome-wide RNA-sequencing (RNA-seq) enables comprehensive examinations of transcriptomes for attributes such as strandedness, alternative splicing, alternative transcription start/stop sites, and sequence composition, thus providing a very detailed account of gene expression. Leveraging this capability, we conducted an in-depth, genome-wide evaluation of the protein-coding transcriptome of the adult mouse cortex after transient focal ischemia at 6, 12, or 24 h of reperfusion using RNA-seq. We identified a total of 1007 transcripts at 6 h, 1878 transcripts at 12 h, and 1618 transcripts at 24 h of reperfusion that were significantly altered as compared to sham controls. With isoform-level resolution, we identified 23 splice variants arising from 23 genes that were novel mRNA isoforms. For a subset of genes, we detected reperfusion time-point-dependent splice isoform switching, indicating an expression and/or functional switch for these genes. Finally, for 286 genes across all three reperfusion time-points, we discovered multiple, distinct, simultaneously expressed and differentially altered isoforms per gene that were generated via alternative transcription start/stop sites. Of these, 165 isoforms derived from 109 genes were novel mRNAs. Together, our data unravel the protein-coding transcriptome of the cerebral cortex at an unprecedented depth to provide several new insights into the flexibility and complexity of stroke-related gene transcription and transcript organization.
Transcriptome analysis of stimulated PBMC from Mycobacterium bovis infected cattle
USDA-ARS's Scientific Manuscript database
Immunological responses of cattle to Mycobacterium bovis (M. bovis) infection are of interest in terms of understanding the biology of M. bovis infection and for the development of improved diagnostic techniques. Although considerable time and resources have been invested in understanding immune re...
Spletter, Maria L; Barz, Christiane; Yeroslaviz, Assa; Zhang, Xu; Lemke, Sandra B; Bonnard, Adrien; Brunner, Erich; Cardone, Giovanni; Basler, Konrad; Habermann, Bianca H; Schnorrer, Frank
2018-05-30
Muscles organise pseudo-crystalline arrays of actin, myosin and titin filaments to build force-producing sarcomeres. To study sarcomerogenesis, we have generated a transcriptomics resource of developing Drosophila flight muscles and identified 40 distinct expression profile clusters. Strikingly, most sarcomeric components group in two clusters, which are strongly induced after all myofibrils have been assembled, indicating a transcriptional transition during myofibrillogenesis. Following myofibril assembly, many short sarcomeres are added to each myofibril. Subsequently, all sarcomeres mature, reaching 1.5 µm diameter and 3.2 µm length and acquiring stretch-sensitivity. The efficient induction of the transcriptional transition during myofibrillogenesis, including the transcriptional boost of sarcomeric components, requires in part the transcriptional regulator Spalt major. As a consequence of Spalt knock-down, sarcomere maturation is defective and fibers fail to gain stretch-sensitivity. Together, this defines an ordered sarcomere morphogenesis process under precise transcriptional control - a concept that may also apply to vertebrate muscle or heart development. © 2018, Spletter et al.
Transcriptomic resources for environmental risk assessment: a case study in the Venice lagoon.
Milan, M; Pauletto, M; Boffo, L; Carrer, C; Sorrentino, F; Ferrari, G; Pavan, L; Patarnello, T; Bargelloni, L
2015-02-01
The development of new resources to evaluate environmental status is becoming increasingly important and represents a key challenge for ocean and coastal management. Recently, the use of transcriptomics in aquatic toxicology has led to a growing number of initiatives proposing to integrate eco-toxicogenomics into the evaluation of marine ecosystem health. However, several technical issues need to be addressed before genomics can be introduced as a reliable tool in regulatory ecotoxicology. The Venice lagoon constitutes an excellent case, in which the assessment of environmental risks derived from the nearby industrial activities represents a crucial task. In this context, the potential role of genomics in assisting environmental monitoring was investigated through the definition of reliable gene expression markers associated with chemical contamination in Manila clams, and their subsequent use for the classification of Venice lagoon areas. Overall, the present study addresses key issues in evaluating the future outlook for genomics in environmental monitoring and risk assessment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hasin-Brumshtein, Yehudit; Khan, Arshad H.; Hormozdiari, Farhad
2016-09-13
Previous studies had shown that the integration of genome-wide expression profiles in metabolic tissues with genetic and phenotypic variance provided valuable insight into the underlying molecular mechanisms. We used RNA-Seq to characterize the hypothalamic transcriptome in 99 inbred strains of mice from the Hybrid Mouse Diversity Panel (HMDP), a reference resource population for cardiovascular and metabolic traits. We report numerous novel transcripts supported by proteomic analyses, as well as novel non-coding RNAs. High-resolution genetic mapping of transcript levels in the HMDP reveals both local and trans expression Quantitative Trait Loci (eQTLs), demonstrating 2 trans eQTL 'hotspots' associated with expression of hundreds of genes. We also report thousands of alternative splicing events regulated by genetic variants. Finally, comparison with about 150 metabolic and cardiovascular traits revealed many highly significant associations. Our data provide a rich resource for understanding the many physiologic functions mediated by the hypothalamus and their genetic regulation.
Arsenomics: omics of arsenic metabolism in plants
Tripathi, Rudra Deo; Tripathi, Preeti; Dwivedi, Sanjay; Dubey, Sonali; Chatterjee, Sandipan; Chakrabarty, Debasis; Trivedi, Prabodh K.
2012-01-01
Arsenic (As) contamination of drinking water and groundwater used for irrigation can lead to contamination of the food chain and poses a serious health risk to people worldwide. To reduce As intake through the consumption of contaminated food, identification of the mechanisms for As accumulation and detoxification in plants is a prerequisite for developing efficient phytoremediation methods and safer crops with reduced As levels. Transcriptome, proteome, and metabolome analysis of any organism reflects the total biological activities at any given time which are responsible for the adaptation of the organism to the surrounding environmental conditions. As these approaches are very important in analyzing plant As transport and accumulation, we use the term “Arsenomics” for an approach that deals with transcriptome, proteome, and metabolome alterations during As exposure. Although various studies have been performed to understand modulation of the transcriptome in response to As, many important questions remain to be addressed regarding the translated proteins of plants at the proteomic and metabolomic level, which result in various ecophysiological responses. In this review, the comprehensive knowledge generated in this area has been compiled and analyzed. There is a need to strengthen Arsenomics, which will help build tools to develop As-free plants for safe consumption. PMID:22934029
Zhang, Chenghao; Dong, Wenqi; Gen, Wei; Xu, Baoyu; Shen, Chenjia
2018-01-01
Abelmoschus esculentus (okra or lady’s fingers) is a vegetable with high nutritional value, as well as having certain medicinal effects. It is widely used as food, in the food industry, and in herbal medicinal products, but also as an ornamental, in animal feed, and in other commercial sectors. Okra is rich in bioactive compounds, such as flavonoids, polysaccharides, polyphenols, caffeine, and pectin. In the present study, the concentrations of total flavonoids and polysaccharides in five organs of okra were determined and compared. Transcriptome sequencing was used to explore the biosynthesis pathways associated with the active constituents in okra. Transcriptome sequencing of five organs (roots, stem, leaves, flowers, and fruits) of okra enabled us to obtain 293,971 unigenes, of which 232,490 were annotated. Unigenes related to the enzymes involved in the flavonoid biosynthetic pathway or in fructose and mannose metabolism were identified, based on Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. All of the transcriptional datasets were uploaded to Sequence Read Archive (SRA). In summary, our comprehensive analysis provides important information at the molecular level about the flavonoid and polysaccharide biosynthesis pathways in okra. PMID:29495525
Amano, Ikuko; Kitajima, Sakihito; Suzuki, Hideyuki; Koeduka, Takao
2018-01-01
The biosynthesis of plant secondary metabolites is associated with morphological and metabolic differentiation. As a consequence, gene expression profiles can change drastically, and primary and secondary metabolites, including intermediate and end-products, move dynamically within and between cells. However, little is known about the molecular mechanisms underlying differentiation and transport mechanisms. In this study, we performed a transcriptome analysis of Petunia axillaris subsp. parodii, which produces various volatiles in its corolla limbs and emits metabolites to attract pollinators. RNA-sequencing from leaves, buds, and limbs identified 53,243 unigenes. Analysis of differentially expressed genes, combined with gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses, showed that many biological processes were highly enriched in limbs. These included catabolic processes and signaling pathways of hormones, such as gibberellins, and metabolic pathways, including phenylpropanoids and fatty acids. Moreover, we identified five transporter genes that showed high expression in limbs, and we performed spatiotemporal expression analyses and homology searches to infer their putative functions. Our systematic analysis provides comprehensive transcriptomic information regarding morphological differentiation and metabolite transport in the Petunia flower and lays the foundation for establishing the specific mechanisms that control secondary metabolite biosynthesis in plants. PMID:29902274
Park, C Sehwan; Valomon, Amandine; Welzl, Hans
2015-01-01
Environmental enrichment has been reported to delay or restore age-related cognitive deficits; however, a mechanism to account for the cause and progression of normal cognitive decline and its preservation by environmental enrichment is lacking. Using genome-wide SAGE-Seq, we provide a global assessment of differentially expressed genes altered with age and environmental enrichment in the hippocampus. Qualitative and quantitative proteomics in naïve young and aged mice was used to further identify phosphorylated proteins differentially expressed with age. We found that increased expression of endogenous protein phosphatase-1 inhibitors in aged mice may be characteristic of long-term environmental enrichment and improved cognitive status. As such, hippocampus-dependent performances in spatial, recognition, and associative memories, which are sensitive to aging, were preserved by environmental enrichment and accompanied by decreased protein phosphatase activity. Age-associated phosphorylated proteins were also found to correspond to the functional categories of age-associated genes identified through transcriptome analysis. Together, this study provides a comprehensive map of the transcriptome and proteome in the aging brain, and elucidates endogenous protein phosphatase-1 inhibition as a potential means through which environmental enrichment may ameliorate age-related cognitive deficits.
NASA Astrophysics Data System (ADS)
Hui, Min; Cui, Zhaoxia; Liu, Yuan; Song, Chengwen
2017-07-01
In crab, embryogenesis is a complicated developmental program marked by a series of critical events. RNA-Sequencing technology offers developmental biologists a way to identify many more developmental genes than ever before. Here, we present a comprehensive analysis of the transcriptomes of Eriocheir sinensis oosperms (Os) and embryos at the 2-4 cell stage (Cs), which are separated by a cleavage event. A total of 18 923 unigenes were identified, and 403 genes matched with gene ontology (GO) terms related to developmental processes. In total, 432 differentially expressed genes (DEGs) were detected between the two stages. Nine DEGs were specifically expressed at only one stage. These DEGs may be relevant to stage-specific molecular events during development. A number of DEGs related to 'hedgehog signaling pathway', 'Wnt signaling pathway', 'germplasm', 'nervous system', 'sensory perception' and 'segment polarity' were identified as being up-regulated at the Cs stage. The results suggest that these embryonic developmental events begin before the early cleavage event in crabs, and that many of the genes expressed in the two transcriptomes might be maternal genes. Our study provides ample information for further research on the molecular mechanisms underlying crab development.
Rey, Benjamin; Dégletagne, Cyril; Duchamp, Claude
2016-12-01
In this article, we present differentially expressed gene profiles in the pectoralis muscle of wild juvenile king penguins that were either naturally acclimated to a cold marine environment or experimentally immersed in cold water, as compared with penguin juveniles that never experienced cold water immersion. Transcriptomic data were obtained by hybridizing penguin total cDNA on Affymetrix GeneChip Chicken Genome arrays and analyzed using the maxRS algorithm, "Transcriptome analysis in non-model species: a new method for the analysis of heterologous hybridization on microarrays" (Dégletagne et al., 2010) [1]. We focused on genes involved in multiple antioxidant pathways. For better clarity, these differentially expressed genes were clustered into six functional groups according to their role in controlling redox homeostasis. The data are related to a comprehensive research study on the ontogeny of antioxidant functions in king penguins, "Hormetic response triggers multifaceted anti-oxidant strategies in immature king penguins (Aptenodytes patagonicus)" (Rey et al., 2016) [2]. The raw microarray dataset supporting the present analyses has been deposited at the Gene Expression Omnibus (GEO) repository under accessions GEO: GSE17725 and GEO: GSE82344.
Hu, Lisong; Wu, Gang; Hao, Chaoyun; Yu, Huan; Tan, Lehe
2016-07-01
Artocarpus heterophyllus Lam., commonly known as jackfruit, produces the largest tree-borne fruit known thus far. The edible part of the fruit develops from the perianths, and contains many sugar-derived compounds. However, its sugar metabolism is poorly understood. A fruit perianth transcriptome was sequenced on an Illumina HiSeq 2500 platform, producing 32,459 unigenes with an average length of 1345 nt. Sugar metabolism was characterized by comparing expression patterns of genes related to sugar metabolism and evaluating correlations with enzyme activity and sugar accumulation during fruit perianth development. During early development, high expression levels of acid invertases and corresponding enzyme activities were responsible for the rapid utilization of imported sucrose for fruit growth. The differential expression of starch metabolism-related genes and the corresponding enzyme activities accounted for starch accumulating before fruit ripening and decreasing during ripening. Sucrose accumulated during ripening, when the expression levels of genes for sucrose synthesis were elevated and high enzyme activity was observed. The comprehensive transcriptome analysis presents fundamental information on sugar metabolism and will be a useful reference for further research on fruit perianth development in jackfruit.
UniVIO: A Multiple Omics Database with Hormonome and Transcriptome Data from Rice
Sakurai, Tetsuya; Sakakibara, Hitoshi
2013-01-01
Plant hormones play important roles as signaling molecules in the regulation of growth and development by controlling the expression of downstream genes. Since the hormone signaling system represents a complex network involving functional cross-talk through the mutual regulation of signaling and metabolism, a comprehensive and integrative analysis of plant hormone concentrations and gene expression is important for a deeper understanding of hormone actions. We have developed a database named Uniformed Viewer for Integrated Omics (UniVIO: http://univio.psc.riken.jp/), which displays hormone-metabolome (hormonome) and transcriptome data in a single formatted (uniformed) heat map. At the present time, hormonome and transcriptome data obtained from 14 organ parts of rice plants at the reproductive stage and seedling shoots of three gibberellin signaling mutants are included in the database. The hormone concentration and gene expression data can be searched by substance name, probe ID, gene locus ID or gene description. A correlation search function has been implemented to enable users to obtain information of correlated substance accumulation and gene expression. In the correlation search, calculation method, range of correlation coefficient and plant samples can be selected freely. PMID:23314752
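The correlation search described in the UniVIO abstract above can be pictured with a small sketch. This is not the database's implementation: the function names, the choice of Pearson correlation as the calculation method, the default coefficient range, and the toy profiles are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, cov / (sd_x * sd_y)))

def correlation_search(query, profiles, r_min=0.9, r_max=1.0):
    """Return {name: r} for profiles whose correlation with the query lies in [r_min, r_max]."""
    hits = {}
    for name, values in profiles.items():
        r = pearson(query, values)
        if r_min <= r <= r_max:
            hits[name] = r
    return hits

# Hypothetical data: one hormone concentration profile and three gene
# expression profiles measured across the same four samples.
hormone = [1.0, 2.0, 3.0, 4.0]
genes = {
    "gene_a": [2.0, 4.0, 6.0, 8.0],
    "gene_b": [4.0, 3.0, 2.0, 1.0],
    "gene_c": [1.0, 3.0, 2.0, 4.0],
}
print(correlation_search(hormone, genes))
```

Restricting the sample set, as the abstract mentions, would amount to slicing both the query and the profiles to the chosen samples before calling the search.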
Lu, Taofeng; Sun, Yujiao; Ma, Qin; Zhu, Minghao; Liu, Dan; Ma, Jianzhang; Ma, Yuehui; Chen, Hongyan; Guan, Weijun
2016-12-01
The Siberian tiger, Panthera tigris altaica, is an endangered species, and much more work is needed to protect this species, which is still vulnerable to extinction. Conservation efforts may be supported by the genetic assessment of wild populations, for which highly specific microsatellite markers are required. However, only a limited amount of genetic sequence data is available for this species. To identify the genes involved in the lung transcriptome and to develop additional simple sequence repeat (SSR) markers for the Siberian tiger, we used high-throughput RNA-Seq to characterize the Siberian tiger transcriptome in lung tissue (designated 'PTA-lung') and a pooled tissue sample (designated 'PTA'). Approximately 47.5 % (33,187/69,836) of the lung transcriptome was annotated in four public databases (Nr, Swiss-Prot, KEGG, and COG). The annotated genes formed a potential pool for gene identification in the tiger. An analysis of the genes differentially expressed between the PTA-lung and PTA samples revealed that the tiger may have suffered a series of diseases before death. In total, 1062 non-redundant SSRs were identified in the Siberian tiger transcriptome. Forty-three primer pairs were randomly selected for amplification reactions, and 26 of the 43 pairs were also used to evaluate the levels of genetic polymorphism. Fourteen primer pairs (32.56 %) amplified products that were polymorphic in size in P. tigris altaica. In conclusion, the transcriptome sequences will provide a valuable genomic resource for genetic research, and these new SSR markers comprise a reasonable number of loci for the genetic analysis of wild and captive populations of P. tigris altaica.
Carmona, Rosario; Zafra, Adoración; Seoane, Pedro; Castro, Antonio J.; Guerrero-Fernández, Darío; Castillo-Castillo, Trinidad; Medina-García, Ana; Cánovas, Francisco M.; Aldana-Montes, José F.; Navas-Delgado, Ismael; Alché, Juan de Dios; Claros, M. Gonzalo
2015-01-01
Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we present an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues (http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads and 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from TAIR and RefSeq databases to enable Linked Data integration. This resulted in a reproductive transcriptome comprising 72,846 contigs with an average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation, and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs was identified and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access and download capability to these results. Retrieval mechanisms for sequences and transcript annotations are provided. Graphical localization of annotated enzymes into KEGG pathways is also possible. Finally, ReprOlive has included a semantic conceptualisation by means of a Resource Description Framework (RDF) allowing a Linked Data search for extracting the most updated information related to enzymes, interactions, allergens, structures, and reactive oxygen species. PMID:26322066
Luo, Hui; Xiao, Shijun; Ye, Hua; Zhang, Zhengshi; Lv, Changhuan; Zheng, Shuming; Wang, Zhiyong; Wang, Xiaoqing
2016-01-01
Schizothorax prenanti (S. prenanti) is mainly distributed in the upstream regions of the Yangtze River and its tributaries in China. This species is indigenous and commercially important. However, in recent years, wild populations and aquacultures have faced the serious challenges of germplasm variation loss and an increased susceptibility to a range of pathogens. Currently, the genetics and immune mechanisms of S. prenanti are unknown, partly due to a lack of genome and transcriptome information. Here, we sought to identify genes related to immune functions and to identify molecular markers to study the function of these genes and for trait mapping. To this end, the transcriptome from spleen tissues of S. prenanti was analyzed and sequenced. Using paired-end reads from the Illumina Hiseq2500 platform, 48,517 transcripts were isolated from the spleen transcriptome. These transcripts could be clustered into 37,785 unigenes with an N50 length of 2,539 bp. The majority of the unigenes (35,653, 94.4%) were successfully annotated using non-redundant nucleotide sequence analysis (nt), and the non-redundant protein (nr), Swiss-Prot, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. KEGG pathway assignment identified more than 500 immune-related genes. Furthermore, 7,545 putative simple sequence repeats (SSRs), 857,535 single nucleotide polymorphisms (SNPs), and 53,481 insertion/deletion (InDels) were detected from the transcriptome. This is the first reported high-throughput transcriptome analysis of S. prenanti, and it provides valuable genetic resources for the investigation of immune mechanisms, conservation of germplasm, and molecular marker-assisted breeding of S. prenanti.
Impact of Transcriptomics on Our Understanding of Pulmonary Fibrosis
Vukmirovic, Milica; Kaminski, Naftali
2018-01-01
Idiopathic pulmonary fibrosis (IPF) is a lethal fibrotic lung disease characterized by aberrant remodeling of the lung parenchyma with extensive changes to the phenotypes of all lung resident cells. The introduction of transcriptomics, genome scale profiling of thousands of RNA transcripts, caused a significant inversion in IPF research. Instead of generating hypotheses based on animal models of disease, or biological plausibility, with limited validation in humans, investigators were able to generate hypotheses based on unbiased molecular analysis of human samples and then use animal models of disease to test their hypotheses. In this review, we describe the insights made from transcriptomic analysis of human IPF samples. We describe how transcriptomic studies led to identification of novel genes and pathways involved in the human IPF lung such as: matrix metalloproteinases, WNT pathway, epithelial genes, role of microRNAs among others, as well as conceptual insights such as the involvement of developmental pathways and deep shifts in epithelial and fibroblast phenotypes. The impact of lung and transcriptomic studies on disease classification, endotype discovery, and reproducible biomarkers is also described in detail. Despite these impressive achievements, the impact of transcriptomic studies has been limited because they analyzed bulk tissue and did not address the cellular and spatial heterogeneity of the IPF lung. We discuss new emerging technologies and applications, such as single-cell RNAseq and microenvironment analysis that may address cellular and spatial heterogeneity. We end by making the point that most current tissue collections and resources are not amenable to analysis using the novel technologies. To take advantage of the new opportunities, we need new efforts of sample collections, this time focused on access to all the microenvironments and cells in the IPF lung. PMID:29670881
Hoek, Kristen L; Samir, Parimal; Howard, Leigh M; Niu, Xinnan; Prasad, Nripesh; Galassie, Allison; Liu, Qi; Allos, Tara M; Floyd, Kyle A; Guo, Yan; Shyr, Yu; Levy, Shawn E; Joyce, Sebastian; Edwards, Kathryn M; Link, Andrew J
2015-01-01
Systems biology is an approach to comprehensively study complex interactions within a biological system. Most published systems vaccinology studies have utilized whole blood or peripheral blood mononuclear cells (PBMC) to monitor the immune response after vaccination. Because human blood is comprised of multiple hematopoietic cell types, the potential for masking responses of under-represented cell populations is increased when analyzing whole blood or PBMC. To investigate the contribution of individual cell types to the immune response after vaccination, we established a rapid and efficient method to purify human T and B cells, natural killer (NK) cells, myeloid dendritic cells (mDC), monocytes, and neutrophils from fresh venous blood. Purified cells were fractionated and processed in a single day. RNA-Seq and quantitative shotgun proteomics were performed to determine expression profiles for each cell type prior to and after inactivated seasonal influenza vaccination. Our results show that transcriptomic and proteomic profiles generated from purified immune cells differ significantly from PBMC. Differential expression analysis for each immune cell type also shows unique transcriptomic and proteomic expression profiles as well as changing biological networks at early time points after vaccination. This cell type-specific information provides a more comprehensive approach to monitor vaccine responses.
Wu, Jing-Shan; Lo, Hsin-Yi; Li, Chia-Cheng; Chen, Feng-Yuan; Hsiang, Chien-Yun; Ho, Tin-Yun
2017-08-15
Electroacupuncture (EA) has been applied to treat and prevent diseases for years. However, the molecular events occurring in both the acupunctured site and the internal organs after EA stimulation have not been clarified. Here we applied transcriptomic analysis to explore the gene expression signatures after EA stimulation. Mice received EA stimulation at ST36 for 15 min, and nine tissues were collected three hours later for microarray analysis. We found that EA affected the expression of genes not only in the acupunctured site but also in the internal organs. EA commonly affected biological networks involved in cytoskeleton and cell adhesion, and also regulated unique process networks in specific organs, such as γ-aminobutyric acid-ergic neurotransmission in brain and inflammation process in lung. In addition, EA affected the expression of genes related to various diseases, such as neurodegenerative diseases in brain and obstructive pulmonary diseases in lung. This report applied, for the first time, a global comprehensive genome-wide approach to analyze the gene expression profiles of the acupunctured site and internal organs after EA stimulation. The connection between gene expression signatures, biological processes, and diseases might provide a basis for predicting and explaining the therapeutic potential of acupuncture in organs.
Chen, Ziyi; Quan, Lijun; Huang, Anfei; Zhao, Qiang; Yuan, Yao; Yuan, Xuye; Shen, Qin; Shang, Jingzhe; Ben, Yinyin; Qin, F Xiao-Feng; Wu, Aiping
2018-01-01
The RNA sequencing approach has been broadly used to provide gene-, pathway-, and network-centric analyses for various cell and tissue samples. However, thus far, the rich cellular information carried in tissue samples has not been thoroughly characterized from RNA-Seq data. Incorporating a cell-centric view of the tissue transcriptome would therefore expand our understanding of the biological processes of the body. Here, a computational model named seq-ImmuCC was developed to infer the relative proportions of 10 major immune cells in mouse tissues from RNA-Seq data. The performance of seq-ImmuCC was evaluated among multiple computational algorithms, transcriptional platforms, and simulated and experimental datasets. The test results showed its stable performance and superb consistency with experimental observations under different conditions. With seq-ImmuCC, we generated the comprehensive landscape of immune cell compositions in 27 normal mouse tissues and extracted the distinct signatures of immune cell proportion among various tissue types. Furthermore, we quantitatively characterized and compared 18 different types of mouse tumor tissues of distinct cell origins with their immune cell compositions, which provided a comprehensive and informative measurement for the immune microenvironment inside tumor tissues. The online server of seq-ImmuCC is freely available at http://wap-lab.org:3200/immune/.
Babineau, Marielle; Mahmood, Khalid; Mathiassen, Solvejg K; Kudsk, Per; Kristensen, Michael
2017-02-06
Loose silky bentgrass (Apera spica-venti) is an important weed in Europe with a recent increase in herbicide resistance cases. The lack of genetic information about this noxious weed limits its biological understanding such as growth, reproduction, genetic variation, molecular ecology and metabolic herbicide resistance. This study produced a reference transcriptome for A. spica-venti from different tissues (leaf, root, stem) and various growth stages (seed at phenological stages 05, 07, 08, 09). The de novo assembly was performed on individual and combined dataset followed by functional annotations. Individual transcripts and gene families involved in metabolic based herbicide resistance were identified. Eight separate transcriptome assemblies were performed and compared. The combined transcriptome assembly consists of 83,349 contigs with an N50 and average contig length of 762 and 658 bp, respectively. This dataset contains 74,724 transcripts consisting of total 54,846,111 bp. Among them 94% had a homologue to UniProtKB, 73% retrieved a GO mapping, and 50% were functionally annotated. Compared with other grass species, A. spica-venti has 26% proteins in common to Brachypodium distachyon, and 41% to Lolium spp. Glycosyltransferases had the highest number of transcripts in each tissue followed by the cytochrome P450s. The GSTF1 and CYP89A2 transcripts were recovered from the majority of tissues and aligned at a maximum of 66 and 30% to proven herbicide resistant allele from Alopecurus myosuroides and Lolium rigidum, respectively. De novo transcriptome assembly enabled the generation of the first reference transcriptome of A. spica-venti. This can serve as stepping stone for understanding the metabolic herbicide resistance as well as the general biology of this problematic weed. Furthermore, this large-scale sequence data is a valuable scientific resource for comparative transcriptome analysis for Poaceae grasses.
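The N50 and average contig length reported in the abstract above are standard assembly summary statistics. A minimal sketch of how they are commonly computed follows; the function name and the toy contig lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")

# Hypothetical contig lengths in bp (not data from the abstract).
toy_lengths = [100, 200, 300, 400, 500]
print("N50:", n50(toy_lengths))
print("mean length:", sum(toy_lengths) / len(toy_lengths))
```

Because N50 weights long contigs more heavily than the mean does, it usually exceeds the average contig length, as it does for the assembly described above (762 bp versus 658 bp).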
Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes.
Kumar, Vikas; Kutschera, Verena E; Nilsson, Maria A; Janke, Axel
2015-08-07
The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated and analyzed for signatures of positive selection. In addition, the data allowed for a phylogenetic analysis and divergence time estimate between the two fox species. The de novo assembly of reads resulted in more than 160,000 contigs/transcripts per individual. Approximately 17,000 homologous genes were identified using human and the non-redundant databases. Positive selection analyses revealed several genes involved in various metabolic and molecular processes such as energy metabolism, cardiac gene regulation, apoptosis and blood coagulation to be under positive selection in foxes. Branch site tests identified four genes to be under positive selection in the Arctic fox transcriptome, two of which are fat metabolism genes. In the red fox transcriptome eight genes are under positive selection, including molecular process genes, notably genes involved in ATP metabolism. Analysis of the three transcriptomes and five Sanger re-sequenced genes in additional individuals identified a lower genetic variability within Arctic foxes compared to red foxes, which is consistent with distribution range differences and demographic responses to past climatic fluctuations. A phylogenomic analysis estimated that the Arctic and red fox lineages diverged about three million years ago. Transcriptome data are an economic way to generate genomic resources for evolutionary studies. 
Despite not representing an entire genome, this transcriptome analysis identified numerous genes that are relevant to arctic adaptation in foxes. Fat metabolism seems to play a central role in the adaptation of Arctic foxes to the cold climate, as has been identified in the polar bear, another arctic specialist.
Zhang, X J; Jiang, H Y; Li, L M; Yuan, L H; Chen, J P
2016-06-20
The aim of this study was to provide comprehensive insights into the genetic background of sturgeon through a transcriptome study. We performed a de novo assembly of the Amur sturgeon Acipenser schrenckii transcriptome using Illumina Hiseq 2000 sequencing. A total of 148,817 non-redundant unigenes were obtained, with a combined length of approximately 121,698,536 bp and individual lengths ranging from 201 to 26,789 bp. All the unigenes were classified into 3368 distinct categories and 145,449 singletons by homologous transcript cluster analysis. In all, 46,865 (31.49%) unigenes showed homologous matches with the Nr database and 32,214 (21.65%) unigenes were matched to the Nt database. In total, 24,862 unigenes were categorized into 52 significantly enriched functional groups by GO analysis, 38,436 unigenes were classified into 25 groups by KOG prediction, and 45,598 unigenes were assigned to 128 enriched KEGG pathways (P < 0.05). Subsequently, a total of 19,860 SSR markers were identified, the most abundant type being di-nucleotide repeats (10,658; 53.67%) and the most frequent motif being AT/TA (2689; 13.54%). A total of 1341 conserved lncRNAs were identified by a customized pipeline. Our study provides new sequence and function information for A. schrenckii, which will be the basis for further genetic studies on sturgeon species. The large number of potential SSRs and putatively conserved lncRNAs identified from the transcriptome will also support research in many fields, including the evolution, conservation management, and biological processes of sturgeon.
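The SSR counts in the abstract above come from scanning unigene sequences for short tandem repeats. The sketch below shows the general idea with a regular expression; it is not the pipeline used in the study, and the function name, the minimum of six repeat units, and the test sequence are assumptions chosen for illustration.

```python
import re

def find_ssrs(seq, unit_len=2, min_repeats=6):
    """Find perfect tandem repeats of a unit_len-bp motif.

    Returns a list of (start, motif, repeat_count) tuples.
    The min_repeats threshold is an assumed, adjustable cutoff.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAA...
        hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits

# Hypothetical sequence containing seven AT units flanked by other bases.
example = "GGC" + "AT" * 7 + "CCG"
print(find_ssrs(example))
```

Calling the function with `unit_len=3` or higher would cover tri-nucleotide and longer motifs, typically with a lower `min_repeats` cutoff.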
De Novo Transcriptome Analysis of Allium cepa L. (Onion) Bulb to Identify Allergens and Epitopes
Rajkumar, Hemalatha; Ramagoni, Ramesh Kumar; Anchoju, Vijayendra Chary; Vankudavath, Raju Naik; Syed, Arshi Uz Zaman
2015-01-01
Allium cepa (onion) is a diploid plant with one of the largest nuclear genomes among all diploids. Onion is an example of an under-researched crop which has a complex heterozygous genome. No allergenic protein data or genomic data are available for onions. This study was conducted to establish a transcriptome catalogue of onion bulb that will enable us to study onion related genes involved in medicinal use and allergies. Transcriptome dataset generated from onion bulb using the Illumina HiSeq 2000 technology showed a total of 99,074,309 high quality raw reads (~20 Gb). Based on sequence homology, onion genes were categorized into 49 different functional groups. Most of the genes, however, were classified under 'unknown' in all three gene ontology categories. Of the categorized genes, 61.2% showed metabolic functions followed by cellular components such as binding, cellular processes; catalytic activity and cell part. With BLASTx top hit analysis, a total of 2,511 homologous allergenic sequences were found, which had 37–100% similarity with 46 different types of allergens existing in the database. From the 46 contigs or allergens, 521 B-cell linear epitopes were identified using the BepiPred linear epitope prediction tool. This is the first comprehensive insight into the transcriptome of onion bulb tissue using NGS technology, which can be used to map IgE epitopes and to predict the structures and functions of various proteins. PMID:26284934
Valenzuela-Muñoz, Valentina; Sturm, Armin; Gallardo-Escárate, Cristian
2015-04-09
The ATP-binding cassette (ABC) protein family comprises membrane proteins involved in the transport of various biomolecules across the cellular membrane. These proteins have been identified in all taxa and perform important physiological functions, including the process of insecticide detoxification in arthropods. For that reason, the ectoparasite Caligus rogercresseyi represents a model species for understanding the molecular underpinnings involved in insecticide drug resistance. Illumina sequencing was performed using sea lice exposed to 2 and 3 ppb of deltamethrin and azamethiphos. Contigs obtained from de novo assembly were annotated by Blastx. RNA-Seq analysis was performed and validated by qPCR analysis. From the transcriptome database of C. rogercresseyi, 57 putative members of ABC protein sequences were identified and phylogenetically classified into the eight subfamilies described for ABC transporters in arthropods. Transcriptomic profiles for ABC protein subfamilies were evaluated throughout C. rogercresseyi development. Moreover, RNA-Seq analysis was performed for adult male and female salmon lice exposed to the delousing drugs azamethiphos and deltamethrin. High transcript levels of the ABCB and ABCC subfamilies were evidenced. Furthermore, SNP mining was carried out for the ABC protein sequences, revealing pivotal genomic information. The present study gives a comprehensive transcriptome analysis of ABC proteins from C. rogercresseyi, providing relevant information about transporter roles during ontogeny and in relation to delousing drug responses in salmon lice. This genomic information represents a valuable tool for pest management in the Chilean salmon aquaculture industry.
Chen, Hanting; Deng, Cao; Nie, Hu; Fan, Gang; He, Yang
2017-01-01
Coptis chinensis Franch., the Chinese goldthread ('Weilian' in Chinese), is one of the most important medicinal plants from the family Ranunculaceae, and its rhizome has been widely used in Traditional Chinese Medicine for centuries. Here, we analyzed the chemical components and the transcriptome of the Chinese goldthread from three biotopes, including Zhenping, Zunyi and Shizhu. We built comprehensive, high-quality de novo transcriptome assemblies of the Chinese goldthread from short-read RNA-Sequencing data, obtaining 155,710 transcripts and 56,071 unigenes. More than 98.39% and 95.97% of core eukaryotic genes were found in the transcripts and unigenes respectively, indicating that this unigene set captures the majority of the coding genes. A total of 520,462, 493,718, and 507,247 heterozygous SNPs were identified in the three accessions from Zhenping, Zunyi, and Shizhu respectively, indicating high polymorphism in coding regions of the Chinese goldthread (∼1%). Chemical analyses of the rhizome identified six major components, including berberine, palmatine, coptisine, epiberberine, columbamine, and jatrorrhizine. Berberine has the highest concentration, followed sequentially by coptisine, palmatine, and epiberberine in all three accessions. The drug quality of the accession from Shizhu may be the highest among these accessions. Differential analyses of the transcriptome identified four pivotal candidate enzymes, including aspartate aminotransferase protein, polyphenol oxidase, primary-amine oxidase, and tyrosine decarboxylase, that were significantly differentially expressed and may be responsible for the difference in alkaloid contents among the accessions from different biotopes.
Song, Yan-Chun; Yu, Dan
2014-10-01
With the development of society and the economy, the contradictions among population, resources and environment are becoming increasingly severe. As a result, the capacity of resources and environment has become one of the focal issues for many countries and regions. Through investigating and analyzing the present situation and the existing problems of resources and environment in Poyang Lake Eco-economic Zone, seven factors were chosen as the evaluation criterion layer, namely, land resources, water resources, biological resources, mineral resources, ecological-geological environment, water environment and atmospheric environment. Based on the single factor evaluation results and with the county as the evaluation unit, the comprehensive capacity of resources and environment in Poyang Lake Eco-economic Zone was evaluated by using the state space method. The results showed that the zone boasted abundant biological resources, a high-quality atmosphere and water environment, and a relatively stable geological environment, while it was restricted by land, water and mineral resources. Currently, although the comprehensive capacity of the resources and environment in Poyang Lake Eco-economic Zone was not overloaded as a whole, it was overloaded in some counties/districts. The state space model, with clear indication and high accuracy, could serve as another approach to evaluating the comprehensive capacity of regional resources and environment.
Li, Yunfeng; Zhou, Zunchun; Tian, Meilin; Tian, Yi; Dong, Ying; Li, Shilei; Liu, Weidong; He, Chongbo
2017-08-01
In this study, single nucleotide polymorphism (SNP), microsatellite (SSR) and differentially expressed genes (DEGs) in the oral parts, gonads, and umbrella parts of the jellyfish Rhopilema esculentum were analyzed by RNA-Seq technology. A total of 76.4 million raw reads and 72.1 million clean reads were generated from deep sequencing. Approximately 119,874 tentative unigenes and 149,239 transcripts were obtained. A total of 1,034,708 SNP markers were detected in the three tissues. For microsatellite mining, 5088 SSRs were identified from the unigene sequences. The most frequent repeat motifs were mononucleotide repeats, which accounted for 61.93%. Transcriptome comparison of the three tissues yielded a total of 8841 DEGs, of which 3560 were up-regulated and 5281 were down-regulated. This study represents the greatest sequencing effort carried out for a jellyfish and provides the first high-throughput transcriptomic resource for jellyfish. Copyright © 2017 Elsevier B.V. All rights reserved.
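The abstract above does not say which tool or thresholds were used for microsatellite mining, so the following is only a minimal Python sketch of how perfect SSRs can be scanned for in a unigene sequence. The function name `find_ssrs`, the toy sequence, and the minimum repeat counts in `MIN_REPEATS` are illustrative assumptions, not values taken from the study.

```python
import re

# Assumed minimum repeat counts per motif length (mono- to hexanucleotide).
# These thresholds are illustrative; the abstract does not report them.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT",
            # so that each run is reported once under its shortest motif.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical toy sequence: a 7-copy AT repeat and a 12-base poly-A run.
toy = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "TTG"
for start, motif, count in find_ssrs(toy):
    print(start, motif, count)
```

Counting the hits by motif length across all unigenes would give the kind of breakdown reported above (for example, the share of mononucleotide repeats), although real pipelines usually also handle compound and imperfect repeats.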
A large-scale full-length cDNA analysis to explore the budding yeast transcriptome
Miura, Fumihito; Kawaguchi, Noriko; Sese, Jun; Toyoda, Atsushi; Hattori, Masahira; Morishita, Shinichi; Ito, Takashi
2006-01-01
We performed a large-scale cDNA analysis to explore the transcriptome of the budding yeast Saccharomyces cerevisiae. We sequenced two cDNA libraries, one from the cells exponentially growing in a minimal medium and the other from meiotic cells. Both libraries were generated by using a vector-capping method that allows the accurate mapping of transcription start sites (TSSs). Consequently, we identified 11,575 TSSs associated with 3,638 annotated genomic features, including 3,599 ORFs, to suggest that most yeast genes have two or more TSSs. In addition, we identified 45 previously undescribed introns, including those affecting current ORF annotations and those spliced alternatively. Furthermore, the analysis revealed 667 transcription units in the intergenic regions and transcripts derived from antisense strands of 367 known features. We also found that 348 ORFs carry TSSs in their 3′-halves to generate sense transcripts starting from inside the ORFs. These results indicate that the budding yeast transcriptome is considerably more complex than previously thought, and it shares many recently revealed characteristics with the transcriptomes of mammals and other higher eukaryotes. Thus, the genome-wide active transcription that generates novel classes of transcripts appears to be an intrinsic feature of the eukaryotic cells. The budding yeast will serve as a versatile model for the studies on these aspects of transcriptome, and the full-length cDNA clones can function as an invaluable resource in such studies. PMID:17101987
Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng
2016-01-01
Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides.
Duan, Xinle; Wang, Kang; Su, Sha; Tian, Ruizheng; Li, Yuting; Chen, Maohua
2017-01-01
The bird cherry-oat aphid, Rhopalosiphum padi (L.), is one of the most abundant aphid pests of cereals and has a global distribution. Next-generation sequencing (NGS) is a rapid and efficient method for developing molecular markers. However, transcriptomic and genomic resources of R. padi have not been investigated. In this study, we used transcriptome information obtained by RNA-Seq to develop polymorphic microsatellites for investigating population genetics in this species. The transcriptome of R. padi was sequenced on an Illumina HiSeq 2000 platform. A total of 114.4 million raw reads with a GC content of 40.03% was generated. The raw reads were cleaned and assembled into 29,467 unigenes with an N50 length of 1,580 bp. Using several public databases, 82.47% of these unigenes were annotated. Of the annotated unigenes, 8,022 were assigned to COG pathways, 9,895 were assigned to GO pathways, and 14,586 were mapped to 257 KEGG pathways. A total of 7,936 potential microsatellites were identified in 5,564 unigenes, 60 of which were selected randomly and amplified using specific primer pairs. Fourteen loci were found to be polymorphic in the four R. padi populations. The transcriptomic data presented herein will facilitate gene discovery, gene analyses, and development of molecular markers for future studies of R. padi and other closely related aphid species.
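The N50 length quoted above is a standard assembly summary statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. As a minimal sketch (the function name `n50` and the toy lengths are hypothetical, not data from the study), it can be computed as follows:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, not the R. padi unigene set.
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
print(n50(toy_lengths))
```

Because N50 weights long sequences heavily, it is usually read alongside the number of unigenes and the total assembled length rather than on its own.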
Loftus, Stacie K
2018-05-01
The number of melanocyte- and melanoma-derived next generation sequence genome-scale datasets has rapidly expanded over the past several years. This resource guide provides a summary of publicly available sources of melanocyte cell derived whole genome, exome, mRNA and miRNA transcriptome, chromatin accessibility and epigenetic datasets. Also highlighted are bioinformatic resources and tools for visualization and data queries which allow researchers a genome-scale view of the melanocyte. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
Ribas, Laia; Pardo, Belén G; Fernández, Carlos; Alvarez-Diós, José Antonio; Gómez-Tato, Antonio; Quiroga, María Isabel; Planas, Josep V; Sitjà-Bobadilla, Ariadna; Martínez, Paulino; Piferrer, Francesc
2013-03-15
Background Genomic resources for plant and animal species that are under exploitation primarily for human consumption are increasingly important, among other things, for understanding physiological processes and for establishing adequate genetic selection programs. Current available techniques for high-throughput sequencing have been implemented in a number of species, including fish, to obtain a proper description of the transcriptome. The objective of this study was to generate a comprehensive transcriptomic database in turbot, a highly priced farmed fish species in Europe, with potential expansion to other areas of the world, for which there are unsolved production bottlenecks, to understand better reproductive- and immune-related functions. This information is essential to implement marker assisted selection programs useful for the turbot industry. Results Expressed sequence tags were generated by Sanger sequencing of cDNA libraries from different immune-related tissues after several parasitic challenges. The resulting database (“Turbot 2 database”) was enlarged with sequences generated from a 454 sequencing run of brain-hypophysis-gonadal axis-derived RNA obtained from turbot at different development stages. The assembly of Sanger and 454 sequences generated 52,427 consensus sequences (“Turbot 3 database”), of which 23,661 were successfully annotated. A total of 1,410 sequences were confirmed to be related to reproduction and key genes involved in sex differentiation and maturation were identified for the first time in turbot (AR, AMH, SRY-related genes, CYP19A, ZPGs, STAR FSHR, etc.). Similarly, 2,241 sequences were related to the immune system and several novel key immune genes were identified (BCL, TRAF, NCK, CD28 and TOLLIP, among others). The number of genes of many relevant reproduction- and immune-related pathways present in the database was 50–90% of the total gene count of each pathway. 
In addition, 1,237 microsatellites and 7,362 single nucleotide polymorphisms (SNPs) were also compiled. Further, 2,976 putative natural antisense transcripts (NATs) including microRNAs were also identified. Conclusions The combined sequencing strategies employed here significantly increased the turbot genomic resources available, including 34,400 novel sequences. The generated database contains a larger number of genes relevant for reproduction- and immune-associated studies, with an excellent coverage of most genes present in many relevant physiological pathways. This database also allowed the identification of many microsatellites and SNP markers that will be very useful for population and genome screening and a valuable aid in marker assisted selection programs. PMID:23497389
Use of archival resources has been limited to date by inconsistent methods for genomic profiling of degraded RNA from formalin-fixed paraffin-embedded (FFPE) samples. RNA-sequencing offers a promising way to address this problem. Here we evaluated transcriptomic dose responses us...
Integrated and translational genomics for analysis of complex traits in crops
USDA-ARS's Scientific Manuscript database
We report here on integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of translating gems from these resources into useable DNA markers in the ...
Castoe, Todd A; de Koning, Jason A P; Hall, Kathryn T; Yokoyama, Ken D; Gu, Wanjun; Smith, Eric N; Feschotte, Cédric; Uetz, Peter; Ray, David A; Dobry, Jason; Bogden, Robert; Mackessy, Stephen P; Bronikowski, Anne M; Warren, Wesley C; Secor, Stephen M; Pollock, David D
2011-07-28
The Consortium for Snake Genomics is in the process of sequencing the genome and creating transcriptomic resources for the Burmese python. Here, we describe how this will be done, what analyses this work will include, and provide a timeline.
USDA-ARS's Scientific Manuscript database
Swertia mussotii Franch. is an important traditional Tibetan medicinal plant with pharmacological properties useful for the treatment of various ailments, such as hepatitis. Secoiridoids, including swertiamarin, are the major bioactive compounds in S. mussotii. The development of genomic resources ...
Tian, Fei; Zhao, Kai
2017-01-01
Environmental acclimation is an important episode in wildlife occupation of the high-altitude Tibetan Plateau (TP). Transcriptome-wide studies on the mechanism of thermal acclimation are rare for high-altitude Tibetan Plateau fish. Thus, we used mRNA and miRNA transcriptome sequencing to investigate the regulation of thermal acclimation in larval Tibetan naked carp, Gymnocypris przewalskii. We first remodeled the regulation network of mRNA and miRNA in thermal acclimation, and then identified differential expression of miRNAs and target mRNAs enriched in metabolic and digestive pathways. Interestingly, we identified two candidate genes that contribute to normal skeletal development. The altered expression of these gene groups could potentially be associated with the developmental issues of deformity and induced larval death. Our results have three important implications: first, these findings provide strong evidence to support our hypothesis that G. przewalskii possesses the ability to build heat tolerance, which has been a controversial issue. Second, this study shows that transcriptional and post-transcriptional regulation is extensively involved in thermal acclimation. Third, the integrated mRNA and microRNA transcriptome analyses provide a large number of valuable genetic resources for future studies on environmental stress response in G. przewalskii and serve as a case study in Tibetan Schizothoracine fish. PMID:29045433
Hwang, Young Sun; Seo, Minseok; Choi, Hee Jung; Kim, Sang Kyung; Kim, Heebal; Han, Jae Yong
2018-04-01
The chicken is a valuable model organism, especially in evolutionary and embryology research because its embryonic development occurs in the egg. However, despite its scientific importance, no transcriptome data have been generated for deciphering the early developmental stages of the chicken because of practical and technical constraints in accessing pre-oviposited embryos. Here, we determine for the first time the entire transcriptome of pre-oviposited avian embryos, including oocyte, zygote, and intrauterine embryos from Eyal-Giladi and Kochav stage I (EGK.I) to EGK.X, collected using a noninvasive approach. We also compare RNA-sequencing data obtained using bulked embryo sequencing and single embryo/cell sequencing techniques. The raw sequencing data were preprocessed with two genome builds, Galgal4 and Galgal5, and the expression of 17,108 and 26,102 genes was quantified in the respective builds. There were some differences between the two techniques, as well as between the two genome builds, and these were affected by the emergence of long intergenic noncoding RNA annotations. The first transcriptome datasets of pre-oviposited early chicken embryos based on bulked and single embryo sequencing techniques will serve as a valuable resource for investigating early avian embryogenesis, for comparative studies among vertebrates, and for novel gene annotation in the chicken genome.
Varshney, Rajeev K; Mohan, S Murali; Gaur, Pooran M; Gangarao, N V P R; Pandey, Manish K; Bohra, Abhishek; Sawargaonkar, Shrikant L; Chitikineni, Annapurna; Kimurto, Paul K; Janila, Pasupuleti; Saxena, K B; Fikre, Asnake; Sharma, Mamta; Rathore, Abhishek; Pratap, Aditya; Tripathi, Shailesh; Datta, Subhojit; Chaturvedi, S K; Mallikarjuna, Nalini; Anuradha, G; Babbar, Anita; Choudhary, Arbind K; Mhase, M B; Bharadwaj, Ch; Mannur, D M; Harer, P N; Guo, Baozhu; Liang, Xuanqiang; Nadarajan, N; Gowda, C L L
2013-12-01
Advances in next-generation sequencing and genotyping technologies have enabled generation of large-scale genomic resources such as molecular markers, transcript reads and BAC-end sequences (BESs) in chickpea, pigeonpea and groundnut, three major legume crops of the semi-arid tropics. Comprehensive transcriptome assemblies and genome sequences have either been developed or are underway in these crops. Based on these resources, dense genetic maps, QTL maps as well as physical maps for these legume species have also been developed. As a result, these crops have graduated from 'orphan' or 'less-studied' crops to 'genomic resources rich' crops. This article summarizes the above-mentioned advances in genomics and genomics-assisted breeding applications in the form of marker-assisted selection (MAS) for hybrid purity assessment in pigeonpea; marker-assisted backcrossing (MABC) for introgressing QTL regions for drought-tolerance related traits, Fusarium wilt (FW) resistance and Ascochyta blight (AB) resistance in chickpea; and late leaf spot (LLS), leaf rust and nematode resistance in groundnut. We critically present the case for the use of other modern breeding approaches like marker-assisted recurrent selection (MARS) and genomic selection (GS) to utilize the full potential of genomics-assisted breeding for developing superior cultivars with enhanced tolerance to various environmental stresses. In addition, this article recommends the use of advanced-backcross (AB-backcross) breeding and development of specialized populations such as multi-parent advanced generation intercross (MAGIC) for creating new variations that will help in developing superior lines with a broadened genetic base. In summary, we propose the use of an integrated genomics and breeding approach in these legume crops to enhance crop productivity in marginal environments, ensuring food security in developing countries. Copyright © 2012 Elsevier Inc. All rights reserved.