Science.gov

Sample records for genome scale transcriptome

  1. Genome Scale Transcriptomics of Baculovirus-Insect Interactions

    PubMed Central

    Nguyen, Quan; Nielsen, Lars K.; Reid, Steven

    2013-01-01

    Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies. PMID:24226166

  2. Genome scale transcriptomics of baculovirus-insect interactions.

    PubMed

    Nguyen, Quan; Nielsen, Lars K; Reid, Steven

    2013-11-12

    Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.

  3. Methods for integration of transcriptomic data in genome-scale metabolic models

    PubMed Central

    Kim, Min Kyung; Lun, Desmond S.

    2014-01-01

    Several computational methods have been developed that integrate transcriptomic data with genome-scale metabolic reconstructions to infer condition-specific system-wide intracellular metabolic flux distributions. In this mini-review, we describe each of these methods published to date, categorizing them based on four different grouping criteria (requirement for multiple gene expression datasets as input, requirement for a threshold to define a gene's high and low expression, requirement for an a priori assumption of an appropriate objective function, and validation of predicted fluxes directly against measured intracellular fluxes). Then, we recommend which group of methods would be more suitable from a practical perspective. PMID:25379144

  4. Genome-scale transcriptome analysis in response to nitric oxide in birch cells: implications of the triterpene biosynthetic pathway.

    PubMed

    Zeng, Fansuo; Sun, Fengkun; Li, Leilei; Liu, Kun; Zhan, Yaguang

    2014-01-01

    Evidence supporting nitric oxide (NO) as a mediator of plant biochemistry continues to grow, but its functions at the molecular level remain poorly understood and, in some cases, controversial. To study the role of NO at the transcriptional level in Betula platyphylla cells, we conducted a genome-scale transcriptome analysis of these cells. The transcriptomes of untreated birch cells and of cells treated with sodium nitroprusside (SNP) were analyzed using Solexa sequencing. Data were collected by sequencing cDNA libraries of birch cells, which had a long period to adapt to the suspension culture conditions before SNP-treated cells and untreated cells were sampled. Among the 34,100 UniGenes detected, BLASTX search revealed that 20,631 genes showed significant (E-value ≤ 10^-5) sequence similarity with proteins from the NR-database. Numerous expressed sequence tags (i.e., 1374) were identified as differentially expressed between the 12 h SNP-treated cells and control cell samples: 403 up-regulated and 971 down-regulated. From this, we specifically examined a core set of NO-related transcripts. The altered expression levels of several transcripts, as determined by transcriptome analysis, were confirmed by qRT-PCR. The results of transcriptome analysis, gene expression quantification, the content of triterpenoid and activities of defensive enzymes showed that NO has a significant effect on many processes including triterpenoid production, carbohydrate metabolism and cell wall biosynthesis.

  5. Genome-Scale Transcriptome Analysis in Response to Nitric Oxide in Birch Cells: Implications of the Triterpene Biosynthetic Pathway

    PubMed Central

    Zeng, Fansuo; Sun, Fengkun; Li, Leilei; Liu, Kun; Zhan, Yaguang

    2014-01-01

    Evidence supporting nitric oxide (NO) as a mediator of plant biochemistry continues to grow, but its functions at the molecular level remain poorly understood and, in some cases, controversial. To study the role of NO at the transcriptional level in Betula platyphylla cells, we conducted a genome-scale transcriptome analysis of these cells. The transcriptomes of untreated birch cells and of cells treated with sodium nitroprusside (SNP) were analyzed using Solexa sequencing. Data were collected by sequencing cDNA libraries of birch cells, which had a long period to adapt to the suspension culture conditions before SNP-treated cells and untreated cells were sampled. Among the 34,100 UniGenes detected, BLASTX search revealed that 20,631 genes showed significant (E-value ≤ 10^-5) sequence similarity with proteins from the NR-database. Numerous expressed sequence tags (i.e., 1374) were identified as differentially expressed between the 12 h SNP-treated cells and control cell samples: 403 up-regulated and 971 down-regulated. From this, we specifically examined a core set of NO-related transcripts. The altered expression levels of several transcripts, as determined by transcriptome analysis, were confirmed by qRT-PCR. The results of transcriptome analysis, gene expression quantification, the content of triterpenoid and activities of defensive enzymes showed that NO has a significant effect on many processes including triterpenoid production, carbohydrate metabolism and cell wall biosynthesis. PMID:25551661

  6. Genome-Scale Transcriptome Analysis of the Desert Shrub Artemisia sphaerocephala

    PubMed Central

    Zhang, Lijing; Hu, Xiaowei; Miao, Xiumei; Chen, Xiaolong; Nan, Shuzhen; Fu, Hua

    2016-01-01

    Background: Artemisia sphaerocephala, a semi-shrub belonging to the Artemisia genus of the Compositae family, is an important pioneer plant that inhabits moving and semi-stable sand dunes in the deserts and steppes of northwest and north-central China. It is very resilient in extreme environments. Additionally, its seeds have excellent nutritional value, and the abundant lipids and polysaccharides in the seeds make this plant a potential valuable source of bio-energy. However, partly due to the scarcity of genetic information, the genetic mechanisms controlling the traits and environmental adaptation capacity of A. sphaerocephala are unknown. Results: Here, we present the first in-depth transcriptomic analysis of A. sphaerocephala. To maximize the representation of conditional transcripts, mRNA was obtained from 17 samples, including living tissues of desert-growing A. sphaerocephala, seeds germinated in the laboratory, and calli subjected to no stress (control) and high and low temperature, high and low osmotic, and salt stresses. De novo transcriptome assembly performed using an Illumina HiSeq 2500 platform resulted in the generation of 68,373 unigenes. We analyzed the key genes involved in the unsaturated fatty acid synthesis pathway and identified 26 A. sphaerocephala fad2 genes, which is the largest fad2 gene family reported to date. Furthermore, a set of genes responsible for resistance to extreme temperatures, salt, drought and a combination of stresses was identified. Conclusion: The present work provides abundant genomic information for functional dissection of the important traits of A. sphaerocephala and contributes to the current understanding of molecular adaptive mechanisms of A. sphaerocephala in the desert environment. Identification of the key genes in the unsaturated fatty acid synthesis pathway could increase understanding of the biological regulatory mechanisms of fatty acid composition traits in plants and facilitate genetic manipulation of the

  7. Genome-scale DNA methylome and transcriptome profiling of human neutrophils.

    PubMed

    Chatterjee, Aniruddha; Stockwell, Peter A; Rodger, Euan J; Morison, Ian M

    2016-01-01

    Methylation of DNA molecules is a key mechanism associated with human disease, altered gene expression and phenotype. Using reduced representation bisulphite sequencing (RRBS) technology we have analysed DNA methylation patterns in healthy individuals and identified genes showing significant inter-individual variation. Further, using whole genome transcriptome analysis (RNA-Seq) on the same individuals we showed a local and specific relationship of exon inclusion and variable DNA methylation pattern. For RRBS, 363 million, 100-bp reads were generated from 13 samples using Illumina GAII and HiSeq2000 platforms. Here we also present additional RRBS data for a female pair of monozygotic twins that was not described in our original publication. Further, we performed RNA-Seq on four of these individuals, generating 174 million, 51-bp high quality reads on an Illumina HiSeq2000 platform. The current data set could be exploited as a comprehensive resource for understanding the nature and mechanism of variable phenotypic traits and altered disease susceptibility due to variable DNA methylation and gene expression patterns in healthy individuals. PMID:26978482

  8. Genome-scale DNA methylome and transcriptome profiling of human neutrophils

    PubMed Central

    Chatterjee, Aniruddha; Stockwell, Peter A.; Rodger, Euan J.; Morison, Ian M.

    2016-01-01

    Methylation of DNA molecules is a key mechanism associated with human disease, altered gene expression and phenotype. Using reduced representation bisulphite sequencing (RRBS) technology we have analysed DNA methylation patterns in healthy individuals and identified genes showing significant inter-individual variation. Further, using whole genome transcriptome analysis (RNA-Seq) on the same individuals we showed a local and specific relationship of exon inclusion and variable DNA methylation pattern. For RRBS, 363 million, 100-bp reads were generated from 13 samples using Illumina GAII and HiSeq2000 platforms. Here we also present additional RRBS data for a female pair of monozygotic twins that was not described in our original publication. Further, we performed RNA-Seq on four of these individuals, generating 174 million, 51-bp high quality reads on an Illumina HiSeq2000 platform. The current data set could be exploited as a comprehensive resource for understanding the nature and mechanism of variable phenotypic traits and altered disease susceptibility due to variable DNA methylation and gene expression patterns in healthy individuals. PMID:26978482

  9. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales

    PubMed Central

    2012-01-01

    Background: To date, exon capture has largely been restricted to species with fully sequenced genomes, which has precluded its application to lineages that lack high quality genomic resources. We developed a novel strategy for designing array-based exon capture in chipmunks (Tamias) based on de novo transcriptome assemblies. We evaluated the performance of our approach across specimens from four chipmunk species. Results: We selectively targeted 11,975 exons (~4 Mb) on custom capture arrays, and enriched over 99% of the targets in all libraries. The percentage of aligned reads was highly consistent (24.4-29.1%) across all specimens, including in multiplexing up to 20 barcoded individuals on a single array. Base coverage among specimens and within targets in each species library was uniform, and the performance of targets among independent exon captures was highly reproducible. There was no decrease in coverage among chipmunk species, which showed up to 1.5% sequence divergence in coding regions. We did observe a decline in capture performance of a subset of targets designed from a much more divergent ground squirrel genome (30 My); however, over 90% of the targets were also recovered. Final assemblies yielded over ten thousand orthologous loci (~3.6 Mb) with thousands of fixed and polymorphic SNPs among species identified. Conclusions: Our study demonstrates the potential of a transcriptome-enabled, multiplexed, exon capture method to create thousands of informative markers for population genomic and phylogenetic studies in non-model species across the tree of life. PMID:22900609

  10. Unstable genomes elevate transcriptome dynamics

    PubMed Central

    Stevens, Joshua B.; Liu, Guo; Abdallah, Batoul Y.; Horne, Steven D.; Ye, Karen J.; Bremer, Steven W.; Ye, Christine J.; Krawetz, Stephen A.; Heng, Henry H.

    2015-01-01

    The challenge of identifying common expression signatures in cancer is well known; however, the reason behind this is largely unclear. Traditionally, variation in expression signatures has been attributed to technological problems; however, recent evidence suggests that chromosome instability (CIN) and resultant karyotypic heterogeneity may be a large contributing factor. Using a well-defined model of immortalization, we systematically compared the pattern of genome alteration and expression dynamics during somatic evolution. Co-measurement of global gene expression and karyotypic alteration throughout the immortalization process reveals that karyotype changes influence gene expression, as major structural and numerical karyotypic alterations result in large gene expression deviation. Replicate samples from stages with stable genomes are more similar to each other than are replicate samples with karyotypic heterogeneity. Karyotypic and gene expression change during immortalization is dynamic, as each stage of progression has a unique expression pattern. This was further verified by comparing global expression in two replicates grown in one flask with known karyotypes. Replicates with higher karyotypic instability were found to be less similar than replicates with stable karyotypes. These data illustrate that the karyotype, transcriptome, and transcriptome-determined pathways are in constant flux during somatic cellular evolution (particularly during the macroevolutionary phase) and that this flux is an inextricable feature of CIN and essential for cancer formation. The findings presented here underscore the importance of understanding the evolutionary process of cancer in order to design improved treatment modalities. PMID:24122714

  11. Genome-Scale Transcriptomic Insights into Early-Stage Fruit Development in Woodland Strawberry Fragaria vesca

    PubMed Central

    Kang, Chunying; Darwish, Omar; Geretz, Aviva; Shahan, Rachel; Alkharouf, Nadim; Liu, Zhongchi

    2013-01-01

    Fragaria vesca, a diploid woodland strawberry with a small and sequenced genome, is an excellent model for studying fruit development. The strawberry fruit is unique in that the edible flesh is actually enlarged receptacle tissue. The true fruit are the numerous dry achenes dotting the receptacle’s surface. Auxin produced from the achene is essential for the receptacle fruit set, a paradigm for studying crosstalk between hormone signaling and development. To investigate the molecular mechanism underlying strawberry fruit set, next-generation sequencing was employed to profile early-stage fruit development with five fruit tissue types and five developmental stages from floral anthesis to enlarged fruits. This two-dimensional data set provides a systems-level view of molecular events with precise spatial and temporal resolution. The data suggest that the endosperm and seed coat may play a more prominent role than the embryo in auxin and gibberellin biosynthesis for fruit set. A model is proposed to illustrate how hormonal signals produced in the endosperm and seed coat coordinate seed, ovary wall, and receptacle fruit development. The comprehensive fruit transcriptome data set provides a wealth of genomic resources for the strawberry and Rosaceae communities as well as unprecedented molecular insight into fruit set and early stage fruit development. PMID:23898027

  12. Identification of candidate network hubs involved in metabolic adjustments of rice under drought stress by integrating transcriptome data and genome-scale metabolic network.

    PubMed

    Mohanty, Bijayalaxmi; Kitazumi, Ai; Cheung, C Y Maurice; Lakshmanan, Meiyappan; de los Reyes, Benildo G; Jang, In-Cheol; Lee, Dong-Yup

    2016-01-01

    In this study, we have integrated a rice genome-scale metabolic network and the transcriptome of a drought-tolerant rice line, DK151, to identify the major transcriptional regulators involved in metabolic adjustments necessary for adaptation to drought. This was achieved by examining the differential expressions of transcription factors and metabolic genes in leaf, root and young panicle of rice plants subjected to drought stress during tillering, booting and panicle elongation stages. Critical transcription factors such as AP2/ERF, bZIP, MYB and NAC that control the important nodes in the gene regulatory pathway were identified through correlative analysis of the patterns of spatio-temporal expression and cis-element enrichment. We showed that many of the candidate transcription factors involved in metabolic adjustments were previously linked to phenotypic variation for drought tolerance. This approach represents the first attempt to integrate models of transcriptional regulation and metabolic pathways for the identification of candidate regulatory genes for targeted selection in rice breeding. PMID:26566840

  13. Multi-Scale Genomic, Transcriptomic and Proteomic Analysis of Colorectal Cancer Cell Lines to Identify Novel Biomarkers

    PubMed Central

    Briffa, Romina; Um, Inhwa; Faratian, Dana; Zhou, Ying; Turnbull, Arran K.; Langdon, Simon P.; Harrison, David J.

    2015-01-01

    Selecting colorectal cancer (CRC) patients likely to respond to therapy remains a clinical challenge. The objectives of this study were to establish which genes were differentially expressed with respect to treatment sensitivity and relate this to copy number in a panel of 15 CRC cell lines. Copy number variations of the identified genes were assessed in a cohort of CRCs. IC50’s were measured for 5-fluorouracil, oxaliplatin, and BEZ-235, a PI3K/mTOR inhibitor. Cell lines were profiled using array comparative genomic hybridisation, Illumina gene expression analysis, reverse phase protein arrays, and targeted sequencing of KRAS hotspot mutations. Frequent gains were observed at 2p, 3q, 5p, 7p, 7q, 8q, 12p, 13q, 14q, and 17q and losses at 2q, 3p, 5q, 8p, 9p, 9q, 14q, 18q, and 20p. Frequently gained regions contained EGFR, PIK3CA, MYC, SMO, TRIB1, FZD1, and BRCA2, while frequently lost regions contained FHIT and MACROD2. TRIB1 was selected for further study. Gene enrichment analysis showed that differentially expressed genes with respect to treatment response were involved in Wnt signalling, EGF receptor signalling, apoptosis, cell cycle, and angiogenesis. Stepwise integration of copy number and gene expression data yielded 47 candidate genes that were significantly correlated. PDCD6 was differentially expressed in all three treatment responses. Tissue microarrays were constructed for a cohort of 118 CRC patients and TRIB1 and MYC amplifications were measured using fluorescence in situ hybridisation. TRIB1 and MYC were amplified in 14.5% and 7.4% of the cohort, respectively, and these amplifications were significantly correlated (p≤0.0001). TRIB1 protein expression in the patient cohort was significantly correlated with pERK, Akt, and Caspase 3 expression. In conclusion, a set of candidate predictive biomarkers for 5-fluorouracil, oxaliplatin, and BEZ235 are described that warrant further study. Amplification of the putative oncogene TRIB1 has been described for

  14. Genome-wide transcriptome analysis of 150 cell samples.

    PubMed

    Irimia, Daniel; Mindrinos, Michael; Russom, Aman; Xiao, Wenzhong; Wilhelmy, Julie; Wang, Shenglong; Heath, Joe Don; Kurn, Nurith; Tompkins, Ronald G; Davis, Ronald W; Toner, Mehmet

    2009-01-01

    A major challenge in molecular biology is interrogating the human transcriptome on a genome wide scale when only a limited amount of biological sample is available for analysis. Current methodologies using microarray technologies for simultaneously monitoring mRNA transcription levels require nanogram amounts of total RNA. To overcome the sample size limitation of current technologies, we have developed a method to probe the global gene expression in biological samples as small as 150 cells, or the equivalent of approximately 300 pg total RNA. The new method employs microfluidic devices for the purification of total RNA from mammalian cells and ultra-sensitive whole transcriptome amplification techniques. We verified that the RNA integrity is preserved through the isolation process, accomplished highly reproducible whole transcriptome analysis, and established high correlation between repeated isolations of 150 cells and the same cell culture sample. We validated the technology by demonstrating that the combined microfluidic and amplification protocol is capable of identifying biological pathways perturbed by stimulation, which are consistent with the information recognized in bulk-isolated samples.

  15. Genome-wide transcriptome analysis of 150 cell samples

    PubMed Central

    Russom, Aman; Xiao, Wenzhong; Wilhelmy, Julie; Wang, Shenglong; Heath, Joe Don; Kurn, Nurith; Tompkins, Ronald G.; Davis, Ronald W.; Toner, Mehmet

    2013-01-01

    A major challenge in molecular biology is interrogating the human transcriptome on a genome wide scale when only a limited amount of biological sample is available for analysis. Current methodologies using microarray technologies for simultaneously monitoring mRNA transcription levels require nanogram amounts of total RNA. To overcome the sample size limitation of current technologies, we have developed a method to probe the global gene expression in biological samples as small as 150 cells, or the equivalent of approximately 300 pg total RNA. The new method employs microfluidic devices for the purification of total RNA from mammalian cells and ultra-sensitive whole transcriptome amplification techniques. We verified that the RNA integrity is preserved through the isolation process, accomplished highly reproducible whole transcriptome analysis, and established high correlation between repeated isolations of 150 cells and the same cell culture sample. We validated the technology by demonstrating that the combined microfluidic and amplification protocol is capable of identifying biological pathways perturbed by stimulation, which are consistent with the information recognized in bulk-isolated samples. PMID:20023796

  16. The Anadara trapezia transcriptome: a resource for molluscan physiological genomics.

    PubMed

    Prentis, Peter J; Pavasovic, Ana

    2014-12-01

    In this study we undertook deep sequencing of the blood cockle, Anadara trapezia, transcriptome to generate genomic resources for future functional genomics analyses. Over 27 million high quality paired end reads were assembled into 75024 contigs. Of these contigs, 29013 (38.7%) received significant BLASTx hits and gene ontology (GO) terms were assigned to 13718 of these sequences. This resource will facilitate physiological genomic studies to test the gene expression response of A. trapezia to various environmental stresses. PMID:25151889

  17. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics.

    PubMed

    Tzika, Athanasia C; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C

    2015-06-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the "Reptilian Transcriptomes Database 2.0," which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. PMID:26133641

  18. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics

    PubMed Central

    Tzika, Athanasia C.; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C.

    2015-01-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the “Reptilian Transcriptomes Database 2.0,” which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. PMID:26133641

  19. Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.

    PubMed

    Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong

    2014-05-01

    We characterized mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. Blast searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions, respectively. Out of 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed cp genome of C. sinensis as closest neighbor of mango. We found 51 short repeats in mango cp genome supposed to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.

  20. CarrotDB: a genomic and transcriptomic database for carrot.

    PubMed

    Xu, Zhi-Sheng; Tan, Hua-Wei; Wang, Feng; Hou, Xi-Lin; Xiong, Ai-Sheng

    2014-01-01

    Carrot (Daucus carota L.) is an economically important vegetable worldwide and is the largest source of carotenoids and provitamin A in the human diet. Given the importance of this vegetable to humans, research and breeding communities on carrot should obtain useful genomic and transcriptomic information. The first whole-genome sequences of 'DC-27' carrot were de novo assembled and analyzed. Transcriptomic sequences of 14 carrot genotypes were downloaded from the Sequence Read Archive (SRA) database of National Center for Biotechnology Information (NCBI) and mapped to the whole-genome sequence before assembly. Based on these data sets, the first Web-based genomic and transcriptomic database for D. carota (CarrotDB) was developed (database homepage: http://apiaceae.njau.edu.cn/carrotdb). CarrotDB offers the tools of Genome Map and Basic Local Alignment Search Tool. Using these tools, users can search certain target genes and simple sequence repeats along with designed primers of 'DC-27'. Assembled transcriptomic sequences along with fragments per kilobase of transcript sequence per millions base pairs sequenced (FPKM) information of 14 carrot genotypes are also provided. Users can download de novo assembled whole-genome sequences, putative gene sequences and putative protein sequences of 'DC-27'. Users can also download transcriptome sequence assemblies of 14 carrot genotypes along with their FPKM information. A total of 2826 transcription factor (TF) genes classified into 57 families were identified in the entire genome sequences. These TF genes were embedded in CarrotDB as an interface. The 'GERMPLASM' part of CarrotDB also offers taproot photos of 45 carrot genotypes and a table containing accession numbers, names, countries of origin and colors of cortex, phloem and xylem parts of taproots corresponding to each carrot genotype. CarrotDB will be continuously updated with new information. Database URL: http://apiaceae.njau.edu.cn/carrotdb/ PMID
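
The FPKM values mentioned in the CarrotDB record normalize a transcript's fragment count by transcript length and sequencing depth. The sketch below uses the commonly cited definition (fragments per kilobase of transcript per million mapped fragments). The abstract does not describe how CarrotDB computed its values, so the function name `fpkm`, its parameter names, and the example numbers are illustrative assumptions only.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Return FPKM under the common definition (an assumption, not CarrotDB's documented pipeline).

    FPKM = fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)
    The 1e9 factor combines the per-kilobase (1e3) and per-million (1e6) scalings.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000-bp transcript in a
# library of 10 million mapped fragments.
example_value = fpkm(500, 2_000, 10_000_000)
```

Because both normalizations are simple ratios, doubling the transcript length or the library size halves the FPKM for the same raw fragment count.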

  1. CarrotDB: a genomic and transcriptomic database for carrot

    PubMed Central

    Xu, Zhi-Sheng; Tan, Hua-Wei; Wang, Feng; Hou, Xi-Lin; Xiong, Ai-Sheng

    2014-01-01

    Carrot (Daucus carota L.) is an economically important vegetable worldwide and is the largest source of carotenoids and provitamin A in the human diet. Given the importance of this vegetable to humans, research and breeding communities on carrot should obtain useful genomic and transcriptomic information. The first whole-genome sequences of ‘DC-27’ carrot were de novo assembled and analyzed. Transcriptomic sequences of 14 carrot genotypes were downloaded from the Sequence Read Archive (SRA) database of the National Center for Biotechnology Information (NCBI) and mapped to the whole-genome sequence before assembly. Based on these data sets, the first Web-based genomic and transcriptomic database for D. carota (CarrotDB) was developed (database homepage: http://apiaceae.njau.edu.cn/carrotdb). CarrotDB offers the tools of Genome Map and Basic Local Alignment Search Tool. Using these tools, users can search certain target genes and simple sequence repeats along with designed primers of ‘DC-27’. Assembled transcriptomic sequences along with fragments per kilobase of transcript sequence per million base pairs sequenced (FPKM) information of 14 carrot genotypes are also provided. Users can download de novo assembled whole-genome sequences, putative gene sequences and putative protein sequences of ‘DC-27’. Users can also download transcriptome sequence assemblies of 14 carrot genotypes along with their FPKM information. A total of 2826 transcription factor (TF) genes classified into 57 families were identified in the entire genome sequences. These TF genes were embedded in CarrotDB as an interface. The ‘GERMPLASM’ part of CarrotDB also offers taproot photos of 45 carrot genotypes and a table containing accession numbers, names, countries of origin and colors of cortex, phloem and xylem parts of taproots corresponding to each carrot genotype. CarrotDB will be continuously updated with new information. Database URL: http
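
    The FPKM values mentioned in this record can be illustrated with a short sketch. It assumes the commonly used definition (fragments per kilobase of transcript per million mapped fragments), which may differ in detail from the normalization CarrotDB actually applied; the function name and the example numbers are hypothetical.

```python
def fpkm(fragments_mapped, transcript_length_bp, total_mapped_fragments):
    """Return FPKM under the common definition (an assumption, not CarrotDB's code).

    FPKM = fragments_mapped * 1e9 / (transcript_length_bp * total_mapped_fragments)
    The factor 1e9 combines "per kilobase" (1e3) and "per million fragments" (1e6).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments_mapped * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
example_value = fpkm(500, 2000, 10_000_000)
print(example_value)
```

    Dividing by both transcript length and library size is what makes values comparable across transcripts of different lengths and across genotypes sequenced to different depths.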

  2. Integrated Analysis of Whole Genome and Transcriptome Sequencing Reveals Diverse Transcriptomic Aberrations Driven by Somatic Genomic Changes in Liver Cancers

    PubMed Central

    Shiraishi, Yuichi; Fujimoto, Akihiro; Furuta, Mayuko; Tanaka, Hiroko; Chiba, Ken-ichi; Boroevich, Keith A.; Abe, Tetsuo; Kawakami, Yoshiiku; Ueno, Masaki; Gotoh, Kunihito; Ariizumi, Shun-ichi; Shibuya, Tetsuo; Nakano, Kaoru; Sasaki, Aya; Maejima, Kazuhiro; Kitada, Rina; Hayami, Shinya; Shigekawa, Yoshinobu; Marubashi, Shigeru; Yamada, Terumasa; Kubo, Michiaki; Ishikawa, Osamu; Aikata, Hiroshi; Arihiro, Koji; Ohdan, Hideki; Yamamoto, Masakazu; Yamaue, Hiroki; Chayama, Kazuaki; Tsunoda, Tatsuhiko; Miyano, Satoru; Nakagawa, Hidewaki

    2014-01-01

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome. PMID:25526364

  3. Genome interplay in the grain transcriptome of hexaploid bread wheat.

    PubMed

    Pfeifer, Matthias; Kugler, Karl G; Sandve, Simen R; Zhan, Bujie; Rudi, Heidi; Hvidsten, Torgeir R; Mayer, Klaus F X; Olsen, Odd-Arne

    2014-07-18

    Allohexaploid bread wheat (Triticum aestivum L.) provides approximately 20% of calories consumed by humans. Lack of genome sequence for the three homeologous and highly similar bread wheat genomes (A, B, and D) has impeded expression analysis of the grain transcriptome. We used previously unknown genome information to analyze the cell type-specific expression of homeologous genes in the developing wheat grain and identified distinct co-expression clusters reflecting the spatiotemporal progression during endosperm development. We observed no global but cell type- and stage-dependent genome dominance, organization of the wheat genome into transcriptionally active chromosomal regions, and asymmetric expression in gene families related to baking quality. Our findings give insight into the transcriptional dynamics and genome interplay among individual grain cell types in a polyploid cereal genome.

  4. InsectBase: a resource for insect genomes and transcriptomes

    PubMed Central

    Yin, Chuanlin; Shen, Gengyu; Guo, Dianhao; Wang, Shuping; Ma, Xingzhou; Xiao, Huamei; Liu, Jinding; Zhang, Zan; Liu, Ying; Zhang, Yiqun; Yu, Kaixiang; Huang, Shuiqing; Li, Fei

    2016-01-01

    The genomes and transcriptomes of hundreds of insects have been sequenced. However, the insect community lacks an integrated, up-to-date collection of insect gene data. Here, we introduce the first release of InsectBase, available online at http://www.insect-genome.com. The database encompasses 138 insect genomes, 116 insect transcriptomes, 61 insect gene sets, 36 gene families of 60 insects, 7544 miRNAs of 69 insects, 96 925 piRNAs of Drosophila melanogaster and Chilo suppressalis, 2439 lncRNAs of Nilaparvata lugens, 22 536 pathways of 78 insects, 678 881 untranslated regions (UTR) of 84 insects and 160 905 coding sequences (CDS) of 70 insects. This release contains over 12 million sequences and provides search functionality, a BLAST server, GBrowse, insect pathway construction, a Facebook-like network for the insect community (iFacebook), and phylogenetic analysis of selected genes. PMID:26578584

  5. Whole genome and transcriptome sequencing of a B3 thymoma.

    PubMed

    Petrini, Iacopo; Rajan, Arun; Pham, Trung; Voeller, Donna; Davis, Sean; Gao, James; Wang, Yisong; Giaccone, Giuseppe

    2013-01-01

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.

  6. Bacillus anthracis genome organization in light of whole transcriptome sequencing

    SciTech Connect

    Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.; Bergman, Nicholas; Borodovsky, Mark

    2010-03-22

    Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.

  7. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree.

    PubMed

    Kuravadi, Nagesh A; Yenagi, Vijay; Rangiah, Kannan; Mahesh, H B; Rajamani, Anantharamanan; Shirke, Meghana D; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, B N; Gowda, Malali

    2015-01-01

    Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC-600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of the estimated size of the neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. The neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. An ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in the azadirachtin biosynthesis pathway. This study provides a resource of genomic and transcriptomic data and quantities of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways.

  8. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree

    PubMed Central

    Rangiah, Kannan; Mahesh, HB; Rajamani, Anantharamanan; Shirke, Meghana D.; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, BN

    2015-01-01

    Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of the estimated size of the neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. The neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. An ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in the azadirachtin biosynthesis pathway. This study provides a resource of genomic and transcriptomic data and quantities of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways. PMID:26290780

  9. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree.

    PubMed

    Kuravadi, Nagesh A; Yenagi, Vijay; Rangiah, Kannan; Mahesh, H B; Rajamani, Anantharamanan; Shirke, Meghana D; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, B N; Gowda, Malali

    2015-01-01

    Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC-600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of the estimated size of the neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. The neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. An ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in the azadirachtin biosynthesis pathway. This study provides a resource of genomic and transcriptomic data and quantities of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways. PMID:26290780

  10. A practical guide to sequencing genomes and transcriptomes.

    PubMed

    Sanchez-Flores, Alejandro; Abreu-Goodger, Cei

    2014-01-01

    The emergence of new DNA sequencing technologies has allowed an exponential growth of genomic and transcriptomic data that ultimately yielded important results in several areas such as medicine and biology. This continuous technological progress presents several advantages and caveats that have to be considered for each new method. In this review, we describe the so-called second and third generation DNA sequencing technologies, how they changed the study of genomes and transcriptomes, and most importantly, the key factors that should be considered in a sequencing project. Taken together, we present a "sequencing project map" that includes a practical and graphical cost-benefit analysis for genome and transcriptome projects, which allows scientists to easily classify their workflow into one of our proposed templates according to the goals and experimental design of the project at hand. In all, this review lays out the pros and cons of the most widely adopted experimental designs and sequencing technologies to help scientists interested in these tools choose the best strategy for their project.

  11. Status of duckweed genomics and transcriptomics.

    PubMed

    Wang, W; Messing, J

    2015-01-01

    Duckweeds belong to the smallest flowering plants that undergo fast vegetative growth in an aquatic environment. They are commonly used in wastewater treatment and animal feed. Whereas duckweeds have been studied at the biochemical level, their reduced morphology and wide environmental adaption had not been subjected to molecular analysis until recently. Here, we review the progress that has been made in using a DNA barcode system and the sequences of chloroplast and mitochondrial genomes to identify duckweed species at the species or population level. We also review analysis of the nuclear genome sequence of Spirodela that provides new insights into fundamental biological questions. Indeed, reduced gene families and missing genes are consistent with its compact morphogenesis, aquatic floating and suppression of juvenile-to-adult transition. Furthermore, deep RNA sequencing of Spirodela at the onset of dormancy and Landoltia in exposure of nutrient deficiency illustrate the molecular network for environmental adaption and stress response, constituting major progress towards a post-genome sequencing phase, where further functional genomic details can be explored. Rapid advances in sequencing technologies could continue to promote a proliferation of genome sequences for additional ecotypes as well as for other duckweed species.

  12. Single Cell Genomics and Transcriptomics for Unicellular Eukaryotes

    SciTech Connect

    Ciobanu, Doina; Clum, Alicia; Singh, Vasanth; Salamov, Asaf; Han, James; Copeland, Alex; Grigoriev, Igor; James, Timothy; Singer, Steven; Woyke, Tanja; Malmstrom, Rex; Cheng, Jan-Fang

    2014-03-14

    Despite their small size, unicellular eukaryotes have complex genomes with a high degree of plasticity that allow them to adapt quickly to environmental changes. Unicellular eukaryotes live with prokaryotes and higher eukaryotes, frequently in symbiotic or parasitic niches. To this day their contribution to the dynamics of the environmental communities remains to be understood. Unfortunately, the vast majority of eukaryotic microorganisms are either uncultured or unculturable, making genome sequencing impossible using traditional approaches. We have developed an approach to isolate unicellular eukaryotes of interest from environmental samples, and to sequence and analyze their genomes and transcriptomes. We have tested our methods with six species: an uncharacterized protist from cellulose-enriched compost identified as Platyophrya, a close relative of P. vorax; the fungus Metschnikowia bicuspidata, a parasite of the water flea Daphnia; the mycoparasitic fungus Piptocephalis cylindrospora, a parasite of Cokeromyces and Mucor; Caulochytrium protosteloides, a parasite of Sordaria; Rozella allomycis, a parasite of the water mold Allomyces; and the microalga Chlamydomonas reinhardtii. Here, we present the four components of our approach: pre-sequencing methods, sequence analysis for single cell genome assembly, sequence analysis of single cell transcriptomes, and genome annotation. This technology has the potential to uncover the complexity of single cell eukaryotes and their role in the environmental samples.

  13. Transcriptome and Genome Size Analysis of the Venus Flytrap

    PubMed Central

    Bressendorff, Simon; Seguin-Orlando, Andaine; Petersen, Morten; Sicheritz-Pontén, Thomas; Mundy, John

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin’s studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D. muscipula flowers and traps. Using the Oases transcriptome assembler 79,165,657 quality trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, and assigned to Gene Ontology (GO) and classified into functional categories. A total of 15,547 full-length cDNA sequences were identified, from which open reading frames were detected in 10,941. Comparative GO analyses revealed that D. muscipula is highly represented in molecular functions related to catalytic, antioxidant, and electron carrier activities. Also, using a single copy sequence PCR-based method, we estimated that the genome size of D. muscipula is approx. 3 Gb. Our genome size estimate and transcriptome analyses will contribute to future research on this fascinating, monotypic species and its heterotrophic adaptations. PMID:25886597
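
    The N50 statistic reported for this assembly can be sketched as follows. This is a minimal illustration of the usual definition (the contig length at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly length); the function name and the toy contig lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (usual definition, as a sketch)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Stop at the first contig where the running sum covers half the assembly.
        if 2 * running >= total:
            return length


# Toy example with made-up contig lengths (bp).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

    Because N50 is weighted toward long contigs, it can exceed the mean contig length, as it does in this abstract (N50 of 1,051 bp versus an average of 679 bp).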

  14. Transcriptome and genome size analysis of the Venus flytrap.

    PubMed

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon; Seguin-Orlando, Andaine; Petersen, Morten; Sicheritz-Pontén, Thomas; Mundy, John

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D. muscipula flowers and traps. Using the Oases transcriptome assembler 79,165,657 quality trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, and assigned to Gene Ontology (GO) and classified into functional categories. A total of 15,547 full-length cDNA sequences were identified, from which open reading frames were detected in 10,941. Comparative GO analyses revealed that D. muscipula is highly represented in molecular functions related to catalytic, antioxidant, and electron carrier activities. Also, using a single copy sequence PCR-based method, we estimated that the genome size of D. muscipula is approx. 3 Gb. Our genome size estimate and transcriptome analyses will contribute to future research on this fascinating, monotypic species and its heterotrophic adaptations. PMID:25886597

  15. Transcriptome and genome size analysis of the Venus flytrap.

    PubMed

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon; Seguin-Orlando, Andaine; Petersen, Morten; Sicheritz-Pontén, Thomas; Mundy, John

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D. muscipula flowers and traps. Using the Oases transcriptome assembler 79,165,657 quality trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, and assigned to Gene Ontology (GO) and classified into functional categories. A total of 15,547 full-length cDNA sequences were identified, from which open reading frames were detected in 10,941. Comparative GO analyses revealed that D. muscipula is highly represented in molecular functions related to catalytic, antioxidant, and electron carrier activities. Also, using a single copy sequence PCR-based method, we estimated that the genome size of D. muscipula is approx. 3 Gb. Our genome size estimate and transcriptome analyses will contribute to future research on this fascinating, monotypic species and its heterotrophic adaptations.

  16. The draft genome and transcriptome of Cannabis sativa

    PubMed Central

    2011-01-01

    Background Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored. Results We sequenced genomic DNA and RNA from the marijuana strain Purple Kush using short-read approaches. We report a draft haploid genome sequence of 534 Mb and a transcriptome of 30,000 genes. Comparison of the transcriptome of Purple Kush with that of the hemp cultivar 'Finola' revealed that many genes encoding proteins involved in cannabinoid and precursor pathways are more highly expressed in Purple Kush than in 'Finola'. The exclusive occurrence of Δ9-tetrahydrocannabinolic acid synthase in the Purple Kush transcriptome, and its replacement by cannabidiolic acid synthase in 'Finola', may explain why the psychoactive cannabinoid Δ9-tetrahydrocannabinol (THC) is produced in marijuana but not in hemp. Resequencing the hemp cultivars 'Finola' and 'USO-31' showed little difference in gene copy numbers of cannabinoid pathway enzymes. However, single nucleotide variant analysis uncovered a relatively high level of variation among four cannabis types, and supported a separation of marijuana and hemp. Conclusions The availability of the Cannabis sativa genome enables the study of a multifunctional plant that occupies a unique role in human culture. Its availability will aid the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics. PMID:22014239

  17. Comparative genomics and transcriptomics of trait-gene association

    PubMed Central

    2012-01-01

    Background The Order Rickettsiales includes important tick-borne pathogens, from Rickettsia rickettsii, which causes Rocky Mountain spotted fever, to Anaplasma marginale, the most prevalent vector-borne pathogen of cattle. Although most pathogens in this Order are transmitted by arthropod vectors, little is known about the microbial determinants of transmission. A. marginale provides unique tools for studying the determinants of transmission, with multiple strain sequences available that display distinct and reproducible transmission phenotypes. The closed core A. marginale genome suggests that any phenotypic differences are due to single nucleotide polymorphisms (SNPs). We combined DNA/RNA comparative genomic approaches using strains with different tick transmission phenotypes and identified genes that segregate with transmissibility. Results Comparison of seven strains with different transmission phenotypes generated a list of SNPs affecting 18 genes and nine promoters. Transcriptional analysis found two candidate genes downstream from promoter SNPs that were differentially transcribed. To corroborate the comparative genomics approach we used three RNA-seq platforms to analyze the transcriptomes from two A. marginale strains with different transmission phenotypes. RNA-seq analysis confirmed the comparative genomics data and found 10 additional genes whose transcription between strains with distinct transmission efficiencies was significantly different. Six regions of the genome that contained no annotation were found to be transcriptionally active, and two of these newly identified transcripts were differentially transcribed. Conclusions This approach identified 30 genes and two novel transcripts potentially involved in tick transmission. We describe the transcriptome of an obligate intracellular bacterium in depth, while employing massive parallel sequencing to dissect an important trait in bacterial pathogenesis. PMID:23181781

  18. The capsicum transcriptome DB: a "hot" tool for genomic research.

    PubMed

    Góngora-Castillo, Elsa; Fajardo-Jaime, Rubén; Fernández-Cortes, Araceli; Jofre-Garfias, Alba E; Lozoya-Gloria, Edmundo; Martínez, Octavio; Ochoa-Alejo, Neftalí; Rivera-Bustamante, Rafael

    2012-01-01

    Chili pepper (Capsicum annuum) is an economically important crop with no available public genome sequence. We describe a genomic resource to facilitate Capsicum annuum research. A collection of Expressed Sequence Tags (ESTs) derived from five C. annuum organs (root, stem, leaf, flower and fruit) was sequenced using the Sanger method, and multiple leaf transcriptomes were deeply sampled using GS-pyrosequencing. A hybrid assembly of 1,324,516 raw reads yielded 32,314 high quality contigs as validated by coverage and identity analysis with existing pepper sequences. Overall, 75.5% of the contigs had significant sequence similarity to entries in nucleic acid and protein databases; 23% of the sequences have not been previously reported for C. annuum and expand sequence resources for this species. A MySQL database and a user-friendly Web interface were constructed with search tools that permit queries of the ESTs including sequence, functional annotation, Gene Ontology classification, metabolic pathways, and assembly information. The Capsicum Transcriptome DB is freely available from http://www.bioingenios.ira.cinvestav.mx:81/Joomla/

  19. The non-human primate reference transcriptome resource (NHPRTR) for comparative functional genomics

    PubMed Central

    Pipes, Lenore; Li, Sheng; Bozinoski, Marjan; Palermo, Robert; Peng, Xinxia; Blood, Phillip; Kelly, Sara; Weiss, Jeffrey M.; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Zumbo, Paul; Chen, Ronghua; Schroth, Gary P.; Mason, Christopher E.; Katze, Michael G.

    2013-01-01

    RNA-based next-generation sequencing (RNA-Seq) provides a tremendous amount of new information regarding gene and transcript structure, expression and regulation. This is particularly true for non-coding RNAs, where whole transcriptome analyses have revealed that much of the genome is transcribed and that many non-coding transcripts have widespread functionality. However, uniform resources for raw, cleaned and processed RNA-Seq data are sparse for most organisms and this is especially true for non-human primates (NHPs). Here, we describe a large-scale RNA-Seq data and analysis infrastructure, the NHP reference transcriptome resource (http://nhprtr.org); it presently hosts data from 12 species of primates, to be expanded to 15 species/subspecies spanning great apes, old world monkeys, new world monkeys and prosimians. Data are collected for each species using pools of RNA from comparable tissues. We provide data access in advance of its deposition at NCBI, as well as browsable tracks of alignments against the human genome using the UCSC genome browser. This resource will continue to host additional RNA-Seq data, alignments and assemblies as they are generated over the coming years and provide a key resource for the annotation of NHP genomes as well as informing primate studies on evolution, reproduction, infection, immunity and pharmacology. PMID:23203872

  20. Transcriptome and genome sequencing uncovers functional variation in humans

    PubMed Central

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-01-01

    Summary Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. PMID:24037378

  1. Parallel WGA and WTA for Comparative Genome and Transcriptome NGS Analysis Using Tiny Cell Numbers.

    PubMed

    Korfhage, Christian; Fricke, Evelyn; Meier, Andreas

    2015-07-01

    Genomic DNA determines how and when the transcriptome is changed by a trigger or environmental change and how cellular metabolism is influenced. Comparative genome and transcriptome analysis of the same cell sample links a defined genome with all changes in the bases, structure, or numbers of the transcriptome. However, comparative genome and transcriptome analysis using next-generation sequencing (NGS) or real-time PCR is often limited by the small amount of sample available. In mammals, the amount of DNA and RNA in a single cell is ∼10 picograms, but deep analysis of the genome and transcriptome currently requires several hundred nanograms of nucleic acids for library preparation for NGS sequencing. Consequently, accurate whole-genome amplification (WGA) and whole-transcriptome amplification (WTA) are required for such quantitative analysis. This unit describes how the genome and the transcriptome of a tiny number of cells can be amplified in a highly parallel and comparable process. Protocols for quality control of amplified DNA and application of amplified DNA for NGS are included.
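
    The gap between the ∼10 picograms per cell and the several hundred nanograms needed for library preparation can be made concrete with a rough calculation. The sketch below is only illustrative: the 300 ng target is an assumed stand-in for "several hundred nanograms", the cell counts are hypothetical, and the function name is invented for this example.

```python
# Rough fold-amplification estimate for WGA/WTA input.
# PG_PER_CELL comes from the abstract (~10 pg of nucleic acid per cell).
# TARGET_NG = 300 is an ASSUMPTION standing in for "several hundred nanograms".
PG_PER_CELL = 10.0
TARGET_NG = 300.0


def fold_amplification(n_cells, target_ng=TARGET_NG, pg_per_cell=PG_PER_CELL):
    """Return how many-fold the starting material must be amplified."""
    starting_pg = n_cells * pg_per_cell
    target_pg = target_ng * 1000.0  # 1 ng = 1000 pg
    return target_pg / starting_pg


# Hypothetical cell counts, chosen only to show the trend.
for cells in (1, 10, 100):
    print(f"{cells:>3} cell(s): ~{fold_amplification(cells):,.0f}-fold amplification")
```

    Under these assumptions a single cell needs amplification on the order of tens of thousands of fold, which is why amplification accuracy matters for quantitative comparisons.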

  2. Genome-wide transcriptome analysis of human epidermal melanocytes

    PubMed Central

    Haltaufderhyde, Kirk D.; Oancea, Elena

    2015-01-01

    Because human epidermal melanocytes (HEMs) provide critical protection against skin cancer, sunburn, and photoaging, a genome-wide perspective of gene expression in these cells is vital to understanding human skin physiology. In this study we performed high throughput sequencing of HEMs to obtain a complete data set of transcript sizes, abundances, and splicing. As expected, we found that melanocyte specific genes that function in pigmentation were among the highest expressed genes. We analyzed receptor, ion channel and transcription factor gene families to get a better understanding of the cell signalling pathways used by melanocytes. We also performed a comparative transcriptomic analysis of lightly versus darkly pigmented HEMs and found 16 genes differentially expressed in the two pigmentation phenotypes; of those, only one putative melanosomal transporter (SLC45A2) has known function in pigmentation. In addition, we found 166 genes with splice isoforms expressed exclusively in one pigmentation phenotype, 17 of which are genes involved in signal transduction. Our melanocyte transcriptome study provides a comprehensive view and may help identify novel pigmentation genes and potential pharmacological targets. PMID:25451175

  3. Transcriptome-wide investigation of genomic imprinting in chicken.

    PubMed

    Frésard, Laure; Leroux, Sophie; Servin, Bertrand; Gourichon, David; Dehais, Patrice; Cristobal, Magali San; Marsaud, Nathalie; Vignoles, Florence; Bed'hom, Bertrand; Coville, Jean-Luc; Hormozdiari, Farhad; Beaumont, Catherine; Zerjal, Tatiana; Vignal, Alain; Morisson, Mireille; Lagarrigue, Sandrine; Pitel, Frédérique

    2014-04-01

    Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken and did not demonstrate any evidence of imprinting in this species. However, several published observations such as imprinted-like QTL in poultry or reciprocal effects keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting from mouse data already published. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results come together, without a priori, with previous statements and phylogenetic considerations assessing the absence of genomic imprinting in chicken. PMID:24452801

  4. Transcriptome-wide investigation of genomic imprinting in chicken

    PubMed Central

    Frésard, Laure; Leroux, Sophie; Servin, Bertrand; Gourichon, David; Dehais, Patrice; Cristobal, Magali San; Marsaud, Nathalie; Vignoles, Florence; Bed'hom, Bertrand; Coville, Jean-Luc; Hormozdiari, Farhad; Beaumont, Catherine; Zerjal, Tatiana; Vignal, Alain; Morisson, Mireille; Lagarrigue, Sandrine; Pitel, Frédérique

    2014-01-01

    Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken and did not demonstrate any evidence of imprinting in this species. However, several published observations such as imprinted-like QTL in poultry or reciprocal effects keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting from mouse data already published. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results come together, without a priori, with previous statements and phylogenetic considerations assessing the absence of genomic imprinting in chicken. PMID:24452801

  5. Gigabase-scale transcriptome analysis on four species of pearl oysters.

    PubMed

    Huang, Xian-De; Zhao, Mi; Liu, Wen-Guang; Guan, Yun-Yan; Shi, Yu; Wang, Qi; Wu, Shan-Zeng; He, Mao-Xian

    2013-06-01

    Pearl oysters have been found to secrete nacre and form pearls with good quality and significant commercial interest. However, the transcriptomic and genomic resources for pearl oysters are still limited. To improve this situation, transcriptome sequencing was conducted from four species of pearl oysters with Illumina HiSeq™ 2000. There were four gigabase-scale transcriptomes for four species of pearl oysters, ∼26.3 million reads with ∼2.37 gigabase pairs (Gbp) in Pinctada fucata, ∼26.5 million reads with ∼2.39 Gbp in Pinctada margaritifera, ∼27.0 million reads with ∼2.43 Gbp in Pinctada maxima, and ∼25.9 million reads with ∼2.33 Gbp in Pteria penguin, respectively. After sequence assembly and blastx alignment, the numbers of annotated unigenes ≥200 bp were 33,882 in P. fucata, 30,666 in P. margaritifera, 26,420 in P. maxima, and 29,928 in P. penguin. Based on these annotated unigenes among four species of pearl oysters, CDSs were extracted and predicted and furthermore, analyses of GO and KEGG assignments were performed. In addition, 60 putative genes of growth factors and their receptors from four species of pearl oysters were predicted. This study not only established an excellent resource for gene discovery and expression in pearl oysters, but also offered a significant platform for functional genomics and comparative genomic studies for mollusks.
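
    The read counts and base totals quoted above imply an average read length, which gives a quick consistency check across the four libraries. The following is a minimal sketch using only the rounded figures from the abstract; the variable and function names are illustrative, and the result is approximate because the inputs are rounded.

```python
# (reads in millions, total bases in Gbp) as quoted in the abstract.
datasets = {
    "Pinctada fucata": (26.3, 2.37),
    "Pinctada margaritifera": (26.5, 2.39),
    "Pinctada maxima": (27.0, 2.43),
    "Pteria penguin": (25.9, 2.33),
}


def mean_read_length(reads_millions, gbp):
    """Average bases per read = total bases / number of reads."""
    return (gbp * 1e9) / (reads_millions * 1e6)


mean_lengths = {
    species: mean_read_length(reads, gbp)
    for species, (reads, gbp) in datasets.items()
}

for species, length in mean_lengths.items():
    print(f"{species}: ~{length:.1f} bases per read")
```

    All four libraries come out at roughly 90 bases per read, so the reported totals are mutually consistent.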

  6. Comprehensive Annotation of the Parastagonospora nodorum Reference Genome Using Next-Generation Genomics, Transcriptomics and Proteogenomics

    PubMed Central

    Dodhia, Kejal; Stoll, Thomas; Hastie, Marcus; Furuki, Eiko; Ellwood, Simon R.; Williams, Angela H.; Tan, Yew-Foon; Testa, Alison C.; Gorman, Jeffrey J.; Oliver, Richard P.

    2016-01-01

    Parastagonospora nodorum, the causal agent of Septoria nodorum blotch (SNB), is an economically important pathogen of wheat (Triticum spp.), and a model for the study of necrotrophic pathology and genome evolution. The reference P. nodorum strain SN15 was the first Dothideomycete with a published genome sequence, and has been used as the basis for comparison within and between species. Here we present an updated reference genome assembly with corrections of SNP and indel errors in the underlying genome assembly from deep resequencing data as well as extensive manual annotation of gene models using transcriptomic and proteomic sources of evidence (https://github.com/robsyme/Parastagonospora_nodorum_SN15). The updated assembly and annotation includes 8,366 genes with modified protein sequence and 866 new genes. This study shows the benefits of using a wide variety of experimental methods allied to expert curation to generate a reliable set of gene models. PMID:26840125

  7. Comprehensive Annotation of the Parastagonospora nodorum Reference Genome Using Next-Generation Genomics, Transcriptomics and Proteogenomics.

    PubMed

    Syme, Robert A; Tan, Kar-Chun; Hane, James K; Dodhia, Kejal; Stoll, Thomas; Hastie, Marcus; Furuki, Eiko; Ellwood, Simon R; Williams, Angela H; Tan, Yew-Foon; Testa, Alison C; Gorman, Jeffrey J; Oliver, Richard P

    2016-01-01

    Parastagonospora nodorum, the causal agent of Septoria nodorum blotch (SNB), is an economically important pathogen of wheat (Triticum spp.), and a model for the study of necrotrophic pathology and genome evolution. The reference P. nodorum strain SN15 was the first Dothideomycete with a published genome sequence, and has been used as the basis for comparison within and between species. Here we present an updated reference genome assembly with corrections of SNP and indel errors in the underlying genome assembly from deep resequencing data as well as extensive manual annotation of gene models using transcriptomic and proteomic sources of evidence (https://github.com/robsyme/Parastagonospora_nodorum_SN15). The updated assembly and annotation includes 8,366 genes with modified protein sequence and 866 new genes. This study shows the benefits of using a wide variety of experimental methods allied to expert curation to generate a reliable set of gene models.

  8. Transcriptome profiling reveals mosaic genomic origins of modern cultivated barley.

    PubMed

    Dai, Fei; Chen, Zhong-Hua; Wang, Xiaolei; Li, Zefeng; Jin, Gulei; Wu, Dezhi; Cai, Shengguan; Wang, Ning; Wu, Feibo; Nevo, Eviatar; Zhang, Guoping

    2014-09-16

    The domestication of cultivated barley has been used as a model system for studying the origins and early spread of agrarian culture. Our previous results indicated that the Tibetan Plateau and its vicinity is one of the centers of domestication of cultivated barley. Here we reveal multiple origins of domesticated barley using transcriptome profiling of cultivated and wild-barley genotypes. Approximately 48 Gb of clean transcript sequences in 12 Hordeum spontaneum and 9 Hordeum vulgare accessions were generated. We reported 12,530 de novo assembled transcripts in all of the 21 samples. Population structure analysis showed that Tibetan hulless barley (qingke) might have existed in the early stage of domestication. Based on the large number of unique genomic regions showing the similarity between cultivated and wild-barley groups, we propose that the genomic origin of modern cultivated barley is derived from wild-barley genotypes in the Fertile Crescent (mainly in chromosomes 1H, 2H, and 3H) and Tibet (mainly in chromosomes 4H, 5H, 6H, and 7H). This study indicates that the domestication of barley may have occurred over time in geographically distinct regions. PMID:25197090

  9. Reference genomes and transcriptomes of Nicotiana sylvestris and Nicotiana tomentosiformis

    PubMed Central

    2013-01-01

    Background Nicotiana sylvestris and Nicotiana tomentosiformis are members of the Solanaceae family that includes tomato, potato, eggplant and pepper. These two Nicotiana species originate from South America and exhibit different alkaloid and diterpenoid production. N. sylvestris is cultivated largely as an ornamental plant and it has been used as a diploid model system for studies of terpenoid production, plastid engineering, and resistance to biotic and abiotic stress. N. sylvestris and N. tomentosiformis are considered to be modern descendants of the maternal and paternal donors that formed Nicotiana tabacum about 200,000 years ago through interspecific hybridization. Here we report the first genome-wide analysis of these two Nicotiana species. Results Draft genomes of N. sylvestris and N. tomentosiformis were assembled to 82.9% and 71.6% of their expected size respectively, with N50 sizes of about 80 kb. The repeat content was 72-75%, with a higher proportion of retrotransposons and copia-like long terminal repeats in N. tomentosiformis. The transcriptome assemblies showed that 44,000-53,000 transcripts were expressed in the roots, leaves or flowers. The key genes involved in terpenoid metabolism, alkaloid metabolism and heavy metal transport showed differential expression in the leaves, roots and flowers of N. sylvestris and N. tomentosiformis. Conclusions The reference genomes of N. sylvestris and N. tomentosiformis represent a significant contribution to the SOL100 initiative because, as members of the Nicotiana genus of Solanaceae, they strengthen the value of the already existing resources by providing additional comparative information, thereby helping to improve our understanding of plant metabolism and evolution. PMID:23773524

  10. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708
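
    The validation figures in this abstract translate directly into a sensitivity estimate. The sketch below uses only the counts stated above (131 novel plus 7 previously reported fusions, 6 missed); it computes recall on the validated set and is not the authors' own accuracy metric, which the abstract does not define.

```python
# Counts taken from the abstract.
novel_fusions = 131
previously_reported = 7
missed_by_integrate = 6

# Total validated fusions and how many INTEGRATE recovered.
validated_total = novel_fusions + previously_reported
detected = validated_total - missed_by_integrate

# Sensitivity (recall) = true positives / all validated positives.
sensitivity = detected / validated_total

print(f"validated fusions: {validated_total}")
print(f"detected by INTEGRATE: {detected}")
print(f"sensitivity: {sensitivity:.1%}")
```

    On these numbers INTEGRATE recovers 132 of the 138 validated fusions, a sensitivity of roughly 96%. Specificity cannot be derived from the abstract, because the number of false-positive calls is not given.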

  11. The genome- and transcriptome-wide analysis of innate immunity in the brown planthopper, Nilaparvata lugens

    PubMed Central

    2013-01-01

    Background The brown planthopper (Nilaparvata lugens) is one of the most serious rice plant pests in Asia. N. lugens causes extensive rice damage by sucking rice phloem sap, which results in stunted plant growth and the transmission of plant viruses. Despite the importance of this insect pest, little is known about the immunological mechanisms occurring in this hemimetabolous insect species. Results In this study, we performed a genome- and transcriptome-wide analysis aiming at the immune-related genes. The transcriptome datasets include the N. lugens intestine, the developmental stage, wing formation, and sex-specific expression information that provided useful gene expression sequence data for the genome-wide analysis. As a result, we identified a large number of genes encoding N. lugens pattern recognition proteins, modulation proteins in the prophenoloxidase (proPO) activating cascade, immune effectors, and the signal transduction molecules involved in the immune pathways, including the Toll, Immune deficiency (Imd) and Janus kinase signal transducers and activators of transcription (JAK-STAT) pathways. The genome scale analysis revealed detailed information of the gene structure, distribution and transcription orientations in scaffolds. A comparison of the genome-available hemimetabolous and metabolous insect species indicate the differences in the immune-related gene constitution. We investigated the gene expression profiles with regards to how they responded to bacterial infections and tissue, as well as development and sex expression specificity. Conclusions The genome- and transcriptome-wide analysis of immune-related genes including pattern recognition and modulation molecules, immune effectors, and the signal transduction molecules involved in the immune pathways is an important step in determining the overall architecture and functional network of the immune components in N. lugens. Our findings provide the comprehensive gene sequence resource and

  12. Dissecting the genome of the polyploid crop oilseed rape by transcriptome sequencing.

    PubMed

    Bancroft, Ian; Morgan, Colin; Fraser, Fiona; Higgins, Janet; Wells, Rachel; Clissold, Leah; Baker, David; Long, Yan; Meng, Jinling; Wang, Xiaowu; Liu, Shengyi; Trick, Martin

    2011-08-01

    Polyploidy complicates genomics-based breeding of many crops, including wheat, potato, cotton, oat and sugarcane. To address this challenge, we sequenced leaf transcriptomes across a mapping population of the polyploid crop oilseed rape (Brassica napus) and representative ancestors of the parents of the population. Analysis of sequence variation and transcript abundance enabled us to construct twin single nucleotide polymorphism linkage maps of B. napus, comprising 23,037 markers. We used these to align the B. napus genome with that of a related species, Arabidopsis thaliana, and to genome sequence assemblies of its progenitor species, Brassica rapa and Brassica oleracea. We also developed methods to detect genome rearrangements and track inheritance of genomic segments, including the outcome of an interspecific cross. By revealing the genetic consequences of breeding, cost-effective, high-resolution dissection of crop genomes by transcriptome sequencing will increase the efficiency of predictive breeding even in the absence of a complete genome sequence.

  13. Metaplastic breast carcinomas display genomic and transcriptomic heterogeneity [corrected].

    PubMed

    Weigelt, Britta; Ng, Charlotte K Y; Shen, Ronglai; Popova, Tatiana; Schizas, Michail; Natrajan, Rachael; Mariani, Odette; Stern, Marc-Henri; Norton, Larry; Vincent-Salomon, Anne; Reis-Filho, Jorge S

    2015-03-01

    features of metaplastic breast carcinomas is reflected at the transcriptomic level, and an association between molecular subtypes and histology was observed. BRCA1-like genomic profiles were found only in a subset (31%) of metaplastic breast cancers, and were not associated with a specific molecular or histologic subtype.

  14. Genome and Transcriptome Analyses Provide Insight into the Euryhaline Adaptation Mechanism of Crassostrea gigas

    PubMed Central

    Zhang, Linlin; Li, Chunyan; Li, Li; She, Zhicai; Huang, Baoyu; Zhang, Guofan

    2013-01-01

    the most important effectors for oyster euryhaline adaptation. This study was the first to explain oyster euryhaline adaptation at a genome-wide scale in C. gigas. PMID:23554902

  15. Construction of Brassica A and C genome-based ordered pan-transcriptomes for use in rapeseed genomic research.

    PubMed

    He, Zhesi; Cheng, Feng; Li, Yi; Wang, Xiaowu; Parkin, Isobel A P; Chalhoub, Boulos; Liu, Shengyi; Bancroft, Ian

    2015-09-01

    This data article reports the establishment of the first pan-transcriptome resources for the Brassica A and C genomes. These were developed using existing coding DNA sequence (CDS) gene models from the now-published Brassica oleracea TO1000 and Brassica napus Darmor-bzh genome sequence assemblies representing the chromosomes of these species, along with preliminary CDS models from an updated Brassica rapa Chiifu genome sequence assembly. The B. rapa genome sequence scaffolds required splitting and re-ordering to match the expected genome organisation based on a high density SNP linkage map, but the B. oleracea assembly was used unchanged. The resulting B. rapa (A genome) pseudomolecules contained 47,656 ordered CDS models and the B. oleracea (C genome) pseudomolecules contained 54,766 ordered CDS models. Interpolation of B. napus CDS models not already represented by orthologues resulted in 52,790 and 63,308 ordered CDS models in the A and C pan-transcriptomes, an increase of 13,676 overall. Comparison of the organisation of this resource with publicly available genome sequences for B. napus showed excellent consistency for the B. napus Darmor-bzh resource, but more breakdown of collinearity for the B. napus ZS11 resource. CDS datasets comprising the pan-transcriptomes are available with this article (B. rapa) or from public repositories (B. oleracea and B. napus).
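
    The stated overall increase of 13,676 CDS models can be reproduced from the per-genome counts in the abstract. The sketch below is a simple arithmetic check; the variable names are illustrative.

```python
# Ordered CDS models before interpolation of B. napus models (from the abstract).
base_models = {"A": 47_656, "C": 54_766}

# Ordered CDS models in the final pan-transcriptomes (from the abstract).
pan_models = {"A": 52_790, "C": 63_308}

# Per-genome gain contributed by B. napus CDS models lacking orthologues.
gain = {genome: pan_models[genome] - base_models[genome] for genome in base_models}
total_gain = sum(gain.values())

for genome, added in gain.items():
    print(f"{genome} genome: +{added:,} CDS models")
print(f"overall increase: {total_gain:,}")
```

    The A genome gains 5,134 models and the C genome 8,542, which together give the 13,676 reported.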

  16. Construction of Brassica A and C genome-based ordered pan-transcriptomes for use in rapeseed genomic research

    PubMed Central

    He, Zhesi; Cheng, Feng; Li, Yi; Wang, Xiaowu; Parkin, Isobel A.P.; Chalhoub, Boulos; Liu, Shengyi; Bancroft, Ian

    2015-01-01

    This data article reports the establishment of the first pan-transcriptome resources for the Brassica A and C genomes. These were developed using existing coding DNA sequence (CDS) gene models from the now-published Brassica oleracea TO1000 and Brassica napus Darmor-bzh genome sequence assemblies representing the chromosomes of these species, along with preliminary CDS models from an updated Brassica rapa Chiifu genome sequence assembly. The B. rapa genome sequence scaffolds required splitting and re-ordering to match the expected genome organisation based on a high density SNP linkage map, but the B. oleracea assembly was used unchanged. The resulting B. rapa (A genome) pseudomolecules contained 47,656 ordered CDS models and the B. oleracea (C genome) pseudomolecules contained 54,766 ordered CDS models. Interpolation of B. napus CDS models not already represented by orthologues resulted in 52,790 and 63,308 ordered CDS models in the A and C pan-transcriptomes, an increase of 13,676 overall. Comparison of the organisation of this resource with publicly available genome sequences for B. napus showed excellent consistency for the B. napus Darmor-bzh resource, but more breakdown of collinearity for the B. napus ZS11 resource. CDS datasets comprising the pan-transcriptomes are available with this article (B. rapa) or from public repositories (B. oleracea and B. napus). PMID:26217816

  17. Novel genomic resources for a climate change sensitive mammal: characterization of the American pika transcriptome

    PubMed Central

    2013-01-01

    a positive match with the hemoglobin alpha chain from the plateau pika, a species restricted to high elevation steppes in Asia. Elevation-specific contigs may represent candidate regions subject to differential levels of gene expression along this elevation gradient. Conclusions To our knowledge, this is the first broad-scale, transcriptome-level study conducted within the Ochotonidae, providing novel genomic resources for studying pika ecology, behaviour and population history. PMID:23663654

  18. Genomic and transcriptomic alterations following hybridisation and genome doubling in trigenomic allohexaploid Brassica carinata × Brassica rapa.

    PubMed

    Xu, Y; Zhao, Q; Mei, S; Wang, J

    2012-09-01

    Allopolyploidisation is a prominent evolutionary force that involves two major events: interspecific hybridisation and genome doubling. Both events have important functional consequences in shaping the genomic architecture of the neo-allopolyploids. The respective effects of hybridisation and genome doubling upon genomic and transcriptomic changes in Brassica allopolyploids are unresolved. In this study, amplified fragment length polymorphism (AFLP), methylation-sensitive amplification polymorphism (MSAP) and cDNA-AFLP approaches were used to track genetic, epigenetic and transcriptional changes in both allohexaploid Brassica (ArArBcBcCcCc genome) and triploid hybrids (ArBcCc genome). Results from these groups were compared with each other and also to their parents Brassica carinata (BBCC genome) and Brassica rapa (AA genome). Rapid and dramatic genetic, DNA methylation and gene expression changes were detected in the triploid hybrids. During the shift from triploidy to allohexaploidy, some of the hybridisation-induced alterations underwent reversion. Additionally, novel genetic, epigenetic and transcriptional alterations were also detected. The proportions of A-genome-specific DNA methylation and gene expression alterations were significantly greater than those of BC-genome-specific alterations in the triploid hybrids. However, the two parental genomes were equally affected during the ploidy shift. Hemi-CCG methylation changes induced by hybridisation were recovered after genome doubling. Full-CG methylation changes were a more general process initiated in the hybrid and continued after genome doubling. These results indicate that genome doubling could ameliorate genomic and transcriptomic alterations induced by hybridisation and instigate additional alterations in trigenomic Brassica allohexaploids. Moreover, genome doubling also modified hybridisation-induced progenitor genome-biased alterations and epigenetic alteration characteristics.

  19. Marine Genomics: A clearing-house for genomic and transcriptomic data of marine organisms

    PubMed Central

    McKillen, David J; Chen, Yian A; Chen, Chuming; Jenny, Matthew J; Trent, Harold F; Robalino, Javier; McLean, David C; Gross, Paul S; Chapman, Robert W; Warr, Gregory W; Almeida, Jonas S

    2005-01-01

    Background The Marine Genomics project is a functional genomics initiative developed to provide a pipeline for the curation of Expressed Sequence Tags (ESTs) and gene expression microarray data for marine organisms. It provides a unique clearing-house for marine specific EST and microarray data and is currently available at . Description The Marine Genomics pipeline automates the processing, maintenance, storage and analysis of EST and microarray data for an increasing number of marine species. It currently contains 19 species databases (over 46,000 EST sequences) that are maintained by registered users from local and remote locations in Europe and South America in addition to the USA. A collection of analysis tools are implemented. These include a pipeline upload tool for EST FASTA file, sequence trace file and microarray data, an annotative text search, automated sequence trimming, sequence quality control (QA/QC) editing, sequence BLAST capabilities and a tool for interactive submission to GenBank. Another feature of this resource is the integration with a scientific computing analysis environment implemented by MATLAB. Conclusion The conglomeration of multiple marine organisms with integrated analysis tools enables users to focus on the comprehensive descriptions of transcriptomic responses to typical marine stresses. This cross species data comparison and integration enables users to contain their research within a marine-oriented data management and analysis environment. PMID:15760464

  20. Systems perspectives on erythromycin biosynthesis by comparative genomic and transcriptomic analyses of S. erythraea E3 and NRRL23338 strains

    PubMed Central

    2013-01-01

    Background S. erythraea is a Gram-positive filamentous bacterium used for the industrial-scale production of erythromycin A which is of high clinical importance. In this work, we sequenced the whole genome of a high-producing strain (E3) obtained by random mutagenesis and screening from the wild-type strain NRRL23338, and examined time-series expression profiles of both E3 and NRRL23338. Based on the genomic data and transcriptomic data of these two strains, we carried out a comparative analysis of the high-producing strain and the wild-type strain at both the genomic level and the transcriptomic level. Results We observed a large number of genetic variants including 60 insertions, 46 deletions and 584 single nucleotide variations (SNV) in E3 in comparison with NRRL23338, and the analysis of time series transcriptomic data indicated that the genes involved in erythromycin biosynthesis and feeder pathways were significantly up-regulated during the 60-hour time course. According to our data, BldD, a previously identified ery cluster regulator, did not show any positive correlations with the expression of the ery cluster, suggesting the existence of alternative regulation mechanisms of erythromycin synthesis in S. erythraea. Several potential regulators were then proposed by integration analysis of genomic and transcriptomic data. Conclusion This is a demonstration of the functional comparative genomics between an industrial S. erythraea strain and the wild-type strain. These findings help to understand the global regulation mechanisms of erythromycin biosynthesis in S. erythraea, providing useful clues for genetic and metabolic engineering in the future. PMID:23902230

  1. Primary analysis of repeat elements of the Asian seabass (Lates calcarifer) transcriptome and genome.

    PubMed

    Kuznetsova, Inna S; Thevasagayam, Natascha M; Sridatta, Prakki S R; Komissarov, Aleksey S; Saju, Jolly M; Ngoh, Si Y; Jiang, Junhui; Shen, Xueyan; Orbán, László

    2014-01-01

    As part of our Asian seabass genome project, we are generating an inventory of repeat elements in the genome and transcriptome. The karyotype showed a diploid number of 2n = 24 chromosomes with a variable number of B-chromosomes. The transcriptome and genome of Asian seabass were searched for repetitive elements with experimental and bioinformatics tools. Six different types of repeats constituting 8-14% of the genome were characterized. Repetitive elements were clustered in the pericentromeric heterochromatin of all chromosomes, but some of them were preferentially accumulated in pretelomeric and pericentromeric regions of several chromosome pairs and have chromosome-specific arrangements. From the dispersed class of fish-specific non-LTR retrotransposon elements, Rex1 and MAUI-like repeats were analyzed. They were widespread in both the genome and transcriptome and accumulated in the pericentromeric and peritelomeric areas of all chromosomes. Every analyzed repeat was represented in the Asian seabass transcriptome, and some showed differential expression between the gonads. The other group of repeats analyzed belongs to the rRNA multigene family. The FISH signal for 5S rDNA was located on a single pair of chromosomes, whereas that for 18S rDNA was found on two pairs. A BAC-derived contig containing rDNA was sequenced and assembled into a scaffold containing incomplete fragments of 18S rDNA. Their assembly and chromosomal position revealed that this part of the Asian seabass genome is extremely rich in repeats containing evolutionarily conserved and novel sequences. In summary, transcriptome assemblies and cDNA data are suitable for the identification of repetitive DNA from unknown genomes and for comparative investigation of conserved elements between teleosts and other vertebrates. PMID:25120555

  3. Transcriptome complexity in a genome-reduced bacterium.

    PubMed

    Güell, Marc; van Noort, Vera; Yus, Eva; Chen, Wei-Hua; Leigh-Bell, Justine; Michalodimitrakis, Konstantinos; Yamada, Takuji; Arumugam, Manimozhiyan; Doerks, Tobias; Kühner, Sebastian; Rode, Michaela; Suyama, Mikita; Schmidt, Sabine; Gavin, Anne-Claude; Bork, Peer; Serrano, Luis

    2009-11-27

    To study basic principles of transcriptome organization in bacteria, we analyzed one of the smallest self-replicating organisms, Mycoplasma pneumoniae. We combined strand-specific tiling arrays, complemented by transcriptome sequencing, with more than 252 spotted arrays. We detected 117 previously undescribed, mostly noncoding transcripts, 89 of them in antisense configuration to known genes. We identified 341 operons, of which 139 are polycistronic; almost half of the latter show decaying expression in a staircase-like manner. Under various conditions, operons could be divided into 447 smaller transcriptional units, resulting in many alternative transcripts. Frequent antisense transcripts, alternative transcripts, and multiple regulators per gene imply a highly dynamic transcriptome, more similar to that of eukaryotes than previously thought.

  4. Identifying characteristic scales in the human genome

    NASA Astrophysics Data System (ADS)

    Carpena, P.; Bernaola-Galván, P.; Coronado, A. V.; Hackenberg, M.; Oliver, J. L.

    2007-03-01

    The scale-free, long-range correlations detected in DNA sequences contrast with characteristic lengths of genomic elements, being particularly incompatible with the isochores (long, homogeneous DNA segments). By computing the local behavior of the scaling exponent α of detrended fluctuation analysis (DFA), we discriminate between sequences with and without true scaling, and we find that no single scaling exists in the human genome. Instead, human chromosomes show a common compositional structure with two characteristic scales, the large one corresponding to the isochores and the other to small and medium scale genomic elements.
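
    As an illustration of the method named in this abstract, the following is a minimal sketch of first-order detrended fluctuation analysis on a DNA string. It is not the authors' implementation: the base mapping (G/C to +1, A/T to -1), the window sizes, and all function names are assumptions chosen for the example. The global exponent is the slope of log F(n) against log n, and the "local" values are the slopes between successive window sizes.

```python
import math

def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def profile(seq, strong="GC"):
    """Map bases to +1 (G/C) or -1 (A/T, assumed mapping) and integrate
    the mean-subtracted series into a cumulative profile."""
    x = [1.0 if base in strong else -1.0 for base in seq.upper()]
    mean = sum(x) / len(x)
    y, total = [], 0.0
    for value in x:
        total += value - mean
        y.append(total)
    return y

def fluctuation(y, n):
    """RMS residual after removing a linear trend from each
    non-overlapping window of length n (use n >= 3)."""
    if n > len(y):
        raise ValueError("window longer than the profile")
    t = list(range(n))
    squared, count = 0.0, 0
    for start in range(0, len(y) - n + 1, n):
        window = y[start:start + n]
        slope, intercept = linear_fit(t, window)
        squared += sum((w - (slope * i + intercept)) ** 2
                       for i, w in zip(t, window))
        count += n
    return math.sqrt(squared / count)

def dfa(seq, scales):
    """Return (global alpha, list of local slopes between successive scales)."""
    y = profile(seq)
    log_n = [math.log(n) for n in scales]
    log_f = [math.log(fluctuation(y, n)) for n in scales]
    alpha, _ = linear_fit(log_n, log_f)
    local = [(log_f[i + 1] - log_f[i]) / (log_n[i + 1] - log_n[i])
             for i in range(len(scales) - 1)]
    return alpha, local
```

    For an uncorrelated random sequence the global exponent should come out near 0.5; a local slope that changes with window size, rather than staying constant, is the kind of signal the abstract interprets as the absence of a single scaling regime.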

  5. De novo Transcriptome Assemblies of Rana (Lithobates) catesbeiana and Xenopus laevis Tadpole Livers for Comparative Genomics without Reference Genomes.

    PubMed

    Birol, Inanc; Behsaz, Bahar; Hammond, S Austin; Kucuk, Erdi; Veldhoen, Nik; Helbing, Caren C

    2015-01-01

    In this work we studied the liver transcriptomes of two frog species, the American bullfrog (Rana (Lithobates) catesbeiana) and the African clawed frog (Xenopus laevis). We used high throughput RNA sequencing (RNA-seq) data to assemble and annotate these transcriptomes, and compared how their baseline expression profiles change when tadpoles of the two species are exposed to thyroid hormone. We generated more than 1.5 billion RNA-seq reads in total for the two species under two conditions as treatment/control pairs. We de novo assembled these reads using Trans-ABySS to reconstruct reference transcriptomes, obtaining over 350,000 and 130,000 putative transcripts for R. catesbeiana and X. laevis, respectively. Using available genomics resources for X. laevis, we annotated over 97% of our X. laevis transcriptome contigs, demonstrating the utility and efficacy of our methodology. Leveraging this validated analysis pipeline, we also annotated the assembled R. catesbeiana transcriptome. We used the expression profiles of the annotated genes of the two species to examine the similarities and differences between the tadpole liver transcriptomes. We also compared the gene ontology terms of expressed genes to measure how the animals react to a challenge by thyroid hormone. Our study reports three main conclusions. First, de novo assembly of RNA-seq data is a powerful method for annotating and establishing transcriptomes of non-model organisms. Second, the liver transcriptomes of the two frog species, R. catesbeiana and X. laevis, show many common features, and the distributions of their gene ontology profiles are statistically indistinguishable. Third, although they broadly respond the same way to the presence of thyroid hormone in their environment, their receptor/signal transduction pathways display marked differences. PMID:26121473

  6. De novo Transcriptome Assemblies of Rana (Lithobates) catesbeiana and Xenopus laevis Tadpole Livers for Comparative Genomics without Reference Genomes

    PubMed Central

    Birol, Inanc; Behsaz, Bahar; Hammond, S. Austin; Kucuk, Erdi; Veldhoen, Nik; Helbing, Caren C.

    2015-01-01

    In this work we studied the liver transcriptomes of two frog species, the American bullfrog (Rana (Lithobates) catesbeiana) and the African clawed frog (Xenopus laevis). We used high throughput RNA sequencing (RNA-seq) data to assemble and annotate these transcriptomes, and compared how their baseline expression profiles change when tadpoles of the two species are exposed to thyroid hormone. We generated more than 1.5 billion RNA-seq reads in total for the two species under two conditions as treatment/control pairs. We de novo assembled these reads using Trans-ABySS to reconstruct reference transcriptomes, obtaining over 350,000 and 130,000 putative transcripts for R. catesbeiana and X. laevis, respectively. Using available genomics resources for X. laevis, we annotated over 97% of our X. laevis transcriptome contigs, demonstrating the utility and efficacy of our methodology. Leveraging this validated analysis pipeline, we also annotated the assembled R. catesbeiana transcriptome. We used the expression profiles of the annotated genes of the two species to examine the similarities and differences between the tadpole liver transcriptomes. We also compared the gene ontology terms of expressed genes to measure how the animals react to a challenge by thyroid hormone. Our study reports three main conclusions. First, de novo assembly of RNA-seq data is a powerful method for annotating and establishing transcriptomes of non-model organisms. Second, the liver transcriptomes of the two frog species, R. catesbeiana and X. laevis, show many common features, and the distributions of their gene ontology profiles are statistically indistinguishable. Third, although they broadly respond the same way to the presence of thyroid hormone in their environment, their receptor/signal transduction pathways display marked differences. PMID:26121473

  8. Genomic, transcriptomic and phenomic variation reveals the complex adaptation to stress response of modern maize breeding

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Early maize adaptation to different agricultural environments was an important process associated with the creation of a stable food supply that allowed the evolution of human civilization in the Americas. To explore the mechanisms of maize adaptation, genomic, transcriptomic and phenomic data were ...

  10. Genome wide transcriptome profiling reveals differential gene expression in secondary metabolite pathway of Cymbopogon winterianus

    PubMed Central

    Devi, Kamalakshi; Mishra, Surajit K.; Sahu, Jagajjit; Panda, Debashis; Modi, Mahendra K.; Sen, Priyabrata

    2016-01-01

    Advances in transcriptome sequencing provide a fast, cost-effective and reliable approach to generate large expression datasets, especially suitable for non-model species, to identify putative genes, key pathways and regulatory mechanisms. Citronella (Cymbopogon winterianus) is an aromatic medicinal grass used for its anti-tumoral, antibacterial, anti-fungal, antiviral, detoxifying and natural insect repellent properties. Despite its many uses, the genes involved in the terpene biosynthetic pathway have not yet been clearly elucidated. The present study is a pioneering attempt to generate exhaustive molecular information on the secondary metabolite pathway and to increase genomic resources in Citronella. Using high-throughput RNA-Seq technology, the root and leaf transcriptomes were analysed at an unprecedented depth (11.7 Gb). Targeted searches identified the majority of the genes associated with metabolic pathways and other natural product pathways, viz. antibiotic synthesis, along with many novel genes. Comparative expression results for terpenoid biosynthesis genes were validated for 15 unigenes by RT-PCR and qRT-PCR. Thus the coverage of this transcriptome is comprehensive enough to discover all known genes of major metabolic pathways. This transcriptome dataset can serve as important public information for gene expression, genomics and functional genomics studies in Citronella and shall act as a benchmark for future improvement of the crop. PMID:26877149

  11. Genome wide transcriptome profiling reveals differential gene expression in secondary metabolite pathway of Cymbopogon winterianus.

    PubMed

    Devi, Kamalakshi; Mishra, Surajit K; Sahu, Jagajjit; Panda, Debashis; Modi, Mahendra K; Sen, Priyabrata

    2016-01-01

    Advances in transcriptome sequencing provide a fast, cost-effective and reliable approach to generate large expression datasets, especially suitable for non-model species, to identify putative genes, key pathways and regulatory mechanisms. Citronella (Cymbopogon winterianus) is an aromatic medicinal grass used for its anti-tumoral, antibacterial, anti-fungal, antiviral, detoxifying and natural insect repellent properties. Despite its many uses, the genes involved in the terpene biosynthetic pathway have not yet been clearly elucidated. The present study is a pioneering attempt to generate exhaustive molecular information on the secondary metabolite pathway and to increase genomic resources in Citronella. Using high-throughput RNA-Seq technology, the root and leaf transcriptomes were analysed at an unprecedented depth (11.7 Gb). Targeted searches identified the majority of the genes associated with metabolic pathways and other natural product pathways, viz. antibiotic synthesis, along with many novel genes. Comparative expression results for terpenoid biosynthesis genes were validated for 15 unigenes by RT-PCR and qRT-PCR. Thus the coverage of this transcriptome is comprehensive enough to discover all known genes of major metabolic pathways. This transcriptome dataset can serve as important public information for gene expression, genomics and functional genomics studies in Citronella and shall act as a benchmark for future improvement of the crop. PMID:26877149

  12. Allele Identification for Transcriptome-Based Population Genomics in the Invasive Plant Centaurea solstitialis

    PubMed Central

    Dlugosch, Katrina M.; Lai, Zhao; Bonin, Aurélie; Hierro, José; Rieseberg, Loren H.

    2013-01-01

    Transcriptome sequences are becoming more broadly available for multiple individuals of the same species, providing opportunities to derive population genomic information from these datasets. Using the 454 Life Science Genome Sequencer FLX and FLX-Titanium next-generation platforms, we generated 11−430 Mbp of sequence for normalized cDNA for 40 wild genotypes of the invasive plant Centaurea solstitialis, yellow starthistle, from across its worldwide distribution. We examined the impact of sequencing effort on transcriptome recovery and overlap among individuals. To do this, we developed two novel publicly available software pipelines: SnoWhite for read cleaning before assembly, and AllelePipe for clustering of loci and allele identification in assembled datasets with or without a reference genome. AllelePipe is designed specifically for cases in which read depth information is not appropriate or available to assist with disentangling closely related paralogs from allelic variation, as in transcriptome or previously assembled libraries. We find that modest applications of sequencing effort recover most of the novel sequences present in the transcriptome of this species, including single-copy loci and a representative distribution of functional groups. In contrast, the coverage of variable sites, observation of heterozygosity, and overlap among different libraries are all highly dependent on sequencing effort. Nevertheless, the information gained from overlapping regions was informative regarding coarse population structure and variation across our small number of population samples, providing the first genetic evidence in support of hypothesized invasion scenarios. PMID:23390612

  14. Allele identification for transcriptome-based population genomics in the invasive plant Centaurea solstitialis.

    PubMed

    Dlugosch, Katrina M; Lai, Zhao; Bonin, Aurélie; Hierro, José; Rieseberg, Loren H

    2013-02-01

    Transcriptome sequences are becoming more broadly available for multiple individuals of the same species, providing opportunities to derive population genomic information from these datasets. Using the 454 Life Science Genome Sequencer FLX and FLX-Titanium next-generation platforms, we generated 11-430 Mbp of sequence for normalized cDNA for 40 wild genotypes of the invasive plant Centaurea solstitialis, yellow starthistle, from across its worldwide distribution. We examined the impact of sequencing effort on transcriptome recovery and overlap among individuals. To do this, we developed two novel publicly available software pipelines: SnoWhite for read cleaning before assembly, and AllelePipe for clustering of loci and allele identification in assembled datasets with or without a reference genome. AllelePipe is designed specifically for cases in which read depth information is not appropriate or available to assist with disentangling closely related paralogs from allelic variation, as in transcriptome or previously assembled libraries. We find that modest applications of sequencing effort recover most of the novel sequences present in the transcriptome of this species, including single-copy loci and a representative distribution of functional groups. In contrast, the coverage of variable sites, observation of heterozygosity, and overlap among different libraries are all highly dependent on sequencing effort. Nevertheless, the information gained from overlapping regions was informative regarding coarse population structure and variation across our small number of population samples, providing the first genetic evidence in support of hypothesized invasion scenarios. PMID:23390612

  15. Next generation transcriptomics and genomics elucidate biological complexity of microglia in health and disease.

    PubMed

    Wes, Paul D; Holtman, Inge R; Boddeke, Erik W G M; Möller, Thomas; Eggen, Bart J L

    2016-02-01

    Genome-wide expression profiling technology has resulted in detailed transcriptome data for a wide range of tissues, conditions and diseases. In neuroscience, expression datasets were mostly generated using whole brain tissue samples, resulting in data from a mixture of cell types, including glial cells and neurons. Over the past few years, a rapidly increasing number of expression profiling studies using isolated microglial cell populations have been reported. In these studies, the microglia transcriptome was compared to other cell types, such as other brain cells and peripheral tissue macrophages, and related to aging and neurodegenerative conditions. A commonality found in many of these studies was that microglia possess distinct gene expression signatures. This repertoire of selectively expressed microglial genes highlights functions beyond immune responses, such as synaptic modulation and neurotrophic support, and opens up avenues to explore as-yet-unexpected roles. These data provide improved understanding of disease pathology, and complement not only the aforementioned whole brain tissue transcriptome studies, but also genome- and epigenome-wide association studies. In this review, insights obtained from isolated microglia transcriptome studies are presented, and compared to studies using other genome-wide approaches. The relation of microglia to other tissue macrophages and glial cell populations, as well as the role of microglia in the aging brain and in neurodegenerative conditions, will be discussed. Many more of these types of studies are expected in the near future, hopefully leading to the identification of novel genes and targets for neurodegenerative conditions.

  16. Maize pan-transcriptome provides novel insights into genome complexity and quantitative trait variation

    PubMed Central

    Jin, Minliang; Liu, Haijun; He, Cheng; Fu, Junjie; Xiao, Yingjie; Wang, Yuebin; Xie, Weibo; Wang, Guoying; Yan, Jianbing

    2016-01-01

    Gene expression variation contributes largely to phenotypic diversity, and constructing a pan-transcriptome is considered necessary for species with complex genomes. However, the regulation mechanisms and functional consequences of the pan-transcriptome have not been explored systematically. By analyzing RNA-seq data from 368 diverse maize inbred lines, we identified almost one-third of nuclear genes as being under expression presence and absence variation (ePAV); these genes tend to play regulatory roles and are likely regulated by distant eQTLs. The ePAV was directly used as “genotype” to perform GWAS for 15 agronomic phenotypes and 526 metabolic traits to efficiently explore the associations between transcriptomic and phenomic variations. Through a modified assembly strategy, 2,355 high-confidence novel sequences with a total length of 1.9 Mb were found to be absent from the reference genome. Ten randomly selected novel sequences were fully validated with genomic PCR, including another two NBS_LRR candidates that potentially affect flavonoids and disease resistance. A simulation analysis suggested that the pan-transcriptome of the maize whole kernel is approaching a maximum value of 63,000 genes, and through developing two test-cross populations and surveying several of the most important yield traits, the dispensable genes were shown to contribute to heterosis. Novel perspectives and resources to discover maize quantitative trait variations were provided to better understand the kernel regulation networks and to enhance maize breeding. PMID:26729541

  17. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    PubMed Central

    2012-01-01

    Background It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro. PMID:22901030

  18. The Arabidopsis root transcriptome by serial analysis of gene expression. Gene identification using the genome sequence.

    PubMed

    Fizames, Cécile; Muños, Stéphane; Cazettes, Céline; Nacry, Philippe; Boucherez, Jossia; Gaymard, Frédéric; Piquemal, David; Delorme, Valérie; Commes, Thérèse; Doumas, Patrick; Cooke, Richard; Marti, Jacques; Sentenac, Hervé; Gojon, Alain

    2004-01-01

    Large-scale identification of genes expressed in roots of the model plant Arabidopsis was performed by serial analysis of gene expression (SAGE), on a total of 144,083 sequenced tags, representing at least 15,964 different mRNAs. For tag to gene assignment, we developed a computational approach based on 26,620 genes annotated from the complete sequence of the genome. The procedure selected warrants the identification of the genes corresponding to the majority of the tags found experimentally, with a high level of reliability, and provides a reference database for SAGE studies in Arabidopsis. This new resource allowed us to characterize the expression of more than 3,000 genes, for which there is no expressed sequence tag (EST) or cDNA in the databases. Moreover, 85% of the tags were specific for one gene. To illustrate this advantage of SAGE for functional genomics, we show that our data allow an unambiguous analysis of most of the individual genes belonging to 12 different ion transporter multigene families. These results indicate that, compared with EST-based tag to gene assignment, the use of the annotated genome sequence greatly improves gene identification in SAGE studies. However, more than 6,000 different tags remained with no gene match, suggesting that a significant proportion of transcripts present in the roots originate from yet unknown or wrongly annotated genes. The root transcriptome characterized in this study markedly differs from those obtained in other organs, and provides a unique resource for investigating the functional specificities of the root system. As an example of the use of SAGE for transcript profiling in Arabidopsis, we report here the identification of 270 genes differentially expressed between roots of plants grown either with NO3- or NH4NO3 as N source.
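
    The tag-to-gene assignment described in this abstract can be sketched as follows. This is an illustrative outline only, not the authors' procedure: it assumes the classic SAGE convention of a 10-base tag taken immediately downstream of the 3'-most NlaIII site (CATG) of each annotated transcript, and the function names and result categories are hypothetical. Observed tags are then sorted into those matching exactly one gene, those matching several, and those with no match.

```python
from collections import defaultdict

ANCHOR = "CATG"  # NlaIII site; assumed anchoring enzyme for this sketch
TAG_LEN = 10     # assumed tag length (classic SAGE)

def virtual_tag(transcript):
    """Return the TAG_LEN bases following the 3'-most anchor site,
    or None if there is no site or too few bases after it."""
    seq = transcript.upper()
    pos = seq.rfind(ANCHOR)
    if pos == -1:
        return None
    start = pos + len(ANCHOR)
    tag = seq[start:start + TAG_LEN]
    return tag if len(tag) == TAG_LEN else None

def build_index(transcripts):
    """Map each virtual tag to the set of gene identifiers that produce it.
    `transcripts` is a dict of gene id -> transcript sequence."""
    index = defaultdict(set)
    for gene, seq in transcripts.items():
        tag = virtual_tag(seq)
        if tag is not None:
            index[tag].add(gene)
    return index

def assign(observed_tags, index):
    """Classify observed tags as unique (one gene), ambiguous (several
    genes) or unmatched (no annotated gene yields this tag)."""
    result = {"unique": {}, "ambiguous": {}, "unmatched": []}
    for tag in observed_tags:
        genes = index.get(tag, set())
        if len(genes) == 1:
            result["unique"][tag] = next(iter(genes))
        elif genes:
            result["ambiguous"][tag] = sorted(genes)
        else:
            result["unmatched"].append(tag)
    return result
```

    In these terms, the figure of 85% of tags being specific for one gene corresponds to the share of tags in the "unique" category, and the more than 6,000 tags with no gene match correspond to the "unmatched" list.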

  19. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

    PubMed Central

    2012-01-01

    Background The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using next generation sequencing approach. Results The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave ways for bulk production of environment friendly biopesticides. PMID:22958331

  20. Genomic and Transcriptomic Studies in Mycobacterium avium subspecies paratuberculosis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Microarray technology is an important tool in functional genomic research. It has enabled a deeper analysis of genomic diversity among bacteria belonging to the Mycobacterium avium Complex (MAC). In addition, the expression of thousands of genes can be studied simultaneously in a single experiment...

  1. Chapter 4 genomics, transcriptomics, and epigenomics in traumatic brain injury research.

    PubMed

    Puccio, Ava M; Alexander, Sheila

    2015-01-01

    The long-term effects and significant impact of the full spectrum of traumatic brain injury (TBI) have received increased attention in recent years. Despite increased research efforts, there has been little movement toward improving outcomes for the survivors of TBI. TBI is a heterogeneous condition with a complex biological response, and significant variability in human recovery contributes to the difficulty in identifying therapeutics that improve outcomes. Personalized medicine, identifying the best course of treatment for a given individual based on individual characteristics, has great potential to improve recovery for TBI survivors. The advances in medical genetics and genomics over the past 20 years have increased our understanding of many biological processes. A substantial amount of research has focused on the genomic, transcriptomic, and epigenomic profiles in many health and disease states, including recovery from TBI. The focus of this review chapter is to describe the current state of the science in genomic, transcriptomic, and epigenomic research in the TBI population. There have been some advancements toward understanding the genomic, transcriptomic, and epigenomic processes in humans, but much of this work remains at the preclinical stage. This current evidence does improve our understanding of TBI recovery, but also serves as an excellent platform upon which to build further study toward improved outcomes for this population.

  2. Recent advances in genomics and transcriptomics of cnidarians.

    PubMed

    Technau, Ulrich; Schwaiger, Michaela

    2015-12-01

    The advent of the genomic era has provided important and surprising insights into the deduced genetic composition of the common ancestor of cnidarians and bilaterians. This has changed our view of how genomes of metazoans evolve and when crucial gene families arose and diverged in animal evolution. Sequencing of several cnidarian genomes showed that cnidarians share a great part of their gene repertoire as well as genome synteny with vertebrates, with fewer gene losses in the anthozoan cnidarian lineage than, for example, in ecdysozoans like Drosophila melanogaster or Caenorhabditis elegans. The Hydra genome, on the other hand, has evolved more rapidly, as indicated by more divergent sequences, more cases of gene losses and many taxonomically restricted genes. Cnidarian genomes also contain a rich repertoire of transcription factors, including those that in bilaterian model organisms regulate the development of key bilaterian traits such as mesoderm, nervous system development and bilaterality. The sea anemone Nematostella vectensis, and possibly cnidarians in general, does not only share its complex gene repertoire with bilaterians, but also the regulation of crucial developmental regulatory genes via distal enhancer elements. In addition, epigenetic modifications on DNA and chromatin are shared among eumetazoans. This suggests that most conserved genes present in our genomes today, as well as the mechanisms guiding their expression, evolved before the divergence of cnidarians and bilaterians about 600 Myr ago. PMID:26421490

  3. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts

    PubMed Central

    Liu, Zhanjiang; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Yun; Jiang, Chen; Sun, Luyang; Wang, Ruijia; Zhang, Yu; Zhou, Tao; Zeng, Qifan; Fu, Qiang; Gao, Sen; Li, Ning; Koren, Sergey; Jiang, Yanliang; Zimin, Aleksey; Xu, Peng; Phillippy, Adam M.; Geng, Xin; Song, Lin; Sun, Fanyue; Li, Chao; Wang, Xiaozhu; Chen, Ailu; Jin, Yulin; Yuan, Zihao; Yang, Yujia; Tan, Suxu; Peatman, Eric; Lu, Jianguo; Qin, Zhenkui; Dunham, Rex; Li, Zhaoxia; Sonstegard, Tad; Feng, Jianbin; Danzmann, Roy G.; Schroeder, Steven; Scheffler, Brian; Duke, Mary V.; Ballard, Linda; Kucuktas, Huseyin; Kaltenboeck, Ludmilla; Liu, Haixia; Armbruster, Jonathan; Xie, Yangjie; Kirby, Mona L.; Tian, Yi; Flanagan, Mary Elizabeth; Mu, Weijie; Waldbieser, Geoffrey C.

    2016-01-01

    Catfish represent 12% of teleost or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provide crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance. PMID:27249958

  4. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts.

    PubMed

    Liu, Zhanjiang; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Yun; Jiang, Chen; Sun, Luyang; Wang, Ruijia; Zhang, Yu; Zhou, Tao; Zeng, Qifan; Fu, Qiang; Gao, Sen; Li, Ning; Koren, Sergey; Jiang, Yanliang; Zimin, Aleksey; Xu, Peng; Phillippy, Adam M; Geng, Xin; Song, Lin; Sun, Fanyue; Li, Chao; Wang, Xiaozhu; Chen, Ailu; Jin, Yulin; Yuan, Zihao; Yang, Yujia; Tan, Suxu; Peatman, Eric; Lu, Jianguo; Qin, Zhenkui; Dunham, Rex; Li, Zhaoxia; Sonstegard, Tad; Feng, Jianbin; Danzmann, Roy G; Schroeder, Steven; Scheffler, Brian; Duke, Mary V; Ballard, Linda; Kucuktas, Huseyin; Kaltenboeck, Ludmilla; Liu, Haixia; Armbruster, Jonathan; Xie, Yangjie; Kirby, Mona L; Tian, Yi; Flanagan, Mary Elizabeth; Mu, Weijie; Waldbieser, Geoffrey C

    2016-01-01

    Catfish represent 12% of teleost or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provide crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance. PMID:27249958

  5. Reliable transformation system for Microbotryum lychnidis-dioicae informed by genome and transcriptome project.

    PubMed

    Toh, Su San; Treves, David S; Barati, Michelle T; Perlin, Michael H

    2016-10-01

    Microbotryum lychnidis-dioicae is a member of a species complex infecting host plants in the Caryophyllaceae. It is used as a model system in many areas of research, but attempts to make this organism tractable for reverse genetic approaches have not been fruitful. Here, we exploited the recently obtained genome sequence and transcriptome analysis to inform our design of constructs for use in Agrobacterium-mediated transformation techniques currently available for other fungi. Reproducible transformation was demonstrated at the genomic, transcriptional and functional levels. Moreover, these initial proof-of-principle experiments provide evidence that supports the findings from initial global transcriptome analysis regarding expression from the respective promoters under different growth conditions of the fungus. The technique thus provides for the first time the ability to stably introduce transgenes and over-express target M. lychnidis-dioicae genes. PMID:27215216

  6. Oil Accumulation by the Oleaginous Diatom Fistulifera solaris as Revealed by the Genome and Transcriptome

    PubMed Central

    Veluchamy, Alaguraj; Tanaka, Michihiro; Abida, Heni; Maréchal, Eric; Bowler, Chris; Muto, Masaki; Sunaga, Yoshihiko; Tanaka, Masayoshi; Taniguchi, Takeaki; Fukuda, Yorikane; Nemoto, Michiko; Matsumoto, Mitsufumi; Wong, Pui Shan; Aburatani, Sachiyo; Fujibuchi, Wataru

    2015-01-01

    Oleaginous photosynthetic organisms such as microalgae are promising sources for biofuel production through the generation of carbon-neutral sustainable energy. However, the metabolic mechanisms driving high-rate lipid production in these oleaginous organisms remain unclear, thus impeding efforts to improve productivity through genetic modifications. We analyzed the genome and transcriptome of the oleaginous diatom Fistulifera solaris JPCC DA0580. Next-generation sequencing technology provided evidence of an allodiploid genome structure, suggesting unorthodox molecular evolutionary and genetic regulatory systems for reinforcing metabolic efficiencies. Although major metabolic pathways were shared with nonoleaginous diatoms, transcriptome analysis revealed unique expression patterns, such as concomitant upregulation of fatty acid/triacylglycerol biosynthesis and fatty acid degradation (β-oxidation) in concert with ATP production. This peculiar pattern of gene expression may account for the simultaneous growth and oil accumulation phenotype and may inspire novel biofuel production technology based on this oleaginous microalga. PMID:25634988

  7. Oil accumulation by the oleaginous diatom Fistulifera solaris as revealed by the genome and transcriptome.

    PubMed

    Tanaka, Tsuyoshi; Maeda, Yoshiaki; Veluchamy, Alaguraj; Tanaka, Michihiro; Abida, Heni; Maréchal, Eric; Bowler, Chris; Muto, Masaki; Sunaga, Yoshihiko; Tanaka, Masayoshi; Yoshino, Tomoko; Taniguchi, Takeaki; Fukuda, Yorikane; Nemoto, Michiko; Matsumoto, Mitsufumi; Wong, Pui Shan; Aburatani, Sachiyo; Fujibuchi, Wataru

    2015-01-01

    Oleaginous photosynthetic organisms such as microalgae are promising sources for biofuel production through the generation of carbon-neutral sustainable energy. However, the metabolic mechanisms driving high-rate lipid production in these oleaginous organisms remain unclear, thus impeding efforts to improve productivity through genetic modifications. We analyzed the genome and transcriptome of the oleaginous diatom Fistulifera solaris JPCC DA0580. Next-generation sequencing technology provided evidence of an allodiploid genome structure, suggesting unorthodox molecular evolutionary and genetic regulatory systems for reinforcing metabolic efficiencies. Although major metabolic pathways were shared with nonoleaginous diatoms, transcriptome analysis revealed unique expression patterns, such as concomitant upregulation of fatty acid/triacylglycerol biosynthesis and fatty acid degradation (β-oxidation) in concert with ATP production. This peculiar pattern of gene expression may account for the simultaneous growth and oil accumulation phenotype and may inspire novel biofuel production technology based on this oleaginous microalga.

  8. Large-Scale Transcriptome Analysis of Retroelements in the Migratory Locust, Locusta migratoria

    PubMed Central

    Guo, Wei; Wang, Xianhui; Kang, Le

    2012-01-01

    Background Retroelements can successfully colonize eukaryotic genomes through RNA-mediated transposition, and are considered to be some of the major mediators of genome size. The migratory locust Locusta migratoria is an insect with a large genome size, and its genome is probably subject to the proliferation of retroelements. An analysis of deep-sequencing transcriptome data will elucidate the structure, diversity and expression characteristics of retroelements. Results We performed a de novo assembly from deep sequencing RNA-seq data and identified 105 retroelements in the locust transcriptome. Phylogenetic analysis of reverse transcriptase sequences revealed 1 copia, 1 BEL, 8 gypsy and 23 non-long terminal repeat (LTR) retroelements in the locust transcriptome. A novel approach was developed to identify full-length LTR retroelements. A total of 5 full-length LTR retroelements and 2 full-length non-LTR retroelements that contained complete structures for retrotransposition were identified. Structural analysis indicated that all these retroelements may have been activated or deprived of retrotransposition activities very recently. Expression profiling analysis revealed that the retroelements exhibited a unique expression pattern at the egg stage and showed differential expression profiles between the solitarious and gregarious phases at the fifth instar and adult stage. Conclusion We hereby present the first de novo transcriptome analysis of retroelements in a species whose genome is not available. This work contributes to a comprehensive understanding of the landscape of retroelements in the locust transcriptome. More importantly, the results reveal that non-LTR retroelements are abundant and diverse in the locust transcriptome. PMID:22792363

  9. Finding genome-transcriptome-phenome association with structured association mapping and visualization in GenAMap.

    PubMed

    Curtis, Ross E; Yin, Junming; Kinnaird, Peter; Xing, Eric P

    2012-01-01

    Despite the success of genome-wide association studies in detecting novel disease variants, we are still far from a complete understanding of the mechanisms through which variants cause disease. Most previous studies have considered only genome-phenome associations. However, the integration of transcriptome data may help further elucidate the mechanisms through which genetic mutations lead to disease and uncover potential pathways to target for treatment. We present a novel structured association mapping strategy for finding genome-transcriptome-phenome associations when SNP, gene-expression, and phenotype data are available for the same cohort. We do so via a two-step procedure where genome-transcriptome associations are identified by GFlasso, a sparse regression technique presented previously. Transcriptome-phenome associations are then found by a novel proposed method called gGFlasso, which leverages structure inherent in the genes and phenotypic traits. Due to the complex nature of three-way association results, visualization tools can aid in the discovery of causal SNPs and regulatory mechanisms affecting diseases. Using well-grounded visualization techniques, we have designed new visualizations that filter through large three-way association results to detect interesting SNPs and associated genes and traits. The two-step GFlasso-gGFlasso algorithmic approach and new visualizations are integrated into GenAMap, a visual analytics system for structured association mapping. Results on simulated datasets show that our approach has the potential to increase the sensitivity and specificity of association studies, compared to existing procedures that do not exploit the full structural information of the data. We report results from an analysis on a publicly available mouse dataset, showing that identified SNP-gene-trait associations are compatible with known biology.

  10. KONAGAbase: a genomic and transcriptomic database for the diamondback moth, Plutella xylostella

    PubMed Central

    2013-01-01

    Background The diamondback moth (DBM), Plutella xylostella, is one of the most harmful insect pests for crucifer crops worldwide. DBM has rapidly evolved high resistance to most conventional insecticides such as pyrethroids, organophosphates, fipronil, spinosad, Bacillus thuringiensis, and diamides. Therefore, it is important to develop genomic and transcriptomic DBM resources for analysis of genes related to insecticide resistance, both to clarify the mechanism of resistance of DBM and to facilitate the development of insecticides with a novel mode of action for more effective and environmentally less harmful insecticide rotation. To contribute to this goal, we developed KONAGAbase, a genomic and transcriptomic database for DBM (KONAGA is the Japanese word for DBM). Description KONAGAbase provides (1) transcriptomic sequences of 37,340 ESTs/mRNAs and 147,370 RNA-seq contigs which were clustered and assembled into 84,570 unigenes (30,695 contigs, 50,548 pseudo singletons, and 3,327 singletons); and (2) genomic sequences of 88,530 WGS contigs with 246,244 degenerate contigs and 106,455 singletons from which 6,310 de novo identified repeat sequences and 34,890 predicted gene-coding sequences were extracted. The unigenes and predicted gene-coding sequences were clustered and 32,800 representative sequences were extracted as a comprehensive putative gene set. These sequences were annotated with BLAST descriptions, Gene Ontology (GO) terms, and Pfam descriptions, respectively. KONAGAbase contains rich graphical user interface (GUI)-based web interfaces for easy and efficient searching, browsing, and downloading sequences and annotation data. Five useful search interfaces consisting of BLAST search, keyword search, BLAST result-based search, GO tree-based search, and genome browser are provided. KONAGAbase is publicly available from our website (http://dbm.dna.affrc.go.jp/px/) through standard web browsers. 
    Conclusions KONAGAbase provides DBM comprehensive transcriptomic...

  11. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics.

    PubMed

    Verma, Mohit; Kumar, Vinay; Patel, Ravi K; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Chickpea is an important grain legume used as a rich source of protein in the human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides a comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea is available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. Utilities such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that the integrated information content of this database will accelerate functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html. PMID:26322998

  13. Comparative Transcriptome and Chloroplast Genome Analyses of Two Related Dipteronia Species

    PubMed Central

    Zhou, Tao; Chen, Chen; Wei, Yue; Chang, Yongxia; Bai, Guoqing; Li, Zhonghu; Kanwal, Nazish; Zhao, Guifang

    2016-01-01

    Dipteronia (order Sapindales) is an endangered genus endemic to China and has two living species, D. sinensis and D. dyeriana. The plants are closely related to the genus Acer, which is also classified in the order Sapindales. Evolutionary studies on Dipteronia have been hindered by the paucity of information on their genomes and plastids. Here, we used next generation sequencing to characterize the transcriptomes and complete chloroplast genomes of both Dipteronia species. A comparison of the transcriptomes of both species identified a total of 7814 orthologs. Estimation of selection pressures using Ka/Ks ratios showed that only 30 of 5435 orthologous pairs had a ratio significantly >1, i.e., showing positive selection. However, 4041 orthologs had a Ka/Ks < 0.5 (p < 0.05), suggesting that most genes had likely undergone purifying selection. Based on orthologous unigenes, 314 single copy nuclear genes (SCNGs) were identified. Through a combination of de novo and reference guided assembly, plastid genomes were obtained; that of D. sinensis was 157,080 bp and that of D. dyeriana was 157,071 bp. Both plastid genomes encoded 87 protein coding genes, 40 tRNAs, and 8 rRNAs; no significant differences were detected in the size, gene content, and organization of the two plastomes. We used the whole chloroplast genomes to determine the phylogeny of D. sinensis and D. dyeriana and confirmed that the two species were highly divergent. Overall, our study provides comprehensive transcriptomic and chloroplast genomic resources, which will be valuable for future evolutionary studies of Dipteronia. PMID:27790228
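
    The Ka/Ks reasoning in the abstract above can be illustrated with a short sketch. It applies only the thresholds quoted in the abstract (ratio > 1 for positive selection, ratio < 0.5 for purifying selection, p < 0.05 for significance). Estimating Ka, Ks and the p-value is assumed to have been done upstream; the function name and the category labels are hypothetical.

```python
def classify_selection(ka, ks, p_value, alpha=0.05):
    """Label an orthologous pair from precomputed Ka, Ks and a p-value.

    Ka: nonsynonymous substitutions per nonsynonymous site.
    Ks: synonymous substitutions per synonymous site.
    """
    if ks <= 0:
        return "undefined"        # the ratio cannot be formed without Ks
    if p_value >= alpha:
        return "not significant"
    ratio = ka / ks
    if ratio > 1:
        return "positive"         # excess of amino acid changing substitutions
    if ratio < 0.5:
        return "purifying"        # amino acid changes are mostly removed
    return "intermediate"


# Hypothetical values, for illustration only.
pairs = {
    "pair1": (0.20, 0.10, 0.01),
    "pair2": (0.01, 0.10, 0.001),
    "pair3": (0.08, 0.10, 0.03),
}
labels = {name: classify_selection(*values) for name, values in pairs.items()}
```

    The "intermediate" label is a choice made for this sketch; the abstract reports only the two outer categories.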

  14. SNP discovery in European anchovy (Engraulis encrasicolus, L) by high-throughput transcriptome and genome sequencing.

    PubMed

    Montes, Iratxe; Conklin, Darrell; Albaina, Aitor; Creer, Simon; Carvalho, Gary R; Santos, María; Estonba, Andone

    2013-01-01

    Increased throughput in sequencing technologies has facilitated the acquisition of detailed genomic information in non-model species. The focus of this research was to discover and validate SNPs derived from the European anchovy (Engraulis encrasicolus) transcriptome, a species with no available reference genome, using next generation sequencing technologies. A cDNA library was constructed from four tissues of ten fish individuals corresponding to three populations of E. encrasicolus, and Roche 454 GS FLX Titanium sequencing yielded 19,367 contigs. Additionally, the European anchovy genome was sequenced for the same ten individuals using an Illumina HiSeq2000. Using a computational pipeline for combining transcriptome and genome information, a total of 18,994 SNPs met the necessary minor allele frequency and depth filters. A series of further stringent filters were applied to identify those SNPs likely to succeed in genotyping assays, and for filtering of those in potential duplicated genome regions. A novel method for detecting potential intron-exon boundaries in areas of putative SNPs has also been applied in silico to improve genotyping success. In all, 2,317 filtered putative transcriptome SNPs suitable for genotyping primer design were identified. From those, a subset of 530 were selected, with the genotyping results showing the highest conversion and validation rates (91.3% and 83.2%, respectively) reported to date for a non-model species. This study represents a promising strategy to discover genotypable SNPs in the exome of non-model organisms. The genomic resource generated for E. encrasicolus, both in terms of sequences and novel markers, will be informative for research into this species with applications including traceability studies, population genetic analyses and aquaculture.
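
    The abstract above mentions minor allele frequency and depth filters without giving their thresholds. The sketch below shows what such a filter might look like; the threshold values (0.1 and 20), the function names and the allele counts are all hypothetical and are not taken from the study.

```python
def minor_allele_frequency(allele_counts):
    """Frequency of the second most common allele at a site."""
    total = sum(allele_counts.values())
    if total == 0 or len(allele_counts) < 2:
        return 0.0
    ordered = sorted(allele_counts.values(), reverse=True)
    return ordered[1] / total


def passes_filters(allele_counts, min_maf=0.1, min_depth=20):
    """Keep a candidate SNP only if it is deep enough and variable enough."""
    depth = sum(allele_counts.values())
    return depth >= min_depth and minor_allele_frequency(allele_counts) >= min_maf


# Hypothetical read counts per allele at three candidate sites.
candidates = {
    "snp1": {"A": 30, "G": 10},  # depth 40, MAF 0.25
    "snp2": {"A": 39, "G": 1},   # depth 40, MAF 0.025
    "snp3": {"C": 5, "T": 5},    # depth 10, MAF 0.5
}
kept = [name for name, counts in candidates.items() if passes_filters(counts)]
```

    Only snp1 survives here: snp2 fails the frequency cut-off and snp3 fails the depth cut-off.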

  15. Genome scale engineering techniques for metabolic engineering.

    PubMed

    Liu, Rongming; Bassalo, Marcelo C; Zeitoun, Ramsey I; Gill, Ryan T

    2015-11-01

    Metabolic engineering has expanded from a focus on designs requiring a small number of genetic modifications to increasingly complex designs driven by advances in genome-scale engineering technologies. Metabolic engineering has been generally defined by the use of iterative cycles of rational genome modifications, strain analysis and characterization, and a synthesis step that fuels additional hypothesis generation. This cycle mirrors the Design-Build-Test-Learn cycle followed throughout various engineering fields that has recently become a defining aspect of synthetic biology. This review will attempt to summarize recent genome-scale design, build, test, and learn technologies and relate their use to a range of metabolic engineering applications.

  16. Transcriptome characterization and SSR discovery in large-scale loach Paramisgurnus dabryanus (Cobitidae, Cypriniformes).

    PubMed

    Li, Caijuan; Ling, Qufei; Ge, Chen; Ye, Zhuqing; Han, Xiaofei

    2015-02-25

    The large-scale loach (Paramisgurnus dabryanus, Cypriniformes) is a bottom-dwelling freshwater species of fish found mainly in eastern Asia. The natural germplasm resources of this important aquaculture species have recently been threatened due to overfishing and artificial propagation. The objective of this study is to obtain the first functional genomic resource and candidate molecular markers for future conservation and breeding research. Illumina paired-end sequencing generated over one hundred million reads that resulted in 71,887 assembled transcripts, with an average length of 1,465 bp. 42,093 (58.56%) protein-coding sequences were predicted; and 43,837 transcripts had significant matches to the NCBI nonredundant protein (Nr) database. 29,389 and 14,419 transcripts were assigned into gene ontology (GO) categories and Eukaryotic Orthologous Groups (KOG), respectively. 22,102 (31.14%) transcripts were mapped to 302 KEGG pathways. In addition, 15,106 candidate SSR markers were identified, with 11,037 pairs of PCR primers designed. 400 randomly selected SSR primer pairs were validated, of which 364 (91%) were able to produce PCR products. Further testing with 41 loci and 20 large-scale loach specimens collected from the four largest lakes in China showed that 36 (87.8%) loci were polymorphic. The transcriptomic profile and SSR repertoire obtained in this study will facilitate population genetic studies and selective breeding of large-scale loach in the future. PMID:25528212
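
    The abstract above does not name the tool or the repeat thresholds used for SSR discovery. As a rough illustration of how perfect microsatellites can be located in an assembled transcript, the sketch below uses a regular expression with a backreference. The motif length range (2 to 6 bp), the minimum of 5 repeats and the function name are assumptions made for this example.

```python
import re


def find_ssrs(seq, min_repeats=5, min_motif=2, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest motif first,
    so ACACACACAC is reported as (AC)5 rather than as a longer unit.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer runs are not treated as SSRs here
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical transcript fragment containing an (AC)6 repeat.
example_hits = find_ssrs("GGGACACACACACACTTGCA")
```

    Primer design around each hit, which the abstract reports for 11,037 loci, is a separate step that this sketch does not attempt.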

  18. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    PubMed

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species.

  19. The genome and developmental transcriptome of the strongylid nematode Haemonchus contortus

    PubMed Central

    2013-01-01

    Background: The barber's pole worm, Haemonchus contortus, is one of the most economically important parasites of small ruminants worldwide. Although this parasite can be controlled using anthelmintic drugs, resistance against most drugs in common use has become a widespread problem. We provide a draft of the genome and the transcriptomes of all key developmental stages of H. contortus to support biological and biotechnological research areas of this and related parasites. Results: The draft genome of H. contortus is 320 Mb in size and encodes 23,610 protein-coding genes. On a fundamental level, we elucidate transcriptional alterations taking place throughout the life cycle, characterize the parasite's gene silencing machinery, and explore molecules involved in development, reproduction, host-parasite interactions, immunity, and disease. The secretome of H. contortus is particularly rich in peptidases linked to blood-feeding activity and interactions with host tissues, and a diverse array of molecules is involved in complex immune responses. On an applied level, we predict drug targets and identify vaccine molecules. Conclusions: The draft genome and developmental transcriptome of H. contortus provide a major resource to the scientific community for a wide range of genomic, genetic, proteomic, metabolomic, evolutionary, biological, ecological, and epidemiological investigations, and a solid foundation for biotechnological outcomes, including new anthelmintics, vaccines and diagnostic tests. This first draft genome of any strongylid nematode paves the way for a rapid acceleration in our understanding of a wide range of socioeconomically important parasites of one of the largest nematode orders. PMID:23985341

  20. Complete Genome and Transcriptomes of Streptococcus parasanguinis FW213: Phylogenic Relations and Potential Virulence Mechanisms

    PubMed Central

    Geng, Jianing; Chiu, Cheng-Hsun; Tang, Petrus; Chen, Yaping; Shieh, Hui-Ru; Hu, Songnian; Chen, Yi-Ywan M.

    2012-01-01

    Streptococcus parasanguinis, a primary colonizer of the tooth surface, is also an opportunistic pathogen for subacute endocarditis. The complete genome of strain FW213 was determined using the traditional shotgun sequencing approach and further refined by the transcriptomes of cells in early exponential and early stationary growth phases in this study. The transcriptomes also discovered 10 transcripts encoding known hypothetical proteins, one pseudogene, five transcripts matched to Rfam and an additional 87 putative small RNAs within the intergenic regions defined by the GLIMMER analysis. The genome contains five acquired genomic islands (GIs) encoding proteins which potentially contribute to the overall pathogenic capacity and fitness of this microbe. The differential expression of the GIs and various open reading frames outside the GIs at the two growth phases suggested that FW213 possesses a range of mechanisms to avoid host immune clearance, to colonize host tissues, to survive within oral biofilms and to overcome various environmental insults. Furthermore, the comparative genome analysis of five S. parasanguinis strains indicates that although S. parasanguinis strains are highly conserved, variations in the genome content exist. These variations may reflect differences in pathogenic potential between the strains. PMID:22529932

  1. Insights into the Maize Pan-Genome and Pan-Transcriptome

    PubMed Central

    Hirsch, Candice N.; Foerster, Jillian M.; Johnson, James M.; Sekhon, Rajandeep S.; Muttoni, German; Vaillancourt, Brieanne; Peñagaricano, Francisco; Lindquist, Erika; Pedraza, Mary Ann; Barry, Kerrie; de Leon, Natalia; Kaeppler, Shawn M.; Buell, C. Robin

    2014-01-01

    Genomes at the species level are dynamic, with genes present in every individual (core) and genes in a subset of individuals (dispensable) that collectively constitute the pan-genome. Using transcriptome sequencing of seedling RNA from 503 maize (Zea mays) inbred lines to characterize the maize pan-genome, we identified 8681 representative transcript assemblies (RTAs) with 16.4% expressed in all lines and 82.7% expressed in subsets of the lines. Interestingly, with linkage disequilibrium mapping, 76.7% of the RTAs with at least one single nucleotide polymorphism (SNP) could be mapped to a single genetic position, distributed primarily throughout the nonpericentromeric portion of the genome. Stepwise iterative clustering of RTAs suggests, within the context of the genotypes used in this study, that the maize genome is restricted and further sampling of seedling RNA within this germplasm base will result in minimal discovery. Genome-wide association studies based on SNPs and transcript abundance in the pan-genome revealed loci associated with the timing of the juvenile-to-adult vegetative and vegetative-to-reproductive developmental transitions, two traits important for fitness and adaptation. This study revealed the dynamic nature of the maize pan-genome and demonstrated that a substantial portion of variation may lie outside the single reference genome for a species. PMID:24488960
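
    The core versus dispensable distinction described above amounts to asking, for each transcript, whether it is detected in every line or only in some. A minimal sketch of that bookkeeping follows. The function name, the input layout, and the toy line and transcript names are hypothetical, and the abstract does not state what expression threshold the authors used to call a transcript "expressed".

```python
def classify_transcripts(expressed_in, all_lines):
    """Split transcripts into core (seen in every line) and dispensable (seen in some).

    expressed_in: dict mapping transcript ID -> iterable of line names where it was detected.
    all_lines: iterable of every line name in the panel.
    """
    panel = set(all_lines)
    core, dispensable = [], []
    for transcript, lines in expressed_in.items():
        detected = set(lines) & panel
        if detected == panel:
            core.append(transcript)
        elif detected:
            dispensable.append(transcript)
        # transcripts detected in no line are left unclassified
    return sorted(core), sorted(dispensable)


# Toy panel of three inbred lines, invented for illustration.
panel = ["lineA", "lineB", "lineC"]
calls = {
    "rta1": ["lineA", "lineB", "lineC"],
    "rta2": ["lineB"],
    "rta3": [],
}
print(classify_transcripts(calls, panel))
```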

  2. Genome-Wide Transcriptome and Expression Profile Analysis of Phalaenopsis during Explant Browning

    PubMed Central

    Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei

    2015-01-01

    Background: Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at the genome-wide level. Methodology/Principal Findings: We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Conclusions/Significance: Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect secondary metabolism, such as the phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity have been found to be changed at the transcription level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further

  3. The genomic and transcriptomic landscape of a HeLa cell line.

    PubMed

    Landry, Jonathan J M; Pyl, Paul Theodor; Rausch, Tobias; Zichner, Thomas; Tekkedil, Manu M; Stütz, Adrian M; Jauch, Anna; Aiyar, Raeka S; Pau, Gregoire; Delhomme, Nicolas; Gagneur, Julien; Korbel, Jan O; Huber, Wolfgang; Steinmetz, Lars M

    2013-08-07

    HeLa is the most widely used model cell line for studying human cellular and molecular biology. To date, no genomic reference for this cell line has been released, and experiments have relied on the human reference genome. Effective design and interpretation of molecular genetic studies performed using HeLa cells require accurate genomic information. Here we present a detailed genomic and transcriptomic characterization of a HeLa cell line. We performed DNA and RNA sequencing of a HeLa Kyoto cell line and analyzed its mutational portfolio and gene expression profile. Segmentation of the genome according to copy number revealed a remarkably high level of aneuploidy and numerous large structural variants at unprecedented resolution. Some of the extensive genomic rearrangements are indicative of catastrophic chromosome shattering, known as chromothripsis. Our analysis of the HeLa gene expression profile revealed that several pathways, including cell cycle and DNA repair, exhibit significantly different expression patterns from those in normal human tissues. Our results provide the first detailed account of genomic variants in the HeLa genome, yielding insight into their impact on gene expression and cellular function as well as their origins. This study underscores the importance of accounting for the strikingly aberrant characteristics of HeLa cells when designing and interpreting experiments, and has implications for the use of HeLa as a model of human biology.

  4. Phenotypic, genomic, transcriptomic and proteomic changes in Bacillus cereus after a short-term space flight

    NASA Astrophysics Data System (ADS)

    Su, Longxiang; Zhou, Lisha; Liu, Jinwen; Cen, Zhong; Wu, Chunyan; Wang, Tong; Zhou, Tao; Chang, De; Guo, Yinghua; Fang, Xiangqun; Wang, Junfeng; Li, Tianzhi; Yin, Sanjun; Dai, Wenkui; Zhou, Yuping; Zhao, Jiao; Fang, Chengxiang; Yang, Ruifu; Liu, Changting

    2014-01-01

    The environment in space could affect microorganisms by changing a variety of features, including proliferation rate, cell physiology, cell metabolism, biofilm production, virulence, and drug resistance. However, the relevant mechanisms remain unclear. To explore the effect of a space environment on Bacillus cereus, a strain of B. cereus was sent to space for 398 h by ShenZhou VIII from November 1, 2011 to November 17, 2011. A ground simulation with similar temperature conditions was simultaneously performed as a control. After the flight, the flight and control strains were further analyzed using phenotypic, genomic, transcriptomic and proteomic techniques to explore the divergence of B. cereus in a space environment. The flight strains exhibited a significantly slower growth rate, a significantly higher amikacin resistance level, and changes in metabolism relative to the ground control strain. After the space flight, three polymorphic loci were found in the flight strains LCT-BC25 and LCT-BC235. A combined transcriptome and proteome analysis was performed, and this analysis revealed that the flight strains had changes in genes/proteins relevant to metabolism. In addition, certain genes/proteins that are relevant to structural function, gene expression modification and translation, and virulence were also altered. Our study represents the first documented analysis of the phenotypic, genomic, transcriptomic, and proteomic changes that occur in B. cereus during space flight, and our results could be beneficial to the field of space microbiology.

  5. Integrated genome and transcriptome sequencing identifies a novel form of hybrid and aggressive prostate cancer.

    PubMed

    Wu, Chunxiao; Wyatt, Alexander W; Lapuk, Anna V; McPherson, Andrew; McConeghy, Brian J; Bell, Robert H; Anderson, Shawn; Haegert, Anne; Brahmbhatt, Sonal; Shukin, Robert; Mo, Fan; Li, Estelle; Fazli, Ladan; Hurtado-Coll, Antonio; Jones, Edward C; Butterfield, Yaron S; Hach, Faraz; Hormozdiari, Fereydoun; Hajirasouliha, Iman; Boutros, Paul C; Bristow, Robert G; Jones, Steven Jm; Hirst, Martin; Marra, Marco A; Maher, Christopher A; Chinnaiyan, Arul M; Sahinalp, S Cenk; Gleave, Martin E; Volik, Stanislav V; Collins, Colin C

    2012-05-01

    Next-generation sequencing is making sequence-based molecular pathology and personalized oncology viable. We selected an individual initially diagnosed with conventional but aggressive prostate adenocarcinoma and sequenced the genome and transcriptome from primary and metastatic tissues collected prior to hormone therapy. The histology-pathology and copy number profiles were remarkably homogeneous, yet it was possible to propose the quadrant of the prostate tumour that likely seeded the metastatic diaspora. Despite a homogeneous cell type, our transcriptome analysis revealed signatures of both luminal and neuroendocrine cell types. Remarkably, the repertoire of expressed but apparently private gene fusions, including C15orf21:MYC, recapitulated this biology. We hypothesize that the amplification and over-expression of the stem cell gene MSI2 may have contributed to the stable hybrid cellular identity. This hybrid luminal-neuroendocrine tumour appears to represent a novel and highly aggressive case of prostate cancer with unique biological features and, conceivably, a propensity for rapid progression to castrate-resistance. Overall, this work highlights the importance of integrated analyses of genome, exome and transcriptome sequences for basic tumour biology, sequence-based molecular pathology and personalized oncology.

  6. Integrated genome and transcriptome sequencing identifies a novel form of hybrid and aggressive prostate cancer

    PubMed Central

    Wu, Chunxiao; Wyatt, Alexander W; Lapuk, Anna V; McPherson, Andrew; McConeghy, Brian J; Bell, Robert H; Anderson, Shawn; Haegert, Anne; Brahmbhatt, Sonal; Shukin, Robert; Mo, Fan; Li, Estelle; Fazli, Ladan; Hurtado-Coll, Antonio; Jones, Edward C; Butterfield, Yaron S; Hach, Faraz; Hormozdiari, Fereydoun; Hajirasouliha, Iman; Boutros, Paul C; Bristow, Robert G; Jones, Steven JM; Hirst, Martin; Marra, Marco A; Maher, Christopher A; Chinnaiyan, Arul M; Sahinalp, S Cenk; Gleave, Martin E; Volik, Stanislav V; Collins, Colin C

    2013-01-01

    Next-generation sequencing is making sequence-based molecular pathology and personalized oncology viable. We selected an individual initially diagnosed with conventional but aggressive prostate adenocarcinoma and sequenced the genome and transcriptome from primary and metastatic tissues collected prior to hormone therapy. The histology-pathology and copy number profiles were remarkably homogeneous, yet it was possible to propose the quadrant of the prostate tumour that likely seeded the metastatic diaspora. Despite a homogeneous cell type, our transcriptome analysis revealed signatures of both luminal and neuroendocrine cell types. Remarkably, the repertoire of expressed but apparently private gene fusions, including C15orf21:MYC, recapitulated this biology. We hypothesize that the amplification and over-expression of the stem cell gene MSI2 may have contributed to the stable hybrid cellular identity. This hybrid luminal-neuroendocrine tumour appears to represent a novel and highly aggressive case of prostate cancer with unique biological features and, conceivably, a propensity for rapid progression to castrate-resistance. Overall, this work highlights the importance of integrated analyses of genome, exome and transcriptome sequences for basic tumour biology, sequence-based molecular pathology and personalized oncology. PMID:22294438

  7. Role of genomics and transcriptomics in selection of reintroduction source populations.

    PubMed

    He, Xiaoping; Johansson, Mattias L; Heath, Daniel D

    2016-10-01

    The use and importance of reintroduction as a conservation tool to return a species to its historical range from which it has been extirpated will increase as climate change and human development accelerate habitat loss and population extinctions. Although the number of reintroduction attempts has increased rapidly over the past 2 decades, the success rate is generally low. As a result of population differences in fitness-related traits and divergent responses to environmental stresses, population performance upon reintroduction is highly variable, and it is generally agreed that selecting an appropriate source population is a critical component of a successful reintroduction. Conservation genomics is an emerging field that addresses long-standing challenges in conservation, and the potential for using novel molecular genetic approaches to inform and improve conservation efforts is high. Because the successful establishment and persistence of reintroduced populations is highly dependent on the functional genetic variation and environmental stress tolerance of the source population, we propose the application of conservation genomics and transcriptomics to guide reintroduction practices. Specifically, we propose using genome-wide functional loci to estimate genetic variation of source populations. This estimate can then be used to predict the potential for adaptation. We also propose using transcriptional profiling to measure the expression response of fitness-related genes to environmental stresses as a proxy for acclimation (tolerance) capacity. Appropriate application of conservation genomics and transcriptomics has the potential to dramatically enhance reintroduction success in a time of rapidly declining biodiversity and accelerating environmental change.

  8. Genome and transcriptome of the regeneration-competent flatworm, Macrostomum lignano

    PubMed Central

    Wasik, Kaja; Gurtowski, James; Zhou, Xin; Ramos, Olivia Mendivil; Delás, M. Joaquina; Battistoni, Giorgia; El Demerdash, Osama; Falciatori, Ilaria; Vizoso, Dita B.; Smith, Andrew D.; Ladurner, Peter; Schärer, Lukas; McCombie, W. Richard; Hannon, Gregory J.; Schatz, Michael

    2015-01-01

    The free-living flatworm, Macrostomum lignano has an impressive regenerative capacity. Following injury, it can regenerate almost an entirely new organism because of the presence of an abundant somatic stem cell population, the neoblasts. This set of unique properties makes many flatworms attractive organisms for studying the evolution of pathways involved in tissue self-renewal, cell-fate specification, and regeneration. The use of these organisms as models, however, is hampered by the lack of well-assembled and annotated genome sequences, fundamental to modern genetic and molecular studies. Here we report the genomic sequence of M. lignano and an accompanying characterization of its transcriptome. The genome structure of M. lignano is remarkably complex, with ∼75% of its sequence being comprised of simple repeats and transposon sequences. This has made high-quality assembly from Illumina reads alone impossible (N50 = 222 bp). We therefore generated 130× coverage by long sequencing reads from the Pacific Biosciences platform to create a substantially improved assembly with an N50 of 64 Kbp. We complemented the reference genome with an assembled and annotated transcriptome, and used both of these datasets in combination to probe gene-expression patterns during regeneration, examining pathways important to stem cell function. PMID:26392545
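
    The N50 values quoted above summarize assembly contiguity: N50 is the length of the contig at which, going from longest to shortest, the running total first reaches half of the total assembled length. A minimal sketch of the calculation is given below. The function name and the toy contig lengths are invented for illustration, and this is the commonly used definition rather than anything specific to the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (in bp), chosen only to keep the arithmetic easy to follow.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
```

    With the toy lengths the total is 20, so the threshold is 10; the running total reaches 11 at the second-longest contig, giving an N50 of 5.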

  9. Microbial genomics, transcriptomics and proteomics: new discoveries in decomposition research using complementary methods.

    PubMed

    Baldrian, Petr; López-Mondéjar, Rubén

    2014-02-01

    Molecular methods for the analysis of biomolecules have undergone rapid technological development in the last decade. The advent of next-generation sequencing methods and improvements in instrumental resolution enabled the analysis of complex transcriptome, proteome and metabolome data, as well as a detailed annotation of microbial genomes. The mechanisms of decomposition by model fungi have been described in unprecedented detail by the combination of genome sequencing, transcriptomics and proteomics. The increasing number of available genomes for fungi and bacteria shows that the genetic potential for decomposition of organic matter is widespread among taxonomically diverse microbial taxa, while expression studies document the importance of the regulation of expression in decomposition efficiency. Importantly, high-throughput methods of nucleic acid analysis used for the analysis of metagenomes and metatranscriptomes indicate the high diversity of decomposer communities in natural habitats and their taxonomic composition. Today, the metaproteomics of natural habitats is of interest. In combination with advanced analytical techniques to explore the products of decomposition and the accumulation of information on the genomes of environmentally relevant microorganisms, advanced methods in microbial ecophysiology should increase our understanding of the complex processes of organic matter transformation.

  10. RSIADB, a collective resource for genome and transcriptome analyses in Rhizoctonia solani AG1 IA.

    PubMed

    Chen, Lei; Ai, Peng; Zhang, Jinfeng; Deng, Qiming; Wang, Shiquan; Li, Shuangcheng; Zhu, Jun; Li, Ping; Zheng, Aiping

    2016-01-01

    Rice [Oryza sativa (L.)] feeds more than half of the world's population. Rhizoctonia solani is a major fungal pathogen of rice causing extreme crop losses in all rice-growing regions of the world. R. solani AG1 IA is a major cause of sheath blight in rice. In this study, we constructed a comprehensive and user-friendly web-based database, RSIADB, to analyse its draft genome and transcriptome. The database was built using the genome sequence (10,489 genes) and annotation information for R. solani AG1 IA. A total of six RNAseq samples of R. solani AG1 IA were also analysed, corresponding to 10, 18, 24, 32, 48 and 72 h after infection of rice leaves. The RSIADB database enables users to search, browse, and download gene sequences for R. solani AG1 IA, and mine the data using BLAST, Sequence Extractor, Browse and Construction Diagram tools that were integrated into the database. RSIADB is an important genomic resource for scientists working with R. solani AG1 IA and will assist researchers in analysing the annotated genome and transcriptome of this pathogen. This resource will facilitate studies on gene function, pathogenesis factors and secreted proteins, as well as provide an avenue for comparative analyses of genes expressed during different stages of infection. Database URL: http://genedenovoweb.ticp.net:81/rsia/index.php.

  11. The plover neurotranscriptome assembly: transcriptomic analysis in an ecological model species without a reference genome.

    PubMed

    Moghadam, Hooman K; Harrison, Peter W; Zachar, Gergely; Székely, Tamás; Mank, Judith E

    2013-07-01

    We assembled a de novo transcriptome of short-read Illumina RNA-Seq data generated from telencephalon and diencephalon tissue samples from the Kentish plover, Charadrius alexandrinus. This is a species of considerable interest in behavioural ecology for its highly variable mating system and parental behaviour, but it lacks genomic resources and is evolutionarily distant from the few available avian draft genome sequences. We assembled and identified over 21,000 transcript contigs with significant expression in our samples, showing high homology to exonic sequences in avian draft genomes. From these, we identified >31,000 high-quality SNPs and >2500 simple sequence repeats (SSRs). We also analysed expression patterns in our data to identify potential candidate genes related to differences in male and female behaviour, identifying over 200 nonoverlapping putative autosomal transcripts that show significant expression differences between males and females. Gene ontology analysis revealed that female-biased transcripts were significantly enriched for cerebral functions related to learning, cognition and memory, and male-biased transcripts were mostly enriched for terms related to neural function such as neuron projection and synapses. This data set provides one of the first de novo transcriptome assemblies from non-normalized short-read next-generation data and outlines an effective strategy for measuring sequence and expression variability simultaneously without the aid of a reference genome.

  12. Role of genomics and transcriptomics in selection of reintroduction source populations.

    PubMed

    He, Xiaoping; Johansson, Mattias L; Heath, Daniel D

    2016-10-01

    The use and importance of reintroduction as a conservation tool to return a species to its historical range from which it has been extirpated will increase as climate change and human development accelerate habitat loss and population extinctions. Although the number of reintroduction attempts has increased rapidly over the past 2 decades, the success rate is generally low. As a result of population differences in fitness-related traits and divergent responses to environmental stresses, population performance upon reintroduction is highly variable, and it is generally agreed that selecting an appropriate source population is a critical component of a successful reintroduction. Conservation genomics is an emerging field that addresses long-standing challenges in conservation, and the potential for using novel molecular genetic approaches to inform and improve conservation efforts is high. Because the successful establishment and persistence of reintroduced populations is highly dependent on the functional genetic variation and environmental stress tolerance of the source population, we propose the application of conservation genomics and transcriptomics to guide reintroduction practices. Specifically, we propose using genome-wide functional loci to estimate genetic variation of source populations. This estimate can then be used to predict the potential for adaptation. We also propose using transcriptional profiling to measure the expression response of fitness-related genes to environmental stresses as a proxy for acclimation (tolerance) capacity. Appropriate application of conservation genomics and transcriptomics has the potential to dramatically enhance reintroduction success in a time of rapidly declining biodiversity and accelerating environmental change. PMID:26756292

  13. Hepatocellular carcinoma cell lines retain the genomic and transcriptomic landscapes of primary human cancers

    PubMed Central

    Qiu, Zhixin; Zou, Keke; Zhuang, Liping; Qin, Jianjie; Li, Hong; Li, Chao; Zhang, Zhengtao; Chen, Xiaotao; Cen, Jin; Meng, Zhiqiang; Zhang, Haibin; Li, Yixue; Hui, Lijian

    2016-01-01

    Hepatocellular carcinoma (HCC) cell lines are useful in vitro models for the study of primary HCCs. Because cell lines acquire additional mutations in culture, it is important to understand to what extent HCC cell lines retain the genetic landscapes of primary HCCs. Most HCC cell lines were established during the last century, precluding comparison between cell lines and primary cancers. In this study, 9 Chinese HCC cell lines with matched patient-derived cells at low passages (PDCs) were established in the defined culture condition. Whole genome analyses of 4 HCC cell lines showed that genomic mutation landscapes, including mutations, copy number alterations (CNAs) and HBV integrations, were highly stable during cell line establishment. Importantly, genetic alterations in cancer drivers and druggable genes were preserved in cell lines. HCC cell lines also retained gene expression patterns of primary HCCs during in vitro culture. Finally, sequential analysis of HCC cell lines and PDCs at different passages revealed their comparable and stable genomic and transcriptomic levels if maintained within proper passages. These results show that HCC cell lines largely retain the genomic and transcriptomic landscapes of primary HCCs, thus laying the rationale for testing HCC cell lines as preclinical models in precision medicine. PMID:27273737

  14. Hepatocellular carcinoma cell lines retain the genomic and transcriptomic landscapes of primary human cancers.

    PubMed

    Qiu, Zhixin; Zou, Keke; Zhuang, Liping; Qin, Jianjie; Li, Hong; Li, Chao; Zhang, Zhengtao; Chen, Xiaotao; Cen, Jin; Meng, Zhiqiang; Zhang, Haibin; Li, Yixue; Hui, Lijian

    2016-06-07

    Hepatocellular carcinoma (HCC) cell lines are useful in vitro models for the study of primary HCCs. Because cell lines acquire additional mutations in culture, it is important to understand to what extent HCC cell lines retain the genetic landscapes of primary HCCs. Most HCC cell lines were established during the last century, precluding comparison between cell lines and primary cancers. In this study, 9 Chinese HCC cell lines with matched patient-derived cells at low passages (PDCs) were established in the defined culture condition. Whole genome analyses of 4 HCC cell lines showed that genomic mutation landscapes, including mutations, copy number alterations (CNAs) and HBV integrations, were highly stable during cell line establishment. Importantly, genetic alterations in cancer drivers and druggable genes were preserved in cell lines. HCC cell lines also retained gene expression patterns of primary HCCs during in vitro culture. Finally, sequential analysis of HCC cell lines and PDCs at different passages revealed their comparable and stable genomic and transcriptomic levels if maintained within proper passages. These results show that HCC cell lines largely retain the genomic and transcriptomic landscapes of primary HCCs, thus laying the rationale for testing HCC cell lines as preclinical models in precision medicine.

  15. Genome and transcriptome of the regeneration-competent flatworm, Macrostomum lignano.

    PubMed

    Wasik, Kaja; Gurtowski, James; Zhou, Xin; Ramos, Olivia Mendivil; Delás, M Joaquina; Battistoni, Giorgia; El Demerdash, Osama; Falciatori, Ilaria; Vizoso, Dita B; Smith, Andrew D; Ladurner, Peter; Schärer, Lukas; McCombie, W Richard; Hannon, Gregory J; Schatz, Michael

    2015-10-01

    The free-living flatworm, Macrostomum lignano has an impressive regenerative capacity. Following injury, it can regenerate almost an entirely new organism because of the presence of an abundant somatic stem cell population, the neoblasts. This set of unique properties makes many flatworms attractive organisms for studying the evolution of pathways involved in tissue self-renewal, cell-fate specification, and regeneration. The use of these organisms as models, however, is hampered by the lack of well-assembled and annotated genome sequences, fundamental to modern genetic and molecular studies. Here we report the genomic sequence of M. lignano and an accompanying characterization of its transcriptome. The genome structure of M. lignano is remarkably complex, with ∼75% of its sequence being comprised of simple repeats and transposon sequences. This has made high-quality assembly from Illumina reads alone impossible (N50 = 222 bp). We therefore generated 130× coverage by long sequencing reads from the Pacific Biosciences platform to create a substantially improved assembly with an N50 of 64 Kbp. We complemented the reference genome with an assembled and annotated transcriptome, and used both of these datasets in combination to probe gene-expression patterns during regeneration, examining pathways important to stem cell function.

  16. RSIADB, a collective resource for genome and transcriptome analyses in Rhizoctonia solani AG1 IA

    PubMed Central

    Ai, Peng; Zhang, Jinfeng; Deng, Qiming; Wang, Shiquan; Li, Shuangcheng; Zhu, Jun; Li, Ping; Zheng, Aiping

    2016-01-01

    Rice [Oryza sativa (L.)] feeds more than half of the world’s population. Rhizoctonia solani is a major fungal pathogen of rice causing extreme crop losses in all rice-growing regions of the world. R. solani AG1 IA is a major cause of sheath blight in rice. In this study, we constructed a comprehensive and user-friendly web-based database, RSIADB, to analyse its draft genome and transcriptome. The database was built using the genome sequence (10 489 genes) and annotation information for R. solani AG1 IA. A total of six RNAseq samples of R. solani AG1 IA were also analysed, corresponding to 10, 18, 24, 32, 48 and 72 h after infection of rice leaves. The RSIADB database enables users to search, browse, and download gene sequences for R. solani AG1 IA, and mine the data using BLAST, Sequence Extractor, Browse and Construction Diagram tools that were integrated into the database. RSIADB is an important genomic resource for scientists working with R. solani AG1 IA and will assist researchers in analysing the annotated genome and transcriptome of this pathogen. This resource will facilitate studies on gene function, pathogenesis factors and secreted proteins, as well as provide an avenue for comparative analyses of genes expressed during different stages of infection. Database URL: http://genedenovoweb.ticp.net:81/rsia/index.php PMID:27022158

  17. RSIADB, a collective resource for genome and transcriptome analyses in Rhizoctonia solani AG1 IA.

    PubMed

    Chen, Lei; Ai, Peng; Zhang, Jinfeng; Deng, Qiming; Wang, Shiquan; Li, Shuangcheng; Zhu, Jun; Li, Ping; Zheng, Aiping

    2016-01-01

    Rice [Oryza sativa (L.)] feeds more than half of the world's population. Rhizoctonia solani is a major fungal pathogen of rice causing extreme crop losses in all rice-growing regions of the world. R. solani AG1 IA is a major cause of sheath blight in rice. In this study, we constructed a comprehensive and user-friendly web-based database, RSIADB, to analyse its draft genome and transcriptome. The database was built using the genome sequence (10,489 genes) and annotation information for R. solani AG1 IA. A total of six RNAseq samples of R. solani AG1 IA were also analysed, corresponding to 10, 18, 24, 32, 48 and 72 h after infection of rice leaves. The RSIADB database enables users to search, browse, and download gene sequences for R. solani AG1 IA, and mine the data using BLAST, Sequence Extractor, Browse and Construction Diagram tools that were integrated into the database. RSIADB is an important genomic resource for scientists working with R. solani AG1 IA and will assist researchers in analysing the annotated genome and transcriptome of this pathogen. This resource will facilitate studies on gene function, pathogenesis factors and secreted proteins, as well as provide an avenue for comparative analyses of genes expressed during different stages of infection. Database URL: http://genedenovoweb.ticp.net:81/rsia/index.php. PMID:27022158

  18. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution

    PubMed Central

    Clarke, Thomas H.; Garb, Jessica E.; Hayashi, Cheryl Y.; Arensburger, Peter; Ayoub, Nadia A.

    2015-01-01

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). PMID:26058392

  19. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

    PubMed

    Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

    2015-06-08

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae).

  20. Defining the transcriptome assembly and its use for genome dynamics and transcriptome profiling studies in pigeonpea (Cajanus cajan L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This study reports the generation of large-scale genomic resources for pigeonpea, a so-called ‘orphan crop species’ of the semi-arid tropic regions. Roche FLX/454 sequencing carried out on a normalized cDNA pool prepared from 31 tissues produced 494,353 short transcript reads (STRs). Cluster analysi...

  1. The Genome and Development-Dependent Transcriptomes of Pyronema confluens: A Window into Fungal Evolution

    PubMed Central

    Traeger, Stefanie; Altegoer, Florian; Freitag, Michael; Gabaldon, Toni; Kempken, Frank; Kumar, Abhishek; Marcet-Houben, Marina; Pöggeler, Stefanie; Stajich, Jason E.; Nowrousian, Minou

    2013-01-01

    Fungi are a large group of eukaryotes found in nearly all ecosystems. More than 250 fungal genomes have already been sequenced, greatly improving our understanding of fungal evolution, physiology, and development. However, for the Pezizomycetes, an early-diverging lineage of filamentous ascomycetes, there is so far only one genome available, namely that of the black truffle, Tuber melanosporum, a mycorrhizal species with unusual subterranean fruiting bodies. To help close the sequence gap among basal filamentous ascomycetes, and to allow conclusions about the evolution of fungal development, we sequenced the genome and assayed transcriptomes during development of Pyronema confluens, a saprobic Pezizomycete with a typical apothecium as fruiting body. With a size of 50 Mb and ∼13,400 protein-coding genes, the genome is more characteristic of higher filamentous ascomycetes than the large, repeat-rich truffle genome; however, some typical features are different in the P. confluens lineage, e.g. the genomic environment of the mating type genes that is conserved in higher filamentous ascomycetes, but only partly conserved in P. confluens. On the other hand, P. confluens has a full complement of fungal photoreceptors, and expression studies indicate that light perception might be similar to distantly related ascomycetes and, thus, represent a basic feature of filamentous ascomycetes. Analysis of spliced RNA-seq sequence reads allowed the detection of natural antisense transcripts for 281 genes. The P. confluens genome contains an unusually high number of predicted orphan genes, many of which are upregulated during sexual development, consistent with the idea of rapid evolution of sex-associated genes. Comparative transcriptomics identified the transcription factor gene pro44 that is upregulated during development in P. confluens and the Sordariomycete Sordaria macrospora. The P. confluens pro44 gene (PCON_06721) was used to complement the S. macrospora pro44 deletion

  2. Genome scale metabolic modeling of the riboflavin overproducer Ashbya gossypii.

    PubMed

    Ledesma-Amaro, Rodrigo; Kerkhoven, Eduard J; Revuelta, José Luis; Nielsen, Jens

    2014-06-01

    Ashbya gossypii is a filamentous fungus that naturally overproduces riboflavin, or vitamin B2. Advances in genetic and metabolic engineering of A. gossypii have permitted the switch from industrial chemical synthesis to the current biotechnological production of this vitamin. Additionally, A. gossypii is a model organism with one of the smallest eukaryote genomes being phylogenetically close to Saccharomyces cerevisiae. It has therefore been used to study evolutionary aspects of bakers' yeast. We here reconstructed the first genome scale metabolic model of A. gossypii, iRL766. The model was validated by biomass growth, riboflavin production and substrate utilization predictions. Gene essentiality analysis of the A. gossypii model in comparison with the S. cerevisiae model demonstrated how the whole-genome duplication event that separates the two species has led to an even spread of paralogs among all metabolic pathways. Additionally, iRL766 was used to integrate transcriptomics data from two different growth stages of A. gossypii, comparing exponential growth to riboflavin production stages. Both reporter metabolite analysis and in silico identification of transcriptionally regulated enzymes demonstrated the important involvement of beta-oxidation and the glyoxylate cycle in riboflavin production. PMID:24374726

  3. Human genomics. Effect of predicted protein-truncating genetic variants on the human transcriptome.

    PubMed

    Rivas, Manuel A; Pirinen, Matti; Conrad, Donald F; Lek, Monkol; Tsang, Emily K; Karczewski, Konrad J; Maller, Julian B; Kukurba, Kimberly R; DeLuca, David S; Fromer, Menachem; Ferreira, Pedro G; Smith, Kevin S; Zhang, Rui; Zhao, Fengmei; Banks, Eric; Poplin, Ryan; Ruderfer, Douglas M; Purcell, Shaun M; Tukiainen, Taru; Minikel, Eric V; Stenson, Peter D; Cooper, David N; Huang, Katharine H; Sullivan, Timothy J; Nedzel, Jared; Bustamante, Carlos D; Li, Jin Billy; Daly, Mark J; Guigo, Roderic; Donnelly, Peter; Ardlie, Kristin; Sammeth, Michael; Dermitzakis, Emmanouil T; McCarthy, Mark I; Montgomery, Stephen B; Lappalainen, Tuuli; MacArthur, Daniel G

    2015-05-01

    Accurate prediction of the functional effect of genetic variation is critical for clinical genome interpretation. We systematically characterized the transcriptome effects of protein-truncating variants, a class of variants expected to have profound effects on gene function, using data from the Genotype-Tissue Expression (GTEx) and Geuvadis projects. We quantitated tissue-specific and positional effects on nonsense-mediated transcript decay and present an improved predictive model for this decay. We directly measured the effect of variants both proximal and distal to splice junctions. Furthermore, we found that robustness to heterozygous gene inactivation is not due to dosage compensation. Our results illustrate the value of transcriptome data in the functional interpretation of genetic variants. PMID:25954003

  4. Whole-genome resequencing and transcriptomic analysis to identify genes involved in leaf-color diversity in ornamental rice plants.

    PubMed

    Kim, Chang-Kug; Seol, Young-Joo; Shin, Younhee; Lim, Hye-Min; Lee, Gang-Seob; Kim, A-Ram; Lee, Tae-Ho; Lee, Jae-Hee; Park, Dong-Suk; Yoo, Seungil; Kim, Yong-Hwan; Kim, Yong-Kab

    2015-01-01

    Rice field art is a large-scale art form in which people design rice fields using various kinds of ornamental rice plants with different leaf colors. Leaf color-related genes play an important role in the study of chlorophyll biosynthesis, chloroplast structure and function, and anthocyanin biosynthesis. Despite the role of different metabolites in the traditional relationship between leaf and color, comprehensive color-specific metabolite studies of ornamental rice have been limited. We performed whole-genome resequencing and transcriptomic analysis of regulatory patterns and genetic diversity among different rice cultivars to discover new genetic mechanisms that promote enhanced levels of various leaf colors. We resequenced the genomes of 10 rice leaf-color accessions to an average of 40× read depth and >95% coverage and performed 30 RNA-seq experiments using the 10 rice accessions sampled at three developmental stages. The sequencing results yielded a total of 1,814 × 10^6 reads and identified an average of 713,114 SNPs per rice accession. Based on our analysis of the DNA variation and gene expression, we selected 47 candidate genes. We used an integrated analysis of the whole-genome resequencing data and the RNA-seq data to divide the candidate genes into two groups: genes related to macronutrient (i.e., magnesium and sulfur) transport and genes related to flavonoid pathways, including anthocyanidin biosynthesis. We verified the candidate genes with quantitative real-time PCR using transgenic T-DNA insertion mutants. Our study demonstrates the potential of integrated screening methods combined with genetic-variation and transcriptomic data to isolate genes involved in complex biosynthetic networks and pathways.

  5. Whole-Genome Resequencing and Transcriptomic Analysis to Identify Genes Involved in Leaf-Color Diversity in Ornamental Rice Plants

    PubMed Central

    Shin, Younhee; Lim, Hye-Min; Lee, Gang-Seob; Kim, A-Ram; Lee, Tae-Ho; Lee, Jae-Hee; Park, Dong-Suk; Yoo, Seungil; Kim, Yong-Hwan; Kim, Yong-Kab

    2015-01-01

    Rice field art is a large-scale art form in which people design rice fields using various kinds of ornamental rice plants with different leaf colors. Leaf color-related genes play an important role in the study of chlorophyll biosynthesis, chloroplast structure and function, and anthocyanin biosynthesis. Despite the role of different metabolites in the traditional relationship between leaf and color, comprehensive color-specific metabolite studies of ornamental rice have been limited. We performed whole-genome resequencing and transcriptomic analysis of regulatory patterns and genetic diversity among different rice cultivars to discover new genetic mechanisms that promote enhanced levels of various leaf colors. We resequenced the genomes of 10 rice leaf-color accessions to an average of 40× read depth and >95% coverage and performed 30 RNA-seq experiments using the 10 rice accessions sampled at three developmental stages. The sequencing results yielded a total of 1,814 × 10^6 reads and identified an average of 713,114 SNPs per rice accession. Based on our analysis of the DNA variation and gene expression, we selected 47 candidate genes. We used an integrated analysis of the whole-genome resequencing data and the RNA-seq data to divide the candidate genes into two groups: genes related to macronutrient (i.e., magnesium and sulfur) transport and genes related to flavonoid pathways, including anthocyanidin biosynthesis. We verified the candidate genes with quantitative real-time PCR using transgenic T-DNA insertion mutants. Our study demonstrates the potential of integrated screening methods combined with genetic-variation and transcriptomic data to isolate genes involved in complex biosynthetic networks and pathways. PMID:25897514

  6. High density linkage mapping of genomic and transcriptomic SNPs for synteny analysis and anchoring the genome sequence of chickpea

    PubMed Central

    Gaur, Rashmi; Jeena, Ganga; Shah, Niraj; Gupta, Shefali; Pradhan, Seema; Tyagi, Akhilesh K; Jain, Mukesh; Chattopadhyay, Debasis; Bhatia, Sabhyata

    2015-01-01

    This study presents genome-wide discovery of SNPs through next generation sequencing of the genome of Cicer reticulatum. Mapping of the C. reticulatum sequenced reads onto the draft genome assembly of C. arietinum (desi chickpea) resulted in identification of 842,104 genomic SNPs, which were utilized along with an additional 36,446 genic SNPs identified from transcriptome sequences of the aforementioned varieties. Two new chickpea Oligo Pool All (OPAs), each having 3,072 SNPs, were designed and utilized for SNP genotyping of 129 Recombinant Inbred Lines (RILs). Using Illumina GoldenGate Technology, genotyping data for 5,041 SNPs were generated and combined with the 1,673 marker data from previously published studies to generate a high resolution linkage map. The map comprised 6,698 markers distributed on eight linkage groups spanning 1083.93 cM with an average inter-marker distance of 0.16 cM. Utility of the present map was demonstrated for improving the anchoring of the earlier reported draft genome sequence of desi chickpea by ~30% and that of kabuli chickpea by 18%. The genetic map reported in this study represents the most dense linkage map of chickpea, with the potential to facilitate efficient anchoring of the draft genome sequences of desi as well as kabuli chickpea varieties. PMID:26303721
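
    The reported average inter-marker distance can be checked from the other figures in the abstract. The short sketch below divides the total map length by the marker count; the helper name `mean_marker_spacing` is illustrative, and the assumption that the authors averaged over markers (rather than over marker intervals) is ours, not stated in the source.

```python
# Figures quoted in the abstract.
map_length_cm = 1083.93   # total length of the linkage map, in cM
marker_count = 6698       # markers placed on the map
linkage_groups = 8        # linkage groups reported


def mean_marker_spacing(total_cm, n_markers):
    """Average distance per marker: total map length / number of markers."""
    return total_cm / n_markers


# Assumption: spacing averaged over all markers.
spacing_per_marker = mean_marker_spacing(map_length_cm, marker_count)

# Alternative assumption: spacing averaged over intervals between adjacent
# markers, which number (markers - linkage groups) when every group is a chain.
spacing_per_interval = mean_marker_spacing(
    map_length_cm, marker_count - linkage_groups
)

print(f"{spacing_per_marker:.2f} cM per marker")
print(f"{spacing_per_interval:.2f} cM per interval")
```

    Under either assumption the value rounds to the 0.16 cM given in the abstract, so the choice of denominator does not affect the reported figure.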

  7. High density linkage mapping of genomic and transcriptomic SNPs for synteny analysis and anchoring the genome sequence of chickpea.

    PubMed

    Gaur, Rashmi; Jeena, Ganga; Shah, Niraj; Gupta, Shefali; Pradhan, Seema; Tyagi, Akhilesh K; Jain, Mukesh; Chattopadhyay, Debasis; Bhatia, Sabhyata

    2015-01-01

    This study presents genome-wide discovery of SNPs through next generation sequencing of the genome of Cicer reticulatum. Mapping of the C. reticulatum sequenced reads onto the draft genome assembly of C. arietinum (desi chickpea) resulted in identification of 842,104 genomic SNPs, which were utilized along with an additional 36,446 genic SNPs identified from transcriptome sequences of the aforementioned varieties. Two new chickpea Oligo Pool All (OPAs), each having 3,072 SNPs, were designed and utilized for SNP genotyping of 129 Recombinant Inbred Lines (RILs). Using Illumina GoldenGate Technology, genotyping data for 5,041 SNPs were generated and combined with the 1,673 marker data from previously published studies to generate a high resolution linkage map. The map comprised 6,698 markers distributed on eight linkage groups spanning 1083.93 cM with an average inter-marker distance of 0.16 cM. Utility of the present map was demonstrated for improving the anchoring of the earlier reported draft genome sequence of desi chickpea by ~30% and that of kabuli chickpea by 18%. The genetic map reported in this study represents the most dense linkage map of chickpea, with the potential to facilitate efficient anchoring of the draft genome sequences of desi as well as kabuli chickpea varieties.

  8. Transcriptome sequencing and microarray development for the Manila clam, Ruditapes philippinarum: genomic tools for environmental monitoring

    PubMed Central

    2011-01-01

    Background: The Manila clam, Ruditapes philippinarum, is one of the major aquaculture species in the world and a potential sentinel organism for monitoring the status of marine ecosystems. However, genomic resources for R. philippinarum are still extremely limited. Global analysis of gene expression profiles is increasingly used to evaluate the biological effects of various environmental stressors on aquatic animals under either artificial conditions or in the wild. Here, we report on the development of a transcriptomic platform for global gene expression profiling in the Manila clam. Results: A normalized cDNA library representing a mixture of adult tissues was sequenced using an ultra-high-throughput sequencing technology (Roche 454). A database consisting of 32,606 unique transcripts was constructed, 9,747 (30%) of which could be annotated by similarity. An oligo-DNA microarray platform was designed and applied to profile gene expression of digestive gland and gills. Functional annotation of genes differentially expressed between tissues was performed by enrichment analysis. Analysis of natural antisense transcript (NAT) expression was also performed, and bi-directional transcription appears to be a common phenomenon in the R. philippinarum transcriptome. A preliminary study on clam samples collected in a highly polluted area of the Venice Lagoon demonstrated the applicability of genomic tools to environmental monitoring. Conclusions: The transcriptomic platform developed for the Manila clam confirmed the high level of reproducibility of current microarray technology. Next-generation sequencing provided a good representation of the clam transcriptome. Despite the known limitations in transcript annotation and sequence coverage for non-model species, sufficient information was obtained to identify a large set of genes potentially involved in cellular response to environmental stress. PMID:21569398

  9. Databases and information integration for the Medicago truncatula genome and transcriptome.

    PubMed

    Cannon, Steven B; Crow, John A; Heuer, Michael L; Wang, Xiaohong; Cannon, Ethalinda K S; Dwan, Christopher; Lamblin, Anne-Francoise; Vasdewani, Jayprakash; Mudge, Joann; Cook, Andrew; Gish, John; Cheung, Foo; Kenton, Steve; Kunau, Timothy M; Brown, Douglas; May, Gregory D; Kim, Dongjin; Cook, Douglas R; Roe, Bruce A; Town, Chris D; Young, Nevin D; Retzel, Ernest F

    2005-05-01

    An international consortium is sequencing the euchromatic genespace of Medicago truncatula. Extensive bioinformatic and database resources support the marker-anchored bacterial artificial chromosome (BAC) sequencing strategy. Existing physical and genetic maps and deep BAC-end sequencing help to guide the sequencing effort, while EST databases provide essential resources for genome annotation as well as transcriptome characterization and microarray design. Finished BAC sequences are joined into overlapping sequence assemblies and undergo an automated annotation process that integrates ab initio predictions with EST, protein, and other recognizable features. Because of the sequencing project's international and collaborative nature, data production, storage, and visualization tools are broadly distributed. This paper describes databases and Web resources for the project, which provide support for physical and genetic maps, genome sequence assembly, gene prediction, and integration of EST data. A central project Web site at medicago.org/genome provides access to genome viewers and other resources project-wide, including an Ensembl implementation at medicago.org, physical map and marker resources at mtgenome.ucdavis.edu, and genome viewers at the University of Oklahoma (www.genome.ou.edu), the Institute for Genomic Research (www.tigr.org), and Munich Information for Protein Sequences Center (mips.gsf.de). PMID:15888676

  10. Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses.

    PubMed

    O'Connell, Richard J; Thon, Michael R; Hacquard, Stéphane; Amyotte, Stefan G; Kleemann, Jochen; Torres, Maria F; Damm, Ulrike; Buiate, Ester A; Epstein, Lynn; Alkan, Noam; Altmüller, Janine; Alvarado-Balderrama, Lucia; Bauser, Christopher A; Becker, Christian; Birren, Bruce W; Chen, Zehua; Choi, Jaeyoung; Crouch, Jo Anne; Duvick, Jonathan P; Farman, Mark A; Gan, Pamela; Heiman, David; Henrissat, Bernard; Howard, Richard J; Kabbage, Mehdi; Koch, Christian; Kracher, Barbara; Kubo, Yasuyuki; Law, Audrey D; Lebrun, Marc-Henri; Lee, Yong-Hwan; Miyara, Itay; Moore, Neil; Neumann, Ulla; Nordström, Karl; Panaccione, Daniel G; Panstruga, Ralph; Place, Michael; Proctor, Robert H; Prusky, Dov; Rech, Gabriel; Reinhardt, Richard; Rollins, Jeffrey A; Rounsley, Steve; Schardl, Christopher L; Schwartz, David C; Shenoy, Narmada; Shirasu, Ken; Sikhakolli, Usha R; Stüber, Kurt; Sukno, Serenella A; Sweigard, James A; Takano, Yoshitaka; Takahara, Hiroyuki; Trail, Frances; van der Does, H Charlotte; Voll, Lars M; Will, Isa; Young, Sarah; Zeng, Qiandong; Zhang, Jingze; Zhou, Shiguo; Dickman, Martin B; Schulze-Lefert, Paul; Ver Loren van Themaat, Emiel; Ma, Li-Jun; Vaillancourt, Lisa J

    2012-09-01

    Colletotrichum species are fungal pathogens that devastate crop plants worldwide. Host infection involves the differentiation of specialized cell types that are associated with penetration, growth inside living host cells (biotrophy) and tissue destruction (necrotrophy). We report here genome and transcriptome analyses of Colletotrichum higginsianum infecting Arabidopsis thaliana and Colletotrichum graminicola infecting maize. Comparative genomics showed that both fungi have large sets of pathogenicity-related genes, but families of genes encoding secreted effectors, pectin-degrading enzymes, secondary metabolism enzymes, transporters and peptidases are expanded in C. higginsianum. Genome-wide expression profiling revealed that these genes are transcribed in successive waves that are linked to pathogenic transitions: effectors and secondary metabolism enzymes are induced before penetration and during biotrophy, whereas most hydrolases and transporters are upregulated later, at the switch to necrotrophy. Our findings show that preinvasion perception of plant-derived signals substantially reprograms fungal gene expression and indicate previously unknown functions for particular fungal cell types.

  11. Transcriptome and metabolome of synthetic Solanum autotetraploids reveal key genomic stress events following polyploidization.

    PubMed

    Fasano, Carlo; Diretto, Gianfranco; Aversano, Riccardo; D'Agostino, Nunzio; Di Matteo, Antonio; Frusciante, Luigi; Giuliano, Giovanni; Carputo, Domenico

    2016-06-01

    Polyploids are generally classified as autopolyploids, derived from a single species, and allopolyploids, arising from interspecific hybridization. The former represent ideal materials with which to study the consequences of genome doubling and ascertain whether there are molecular and functional rules operating following polyploidization events. To investigate whether the effects of autopolyploidization are common to different species, or if species-specific or stochastic events are prevalent, we performed a comprehensive transcriptomic and metabolomic characterization of diploids and autotetraploids of Solanum commersonii and Solanum bulbocastanum. Autopolyploidization remodelled the transcriptome and the metabolome of both species. In S. commersonii, differentially expressed genes (DEGs) were highly enriched in pericentromeric regions. Most changes were stochastic, suggesting a strong genotypic response. However, a set of robustly regulated transcripts and metabolites was also detected, including purine bases and nucleosides, which are likely to underlie a common response to polyploidization. We hypothesize that autopolyploidization results in nucleotide pool imbalance, which in turn triggers a genomic shock responsible for the stochastic events observed. The more extensive genomic stress and the higher number of stochastic events observed in S. commersonii with respect to S. bulbocastanum could be the result of the higher nucleoside depletion observed in this species.

  12. Genome-Wide Transcriptome Analysis of Cadmium Stress in Rice

    PubMed Central

    Oono, Youko; Yazawa, Takayuki; Kanamori, Hiroyuki; Sasaki, Harumi; Mori, Satomi; Handa, Hirokazu; Matsumoto, Takashi

    2016-01-01

    Rice growth is severely affected by toxic concentrations of the nonessential heavy metal cadmium (Cd). To elucidate the molecular basis of the response to Cd stress, we performed mRNA sequencing of rice following our previous study on exposure to high concentrations of Cd (Oono et al., 2014). In this study, rice plants were hydroponically treated with low concentrations of Cd and approximately 211 million sequence reads were mapped onto the IRGSP-1.0 reference rice genome sequence. Many genes, including some identified under high Cd concentration exposure in our previous study, were found to be responsive to low Cd exposure, with an average of about 11,000 transcripts from each condition. However, genes expressed constitutively across the developmental course responded only slightly to low Cd concentrations, in contrast to their clear response to high Cd concentration, which causes fatal damage to rice seedlings according to phenotypic changes. The expression of metal ion transporter genes tended to correlate with Cd concentration, suggesting the potential of the RNA-Seq strategy to reveal novel Cd-responsive transporters by analyzing gene expression under different Cd concentrations. This study could help to develop novel strategies for improving tolerance to Cd exposure in rice and other cereal crops. PMID:27034955

  13. Integrated clinical, whole-genome, and transcriptome analysis of multisampled lethal metastatic prostate cancer

    PubMed Central

    Bova, G. Steven; Kallio, Heini M.L.; Annala, Matti; Kivinummi, Kati; Högnäs, Gunilla; Häyrynen, Sergei; Rantapero, Tommi; Kivinen, Virpi; Isaacs, William B.; Tolonen, Teemu; Nykter, Matti; Visakorpi, Tapio

    2016-01-01

    We report the first combined analysis of whole-genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole-genome and transcriptome sequence was obtained from nine anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 yr before death. Transcriptome analysis revealed increased expression of androgen receptor (AR)-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only one of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG that was present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today, given this knowledge, the use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations may be critical for effective actionability and merit further study. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials. PMID:27148588

  14. Integrated clinical, whole-genome, and transcriptome analysis of multisampled lethal metastatic prostate cancer.

    PubMed

    Bova, G Steven; Kallio, Heini M L; Annala, Matti; Kivinummi, Kati; Högnäs, Gunilla; Häyrynen, Sergei; Rantapero, Tommi; Kivinen, Virpi; Isaacs, William B; Tolonen, Teemu; Nykter, Matti; Visakorpi, Tapio

    2016-05-01

    We report the first combined analysis of whole-genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole-genome and transcriptome sequence was obtained from nine anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 yr before death. Transcriptome analysis revealed increased expression of androgen receptor (AR)-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only one of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG that was present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today, given this knowledge, the use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations may be critical for effective actionability and merit further study. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials. PMID:27148588

  15. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    PubMed

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species. PMID:25847029

  16. Genome Evolution in the Cold: Antarctic Icefish Muscle Transcriptome Reveals Selective Duplications Increasing Mitochondrial Function

    PubMed Central

    Coppe, Alessandro; Agostini, Cecilia; Marino, Ilaria A.M.; Zane, Lorenzo; Bargelloni, Luca; Bortoluzzi, Stefania; Patarnello, Tomaso

    2013-01-01

    Antarctic notothenioids radiated over millions of years in subzero waters, evolving peculiar features, such as antifreeze glycoproteins and absence of heat shock response. Icefish, family Channichthyidae, also lack oxygen-binding proteins and display extreme modifications, including high mitochondrial densities in aerobic tissues. A genomic expansion accompanying the evolution of these fish was reported, but paucity of genomic information limits the understanding of notothenioid cold adaptation. We reconstructed and annotated the first skeletal muscle transcriptome of the icefish Chionodraco hamatus providing a new resource for icefish genomics (http://compgen.bio.unipd.it/chamatusbase/, last accessed December 12, 2012). We exploited deep sequencing of this energy-dependent tissue to test the hypothesis of selective duplication of genes involved in mitochondrial function. We developed a bioinformatic approach to univocally assign C. hamatus transcripts to orthology groups extracted from phylogenetic trees of five model species. Chionodraco hamatus duplicates were recorded for each orthology group allowing the identification of duplicated genes specific to the icefish lineage. Significantly more duplicates were found in the icefish when transcriptome data were compared with whole-genome data of model species. Indeed, duplicated genes were significantly enriched in proteins with mitochondrial localization, involved in mitochondrial function and biogenesis. In cold conditions and without oxygen-carrying proteins, energy production is challenging. The combination of high mitochondrial densities and the maintenance of duplicated genes involved in mitochondrial biogenesis and aerobic respiration might confer a selective advantage by improving oxygen diffusion and energy supply to aerobic tissues. Our results provide new insights into the genomic basis of icefish cold adaptation. PMID:23196969
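
    The abstract above reports that duplicated genes were "significantly enriched" in mitochondrial proteins but does not say which statistical test was used. As a rough illustration only, the sketch below shows one common way such an enrichment can be assessed, a one-sided hypergeometric test. The function name and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_upper_tail(k, total, annotated, drawn):
    """Probability of seeing at least k annotated genes among `drawn` genes.

    total     -- number of genes in the background set
    annotated -- background genes carrying the annotation (e.g. mitochondrial)
    drawn     -- size of the gene set of interest (e.g. duplicated genes)
    k         -- annotated genes observed in the set of interest
    """
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(total - annotated, drawn - i)
        for i in range(max(k, 0), upper + 1)
    )
    return favourable / comb(total, drawn)


# Hypothetical counts, for illustration only (not data from the study):
# 1,000 background genes, 100 of them mitochondrial, 50 duplicated genes,
# 12 of which are mitochondrial.
p_value = hypergeom_upper_tail(k=12, total=1000, annotated=100, drawn=50)
```

    A small `p_value` would indicate that the annotated category is over-represented in the gene set relative to the background. In practice, a multiple-testing correction is usually applied when many categories are tested at once.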

  17. Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus).

    PubMed

    Thanh, Nguyen Minh; Jung, Hyungtaek; Lyons, Russell E; Njaci, Isaac; Yoon, Byoung-Ha; Chand, Vincent; Tuan, Nguyen Viet; Thu, Vo Thi Minh; Mather, Peter

    2015-10-01

    Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry, however, is facing a significant challenge from saltwater intrusion into many low-lying coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. We obtained 12,177,770 reads after trimming and processing, with an average length of 97 bp. De novo assemblies were generated using CLC Genomics Workbench, Trinity and Velvet/Oases, with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478 bp and an N50 length of 506 bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species. PMID:25979246
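
    For readers unfamiliar with the N50 statistic quoted in the abstract above, the following minimal sketch shows how it is conventionally computed from a list of contig lengths. The contig lengths are made-up illustration values, not data from the study, and the function name `n50` is our own. The last line only re-derives the annotated fraction from the two counts given in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp), for illustration only.
example_lengths = [900, 700, 500, 400, 300, 200]
example_n50 = n50(example_lengths)

# Counts quoted in the abstract: 37,969 annotated contigs out of 66,451.
annotated_fraction = 37969 / 66451  # about 0.57, matching the reported 57%
```

    Because N50 is weighted by bases rather than by contig count, it can differ from the mean contig length, as it does in the reported assembly (N50 of 506 bp versus a mean of 478 bp).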

  18. Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea

    PubMed Central

    2014-01-01

    Background: Brassica oleracea is a valuable vegetable species that has contributed to human health and nutrition for hundreds of years and comprises multiple distinct cultivar groups with diverse morphological and phytochemical attributes. In addition to this phenotypic wealth, B. oleracea offers unique insights into polyploid evolution, as it results from multiple ancestral polyploidy events and a final Brassiceae-specific triplication event. Further, B. oleracea represents one of the diploid genomes that formed the economically important allopolyploid oilseed, Brassica napus. A deeper understanding of B. oleracea genome architecture provides a foundation for crop improvement strategies throughout the Brassica genus. Results: We generate an assembly representing 75% of the predicted B. oleracea genome using a hybrid Illumina/Roche 454 approach. Two dense genetic maps are generated to anchor almost 92% of the assembled scaffolds to nine pseudo-chromosomes. Over 50,000 genes are annotated and 40% of the genome predicted to be repetitive, thus contributing to the increased genome size of B. oleracea compared to its close relative B. rapa. A snapshot of both the leaf transcriptome and methylome allows comparisons to be made across the triplicated sub-genomes, which resulted from the most recent Brassiceae-specific polyploidy event. Conclusions: Differential expression of the triplicated syntelogs and cytosine methylation levels across the sub-genomes suggest residual marks of the genome dominance that led to the current genome architecture. Although cytosine methylation does not correlate with individual gene dominance, the independent methylation patterns of triplicated copies suggest epigenetic mechanisms play a role in the functional diversification of duplicate genes. PMID:24916971

  19. Combined Analysis of the Chloroplast Genome and Transcriptome of the Antarctic Vascular Plant Deschampsia antarctica Desv

    PubMed Central

    Lee, Jungeun; Kang, Yoonjee; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok

    2014-01-01

    Background: Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. Results: The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5′- or 3′-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. Conclusions: We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers understand the

  20. The genome and transcriptome of the zoonotic hookworm Ancylostoma ceylanicum identify infection-specific gene families.

    PubMed

    Schwarz, Erich M; Hu, Yan; Antoshechkin, Igor; Miller, Melanie M; Sternberg, Paul W; Aroian, Raffi V

    2015-04-01

    Hookworms infect over 400 million people, stunting and impoverishing them. Sequencing hookworm genomes and finding which genes they express during infection should help in devising new drugs or vaccines against hookworms. Unlike other hookworms, Ancylostoma ceylanicum infects both humans and other mammals, providing a laboratory model for hookworm disease. We determined an A. ceylanicum genome sequence of 313 Mb, with transcriptomic data throughout infection showing expression of 30,738 genes. Approximately 900 genes were upregulated during early infection in vivo, including ASPRs, a cryptic subfamily of activation-associated secreted proteins (ASPs). Genes downregulated during early infection included ion channels and G protein-coupled receptors; this downregulation was observed in both parasitic and free-living nematodes. Later, at the onset of heavy blood feeding, C-lectin genes were upregulated along with genes for secreted clade V proteins (SCVPs), encoding a previously undescribed protein family. These findings provide new drug and vaccine targets and should help elucidate hookworm pathogenesis.

  1. Genomic and transcriptomic evidence for scavenging of diverse organic compounds by widespread deep-sea archaea.

    PubMed

    Li, Meng; Baker, Brett J; Anantharaman, Karthik; Jain, Sunit; Breier, John A; Dick, Gregory J

    2015-01-01

    Microbial activity is one of the most important processes to mediate the flux of organic carbon from the ocean surface to the seafloor. However, little is known about the microorganisms that underpin this key step of the global carbon cycle in the deep oceans. Here we present genomic and transcriptomic evidence that five ubiquitous archaeal groups actively use proteins, carbohydrates, fatty acids and lipids as sources of carbon and energy at depths ranging from 800 to 4,950 m in hydrothermal vent plumes and pelagic background seawater across three different ocean basins. Genome-enabled metabolic reconstructions and gene expression patterns show that these marine archaea are motile heterotrophs with extensive mechanisms for scavenging organic matter. Our results shed light on the ecological and physiological properties of ubiquitous marine archaea and highlight their versatile metabolic strategies in deep oceans that might play a critical role in global carbon cycling.

  2. Amplified Fragment Length Polymorphism (AFLP) - an invaluable fingerprinting technique for genomic, transcriptomic and epigenetic studies

    PubMed Central

    Paun, Ovidiu; Schönswetter, Peter

    2012-01-01

    Summary: Amplified fragment length polymorphism (AFLP) is a PCR-based technique that uses selective amplification of a subset of digested DNA fragments to generate and compare unique fingerprints for genomes of interest. The power of this method lies mainly in the fact that it does not require prior information regarding the targeted genome, as well as in its high reproducibility and sensitivity for detecting polymorphism at the level of DNA sequence. Widely used for plant and microbial studies, AFLP is employed for a variety of applications, such as assessing genetic diversity within species or among closely related species, inferring population-level phylogenies and biogeographic patterns, generating genetic maps, and determining relatedness among cultivars. Variations of standard AFLP methodology have also been developed for targeting additional levels of diversity, such as transcriptomic variation and DNA methylation polymorphism. PMID:22419490
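
    The abstract above describes comparing AFLP fingerprints without giving a procedure. As a minimal sketch, assuming each fingerprint has been scored as the set of fragment sizes (bands) present in a sample, two profiles can be compared with the Jaccard similarity coefficient. This is one common choice for presence/absence data, not necessarily the one used in any particular study. The band sizes and the function name are hypothetical.

```python
def jaccard_similarity(bands_a, bands_b):
    """Shared bands divided by bands present in either profile."""
    a, b = set(bands_a), set(bands_b)
    union = a | b
    if not union:
        return 1.0  # two empty profiles are treated as identical
    return len(a & b) / len(union)


# Hypothetical fragment sizes (bp) scored as present in each sample.
sample_1 = {102, 145, 210, 333, 410}
sample_2 = {102, 145, 250, 333}
similarity = jaccard_similarity(sample_1, sample_2)  # 3 shared / 6 total = 0.5
```

    Pairwise similarities of this kind can be collected into a distance matrix (1 minus similarity) as input for clustering or ordination.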

  3. Genomic and transcriptomic evidence for scavenging of diverse organic compounds by widespread deep-sea archaea.

    PubMed

    Li, Meng; Baker, Brett J; Anantharaman, Karthik; Jain, Sunit; Breier, John A; Dick, Gregory J

    2015-01-01

    Microbial activity is one of the most important processes to mediate the flux of organic carbon from the ocean surface to the seafloor. However, little is known about the microorganisms that underpin this key step of the global carbon cycle in the deep oceans. Here we present genomic and transcriptomic evidence that five ubiquitous archaeal groups actively use proteins, carbohydrates, fatty acids and lipids as sources of carbon and energy at depths ranging from 800 to 4,950 m in hydrothermal vent plumes and pelagic background seawater across three different ocean basins. Genome-enabled metabolic reconstructions and gene expression patterns show that these marine archaea are motile heterotrophs with extensive mechanisms for scavenging organic matter. Our results shed light on the ecological and physiological properties of ubiquitous marine archaea and highlight their versatile metabolic strategies in deep oceans that might play a critical role in global carbon cycling. PMID:26573375

  4. Genomic and transcriptomic evidence for scavenging of diverse organic compounds by widespread deep-sea archaea

    PubMed Central

    Li, Meng; Baker, Brett J.; Anantharaman, Karthik; Jain, Sunit; Breier, John A.; Dick, Gregory J.

    2015-01-01

    Microbial activity is one of the most important processes to mediate the flux of organic carbon from the ocean surface to the seafloor. However, little is known about the microorganisms that underpin this key step of the global carbon cycle in the deep oceans. Here we present genomic and transcriptomic evidence that five ubiquitous archaeal groups actively use proteins, carbohydrates, fatty acids and lipids as sources of carbon and energy at depths ranging from 800 to 4,950 m in hydrothermal vent plumes and pelagic background seawater across three different ocean basins. Genome-enabled metabolic reconstructions and gene expression patterns show that these marine archaea are motile heterotrophs with extensive mechanisms for scavenging organic matter. Our results shed light on the ecological and physiological properties of ubiquitous marine archaea and highlight their versatile metabolic strategies in deep oceans that might play a critical role in global carbon cycling. PMID:26573375

  5. Genome-, Transcriptome- and Proteome-Wide Analyses of the Gliadin Gene Families in Triticum urartu.

    PubMed

    Zhang, Yanlin; Luo, Guangbin; Liu, Dongcheng; Wang, Dongzhi; Yang, Wenlong; Sun, Jiazhu; Zhang, Aimin; Zhan, Kehui

    2015-01-01

    Gliadins are the major components of storage proteins in wheat grains, and they play an essential role in the dough extensibility and nutritional quality of flour. Because of the large number of the gliadin family members, the high level of sequence identity, and the lack of abundant genomic data for Triticum species, identifying the full complement of gliadin family genes in hexaploid wheat remains challenging. Triticum urartu is a wild diploid wheat species and considered the A-genome donor of polyploid wheat species. The accession PI428198 (G1812) was chosen to determine the complete composition of the gliadin gene families in the wheat A-genome using the available draft genome. Using a PCR-based cloning strategy for genomic DNA and mRNA as well as a bioinformatics analysis of genomic sequence data, 28 gliadin genes were characterized. Of these genes, 23 were α-gliadin genes, three were γ-gliadin genes and two were ω-gliadin genes. An RNA sequencing (RNA-Seq) survey of the dynamic expression patterns of gliadin genes revealed that their synthesis in immature grains began prior to 10 days post-anthesis (DPA), peaked at 15 DPA and gradually decreased at 20 DPA. The accumulation of proteins encoded by 16 of the expressed gliadin genes was further verified and quantified using proteomic methods. The phylogenetic analysis demonstrated that the homologs of these α-gliadin genes were present in tetraploid and hexaploid wheat, which was consistent with T. urartu being the A-genome progenitor species. This study presents a systematic investigation of the gliadin gene families in T. urartu that spans the genome, transcriptome and proteome, and it provides new information to better understand the molecular structure, expression profiles and evolution of the gliadin genes in T. urartu and common wheat.

  6. Genome-, Transcriptome- and Proteome-Wide Analyses of the Gliadin Gene Families in Triticum urartu

    PubMed Central

    Wang, Dongzhi; Yang, Wenlong; Sun, Jiazhu; Zhang, Aimin; Zhan, Kehui

    2015-01-01

    Gliadins are the major components of storage proteins in wheat grains, and they play an essential role in the dough extensibility and nutritional quality of flour. Because of the large number of the gliadin family members, the high level of sequence identity, and the lack of abundant genomic data for Triticum species, identifying the full complement of gliadin family genes in hexaploid wheat remains challenging. Triticum urartu is a wild diploid wheat species and considered the A-genome donor of polyploid wheat species. The accession PI428198 (G1812) was chosen to determine the complete composition of the gliadin gene families in the wheat A-genome using the available draft genome. Using a PCR-based cloning strategy for genomic DNA and mRNA as well as a bioinformatics analysis of genomic sequence data, 28 gliadin genes were characterized. Of these genes, 23 were α-gliadin genes, three were γ-gliadin genes and two were ω-gliadin genes. An RNA sequencing (RNA-Seq) survey of the dynamic expression patterns of gliadin genes revealed that their synthesis in immature grains began prior to 10 days post-anthesis (DPA), peaked at 15 DPA and gradually decreased at 20 DPA. The accumulation of proteins encoded by 16 of the expressed gliadin genes was further verified and quantified using proteomic methods. The phylogenetic analysis demonstrated that the homologs of these α-gliadin genes were present in tetraploid and hexaploid wheat, which was consistent with T. urartu being the A-genome progenitor species. This study presents a systematic investigation of the gliadin gene families in T. urartu that spans the genome, transcriptome and proteome, and it provides new information to better understand the molecular structure, expression profiles and evolution of the gliadin genes in T. urartu and common wheat. PMID:26132381

  7. A Systematically Improved High Quality Genome and Transcriptome of the Human Blood Fluke Schistosoma mansoni

    PubMed Central

    Babbage, Anne; Nichol, Sarah; Hunt, Martin; Aslett, Martin A.; De Silva, Nishadi; Velarde, Giles S.; Anderson, Tim J. C.; Clark, Richard C.; Davidson, Claire; Dillon, Gary P.; Holroyd, Nancy E.; LoVerde, Philip T.; Lloyd, Christine; McQuillan, Jacquelline; Oliveira, Guilherme; Otto, Thomas D.; Parker-Manuel, Sophia J.; Quail, Michael A.; Wilson, R. Alan; Zerlotini, Adhemar; Dunne, David W.; Berriman, Matthew

    2012-01-01

    Schistosomiasis is one of the most prevalent parasitic diseases, affecting millions of people in developing countries. Amongst the human-infective species, Schistosoma mansoni is also the most commonly used in the laboratory and here we present the systematic improvement of its draft genome. We used Sanger capillary and deep-coverage Illumina sequencing from clonal worms to upgrade the highly fragmented draft 380 Mb genome to one with only 885 scaffolds and more than 81% of the bases organised into chromosomes. We have also used transcriptome sequencing (RNA-seq) from four time points in the parasite's life cycle to refine gene predictions and profile their expression. More than 45% of predicted genes have been extensively modified and the total number has been reduced from 11,807 to 10,852. Using the new version of the genome, we identified trans-splicing events occurring in at least 11% of genes and identified clear cases where it is used to resolve polycistronic transcripts. We have produced a high-resolution map of temporal changes in expression for 9,535 genes, covering an unprecedented dynamic range for this organism. All of these data have been consolidated into a searchable format within the GeneDB (www.genedb.org) and SchistoDB (www.schistodb.net) databases. With further transcriptional profiling and genome sequencing increasingly accessible, the upgraded genome will form a fundamental dataset to underpin further advances in schistosome research. PMID:22253936

  8. Dictyocaulus viviparus genome, variome and transcriptome elucidate lungworm biology and support future intervention

    PubMed Central

    McNulty, Samantha N.; Strübe, Christina; Rosa, Bruce A.; Martin, John C.; Tyagi, Rahul; Choi, Young-Jun; Wang, Qi; Hallsworth Pepin, Kymberlie; Zhang, Xu; Ozersky, Philip; Wilson, Richard K.; Sternberg, Paul W.; Gasser, Robin B.; Mitreva, Makedonka

    2016-01-01

    The bovine lungworm, Dictyocaulus viviparus (order Strongylida), is an important parasite of livestock that causes substantial economic and production losses worldwide. Here we report the draft genome, variome, and developmental transcriptome of D. viviparus. The genome (161 Mb) is smaller than those of related bursate nematodes and encodes fewer proteins (14,171 total). In the first genome-wide assessment of genomic variation in any parasitic nematode, we found a high degree of sequence variability in proteins predicted to be involved in host-parasite interactions. Next, we used extensive RNA sequence data to track gene transcription across the life cycle of D. viviparus, and identified genes that might be important in nematode development and parasitism. Finally, we predicted genes that could be vital in host-parasite interactions, genes that could serve as drug targets, and putative RNAi effectors with a view to developing functional genomic tools. This extensive, well-curated dataset should provide a basis for developing new anthelmintics, vaccines, and improved diagnostic tests and serve as a platform for future investigations of drug resistance and epidemiology of the bovine lungworm and related nematodes. PMID:26856411

  9. Transcriptome analysis in Concholepas concholepas (Gastropoda, Muricidae): mining and characterization of new genomic and molecular markers.

    PubMed

    Cárdenas, Leyla; Sánchez, Roland; Gomez, Daniela; Fuenzalida, Gonzalo; Gallardo-Escárate, Cristián; Tanguy, Arnaud

    2011-09-01

    The marine gastropod Concholepas concholepas, locally known as the "loco", is the main target species of the benthonic Chilean fisheries. Genetic and genomic tools are necessary to study the genome of this species in order to understand the molecular basis of its development, growth, and other key traits to improve the management strategies and to identify local adaptation to prevent loss of biodiversity. Here, we use pyrosequencing technologies to generate the first transcriptomic database from adult specimens of the loco. After trimming, a total of 140,756 Expressed Sequence Tag sequences were obtained. Clustering and assembly analysis identified 19,219 contigs and 105,435 singleton sequences. BlastN analysis showed a significant identity with Expressed Sequence Tags of different gastropod species available in public databases. Similarly, BlastX results showed that only 895 out of the total 124,654 sequences had significant hits and may represent novel genes for marine gastropods. From this database, simple sequence repeat motifs were also identified and a total of 38 primer pairs were designed and tested to assess their potential as informative markers and to investigate their cross-species amplification in different related gastropod species. This dataset represents the first publicly available 454 data for a marine gastropod endemic to the southeastern Pacific coast, providing a valuable transcriptomic resource for future efforts of gene discovery and development of functional markers in other marine gastropods.

  10. Phenotypic, transcriptomic, and genomic features of clonal plasma cells in light-chain amyloidosis.

    PubMed

    Paiva, Bruno; Martinez-Lopez, Joaquin; Corchete, Luis A; Sanchez-Vega, Beatriz; Rapado, Inmaculada; Puig, Noemi; Barrio, Santiago; Sanchez, Maria-Luz; Alignani, Diego; Lasa, Marta; García de Coca, Alfonso; Pardal, Emilia; Oriol, Alberto; Garcia, Maria-Esther Gonzalez; Escalante, Fernando; González-López, Tomás J; Palomera, Luis; Alonso, José; Prosper, Felipe; Orfao, Alberto; Vidriales, Maria-Belen; Mateos, María-Victoria; Lahuerta, Juan-Jose; Gutierrez, Norma C; San Miguel, Jesús F

    2016-06-16

    Immunoglobulin light-chain amyloidosis (AL) and multiple myeloma (MM) are 2 distinct monoclonal gammopathies that involve the same cellular compartment: clonal plasma cells (PCs). Despite the fact that knowledge about MM PC biology has significantly increased in the last decade, the same does not apply for AL. Here, we used an integrative phenotypic, molecular, and genomic approach to study clonal PCs from 24 newly diagnosed patients with AL. Through principal-component-analysis, we demonstrated highly overlapping phenotypic profiles between AL and both monoclonal gammopathy of undetermined significance and MM PCs. However, in contrast to MM, highly purified fluorescence-activated cell-sorted clonal PCs from AL (n = 9) showed almost normal transcriptome, with only 38 deregulated genes vs normal PCs; these included a few tumor-suppressor (CDH1, RCAN) and proapoptotic (GLIPR1, FAS) genes. Notwithstanding, clonal PCs in AL (n = 11) were genomically unstable, with a median of 9 copy number alterations (CNAs) per case, many of such CNAs being similar to those found in MM. Whole-exome sequencing (WES) performed in 5 AL patients revealed a median of 15 nonrecurrent mutations per case. Altogether, our results show that in the absence of a unifying mutation by WES, clonal PCs in AL display phenotypic and CNA profiles similar to MM, but their transcriptome is remarkably similar to that of normal PCs. PMID:27069257

  11. Development of Genomic Resources for Pacific Herring through Targeted Transcriptome Pyrosequencing

    PubMed Central

    Roberts, Steven B.; Hauser, Lorenz; Seeb, Lisa W.; Seeb, James E.

    2012-01-01

    Pacific herring (Clupea pallasii) support commercially and culturally important fisheries but have experienced significant additional pressure from a variety of anthropogenic and environmental sources. In order to provide genomic resources to facilitate organismal and population level research, high-throughput pyrosequencing (Roche 454) was carried out on transcriptome libraries from liver and testes samples taken in Prince William Sound, the Bering Sea, and the Gulf of Alaska. Over 40,000 contigs were identified with an average length of 728 bp. We describe an annotated transcriptome as well as a workflow for single nucleotide polymorphism (SNP) discovery and validation. A subset of 96 candidate SNPs, chosen from 10,933 potential SNPs, was tested using a combination of Sanger sequencing and high-resolution melt-curve analysis. Five SNPs supported between-ocean-basin differentiation, while one SNP associated with immune function provided high differentiation between Prince William Sound and Kodiak Island within the Gulf of Alaska. These genomic resources provide a basis for environmental physiology studies and opportunities for marker development and subsequent population structure analysis. PMID:22383979

  12. The capsicum transcriptome DB: a “hot” tool for genomic research

    PubMed Central

    Góngora-Castillo, Elsa; Fajardo-Jaime, Rubén; Fernández-Cortes, Araceli; Jofre-Garfias, Alba E; Lozoya-Gloria, Edmundo; Martínez, Octavio; Ochoa-Alejo, Neftalí; Rivera-Bustamante, Rafael

    2012-01-01

    Chili pepper (Capsicum annuum) is an economically important crop with no available public genome sequence. We describe a genomic resource to facilitate Capsicum annuum research. A collection of Expressed Sequence Tags (ESTs) derived from five C. annuum organs (root, stem, leaf, flower and fruit) was sequenced using the Sanger method, and multiple leaf transcriptomes were deeply sampled using GS-pyrosequencing. A hybrid assembly of 1,324,516 raw reads yielded 32,314 high quality contigs as validated by coverage and identity analysis with existing pepper sequences. Overall, 75.5% of the contigs had significant sequence similarity to entries in nucleic acid and protein databases; 23% of the sequences have not been previously reported for C. annuum and expand sequence resources for this species. A MySQL database and a user-friendly Web interface were constructed with search tools that permit queries of the ESTs including sequence, functional annotation, Gene Ontology classification, metabolic pathways, and assembly information. The Capsicum Transcriptome DB is freely available from http://www.bioingenios.ira.cinvestav.mx:81/Joomla/ PMID:22359434

  13. Genome and Transcriptome Sequences Reveal the Specific Parasitism of the Nematophagous Purpureocillium lilacinum 36-1

    PubMed Central

    Xie, Jialian; Li, Shaojun; Mo, Chenmi; Xiao, Xueqiong; Peng, Deliang; Wang, Gaofeng; Xiao, Yannong

    2016-01-01

    Purpureocillium lilacinum is a promising nematophagous ascomycete able to adapt to diverse environments and it is also an opportunistic fungus that infects humans. A microbial inoculant of P. lilacinum has been registered to control plant parasitic nematodes. However, the molecular mechanism of the toxicological processes is still unclear because of the relatively few reports on the subject. In this study, using Illumina paired-end sequencing, the draft genome sequence and the transcriptome of P. lilacinum strain 36-1 infecting nematode-eggs were determined. Whole genome alignment indicated that P. lilacinum 36-1 possessed a more dynamic genome in comparison with P. lilacinum India strain. Moreover, a phylogenetic analysis showed that the P. lilacinum 36-1 had a closer relation to entomophagous fungi. The protein-coding genes in P. lilacinum 36-1 occurred much more frequently than they did in other fungi, which was a result of the depletion of repeat-induced point mutations (RIP). Comparative genome and transcriptome analyses revealed the genes that were involved in pathogenicity, particularly in the recognition, adhesion of nematode-eggs, downstream signal transduction pathways and hydrolase genes. By contrast, certain numbers of cellulose and xylan degradation genes and a lack of polysaccharide lyase genes showed the potential of P. lilacinum 36-1 as an endophyte. Notably, the expression of appressorium-formation and antioxidants-related genes exhibited similar infection patterns in P. lilacinum strain 36-1 to those of the model entomophagous fungi Metarhizium spp. These results uncovered the specific parasitism of P. lilacinum and presented the genes responsible for the infection of nematode-eggs. PMID:27486440

  14. Genome and Transcriptome Sequences Reveal the Specific Parasitism of the Nematophagous Purpureocillium lilacinum 36-1.

    PubMed

    Xie, Jialian; Li, Shaojun; Mo, Chenmi; Xiao, Xueqiong; Peng, Deliang; Wang, Gaofeng; Xiao, Yannong

    2016-01-01

    Purpureocillium lilacinum is a promising nematophagous ascomycete able to adapt to diverse environments and it is also an opportunistic fungus that infects humans. A microbial inoculant of P. lilacinum has been registered to control plant parasitic nematodes. However, the molecular mechanism of the toxicological processes is still unclear because of the relatively few reports on the subject. In this study, using Illumina paired-end sequencing, the draft genome sequence and the transcriptome of P. lilacinum strain 36-1 infecting nematode-eggs were determined. Whole genome alignment indicated that P. lilacinum 36-1 possessed a more dynamic genome in comparison with P. lilacinum India strain. Moreover, a phylogenetic analysis showed that the P. lilacinum 36-1 had a closer relation to entomophagous fungi. The protein-coding genes in P. lilacinum 36-1 occurred much more frequently than they did in other fungi, which was a result of the depletion of repeat-induced point mutations (RIP). Comparative genome and transcriptome analyses revealed the genes that were involved in pathogenicity, particularly in the recognition, adhesion of nematode-eggs, downstream signal transduction pathways and hydrolase genes. By contrast, certain numbers of cellulose and xylan degradation genes and a lack of polysaccharide lyase genes showed the potential of P. lilacinum 36-1 as an endophyte. Notably, the expression of appressorium-formation and antioxidants-related genes exhibited similar infection patterns in P. lilacinum strain 36-1 to those of the model entomophagous fungi Metarhizium spp. These results uncovered the specific parasitism of P. lilacinum and presented the genes responsible for the infection of nematode-eggs. PMID:27486440

  15. Genotyping-by-Sequencing SNP Identification for Crops without a Reference Genome: Using Transcriptome Based Mapping as an Alternative Strategy.

    PubMed

    Berthouly-Salazar, Cécile; Mariac, Cédric; Couderc, Marie; Pouzadoux, Juliette; Floc'h, Jean-Baptiste; Vigouroux, Yves

    2016-01-01

    Next-generation sequencing opens the way for genomic studies of diversity even for non-model crops and animals. Genome reduction techniques are becoming progressively more popular as they allow a fraction of the genome to be sequenced for multiple individuals and/or populations. These techniques are an efficient way to explore genome diversity in non-model crops and animals for which no reference genome is available. Genome reduction techniques emerged with the development of specific pipelines such as UNEAK (Universal Network Enabled Analysis Kit) and Stacks. However, even for non-model crops and animals, transcriptomes are easier to obtain, thereby making it possible to directly map reads. We investigate the direct use of transcriptome as an alternative strategy. Our specific objective was to compare SNPs obtained from the UNEAK pipeline as well as SNPs obtained by directly mapping genotyping-by-sequencing reads on a transcriptome. We assessed the feasibility of both SNP datasets, UNEAK and transcriptome mapping, to investigate the diversity of 91 samples of wild pearl millet sampled across its distribution area. Both approaches produced several tens of thousands of single nucleotide variants, but differed in the way the variants were identified, leading to differences in the frequency spectrum associated with marked differences in the assessment of diversity. Difference in the frequency spectrum significantly biased a large set of diversity analyses as well as detection of selection approaches. However, whatever the approach, we found very similar inference of genetic structure, with three major genetic groups from West, Central, and East Africa. For non-model crops, using transcriptome data as a reference is thus a particularly promising way to obtain a more thorough analysis of datasets generated using genome reduction techniques.

  16. Genotyping-by-Sequencing SNP Identification for Crops without a Reference Genome: Using Transcriptome Based Mapping as an Alternative Strategy

    PubMed Central

    Berthouly-Salazar, Cécile; Mariac, Cédric; Couderc, Marie; Pouzadoux, Juliette; Floc’h, Jean-Baptiste; Vigouroux, Yves

    2016-01-01

    Next-generation sequencing opens the way for genomic studies of diversity even for non-model crops and animals. Genome reduction techniques are becoming progressively more popular as they allow a fraction of the genome to be sequenced for multiple individuals and/or populations. These techniques are an efficient way to explore genome diversity in non-model crops and animals for which no reference genome is available. Genome reduction techniques emerged with the development of specific pipelines such as UNEAK (Universal Network Enabled Analysis Kit) and Stacks. However, even for non-model crops and animals, transcriptomes are easier to obtain, thereby making it possible to directly map reads. We investigate the direct use of transcriptome as an alternative strategy. Our specific objective was to compare SNPs obtained from the UNEAK pipeline as well as SNPs obtained by directly mapping genotyping-by-sequencing reads on a transcriptome. We assessed the feasibility of both SNP datasets, UNEAK and transcriptome mapping, to investigate the diversity of 91 samples of wild pearl millet sampled across its distribution area. Both approaches produced several tens of thousands of single nucleotide variants, but differed in the way the variants were identified, leading to differences in the frequency spectrum associated with marked differences in the assessment of diversity. Difference in the frequency spectrum significantly biased a large set of diversity analyses as well as detection of selection approaches. However, whatever the approach, we found very similar inference of genetic structure, with three major genetic groups from West, Central, and East Africa. For non-model crops, using transcriptome data as a reference is thus a particularly promising way to obtain a more thorough analysis of datasets generated using genome reduction techniques. PMID:27379109

  17. Transcriptome profiling of the demosponge Amphimedon queenslandica reveals genome-wide events that accompany major life cycle transitions

    PubMed Central

    2012-01-01

    Background: The biphasic life cycle with pelagic larva and benthic adult stages is widely observed in the animal kingdom, including the Porifera (sponges), which are the earliest branching metazoans. The demosponge, Amphimedon queenslandica, undergoes metamorphosis from a free-swimming larva into a sessile adult that bears no morphological resemblance to other animals. While the genome of A. queenslandica contains an extensive repertoire of genes very similar to that of complex bilaterians, it is as yet unclear how this is drawn upon to coordinate changing morphological features and ecological demands throughout the sponge life cycle. Results: To identify genome-wide events that accompany the pelagobenthic transition in A. queenslandica, we compared global gene expression profiles at four key developmental stages by sequencing the poly(A) transcriptome using SOLiD technology. Large-scale changes in transcription were observed as sponge larvae settled on the benthos and began metamorphosis. Although previous systematics suggest that the only clear homology between Porifera and other animals is in the embryonic and larval stages, we observed extensive use of genes involved in metazoan-associated cellular processes throughout the sponge life cycle. Sponge-specific transcripts are not over-represented in the morphologically distinct adult; rather, many genes that encode typical metazoan features, such as cell adhesion and immunity, are upregulated. Our analysis further revealed gene families with candidate roles in competence, settlement, and metamorphosis in the sponge, including transcription factors, G-protein coupled receptors and other signaling molecules. Conclusions: This first genome-wide study of the developmental transcriptome in an early branching metazoan highlights major transcriptional events that accompany the pelagobenthic transition and point to a network of regulatory mechanisms that coordinate changes in morphology with shifting environmental demands.

  18. Genomic and transcriptomic insights into the efficient entomopathogenicity of Bacillus thuringiensis.

    PubMed

    Zhu, Lei; Peng, Donghai; Wang, Yueying; Ye, Weixing; Zheng, Jinshui; Zhao, Changming; Han, Dongmei; Geng, Ce; Ruan, Lifang; He, Jin; Yu, Ziniu; Sun, Ming

    2015-09-28

    Bacillus thuringiensis has been globally used as a microbial pesticide for over 70 years. However, information regarding its various adaptations and virulence factors and their roles in the entomopathogenic process remains limited. In this work, we present the complete genomes of two industrially patented Bacillus thuringiensis strains (HD-1 and YBT-1520). A comparative genomic analysis showed a larger and more complicated genome constitution that included novel insecticidal toxicity-related genes (ITRGs). All of the putative ITRGs were summarized according to the steps of infection. A comparative genomic analysis showed that highly toxic strains contained significantly more ITRGs, thereby providing additional strategies for infection, immune evasion, and cadaver utilization. Furthermore, a comparative transcriptomic analysis suggested that a high expression of these ITRGs was a key factor in efficient entomopathogenicity. We identified an active extra urease synthesis system in the highly toxic strains that may aid B. thuringiensis survival in insects (similar to previous results with well-known pathogens). Taken together, these results explain the efficient entomopathogenicity of B. thuringiensis. They provide novel insights into the strategies used by B. thuringiensis to resist and overcome host immune defenses and help identify novel toxicity factors.

  19. Genomic and transcriptomic insights into the efficient entomopathogenicity of Bacillus thuringiensis

    PubMed Central

    Zhu, Lei; Peng, Donghai; Wang, Yueying; Ye, Weixing; Zheng, Jinshui; Zhao, Changming; Han, Dongmei; Geng, Ce; Ruan, Lifang; He, Jin; Yu, Ziniu; Sun, Ming

    2015-01-01

    Bacillus thuringiensis has been globally used as a microbial pesticide for over 70 years. However, information regarding its various adaptations and virulence factors and their roles in the entomopathogenic process remains limited. In this work, we present the complete genomes of two industrially patented Bacillus thuringiensis strains (HD-1 and YBT-1520). A comparative genomic analysis showed a larger and more complicated genome constitution that included novel insecticidal toxicity-related genes (ITRGs). All of the putative ITRGs were summarized according to the steps of infection. A comparative genomic analysis showed that highly toxic strains contained significantly more ITRGs, thereby providing additional strategies for infection, immune evasion, and cadaver utilization. Furthermore, a comparative transcriptomic analysis suggested that a high expression of these ITRGs was a key factor in efficient entomopathogenicity. We identified an active extra urease synthesis system in the highly toxic strains that may aid B. thuringiensis survival in insects (similar to previous results with well-known pathogens). Taken together, these results explain the efficient entomopathogenicity of B. thuringiensis. They provide novel insights into the strategies used by B. thuringiensis to resist and overcome host immune defenses and help identify novel toxicity factors. PMID:26411888

  20. Discover hidden splicing variations by mapping personal transcriptomes to personal genomes.

    PubMed

    Stein, Shayna; Lu, Zhi-Xiang; Bahrami-Samani, Emad; Park, Juw Won; Xing, Yi

    2015-12-15

    RNA-seq has become a popular technology for studying genetic variation of pre-mRNA alternative splicing. Commonly used RNA-seq aligners rely on the consensus splice site dinucleotide motifs to map reads across splice junctions. Consequently, genomic variants that create novel splice site dinucleotides may produce splice junction RNA-seq reads that cannot be mapped to the reference genome. We developed and evaluated an approach to identify 'hidden' splicing variations in personal transcriptomes, by mapping personal RNA-seq data to personal genomes. Computational analysis and experimental validation indicate that this approach identifies personal specific splice junctions at a low false positive rate. Applying this approach to an RNA-seq data set of 75 individuals, we identified 506 personal specific splice junctions, among which 437 were novel splice junctions not documented in current human transcript annotations. 94 splice junctions had splice site SNPs associated with GWAS signals of human traits and diseases. These involve genes whose splicing variations have been implicated in diseases (such as OAS1), as well as novel associations between alternative splicing and diseases (such as ICA1). Collectively, our work demonstrates that the personal genome approach to RNA-seq read alignment enables the discovery of a large but previously unknown catalog of splicing variations in human populations.

  1. Discover hidden splicing variations by mapping personal transcriptomes to personal genomes

    PubMed Central

    Stein, Shayna; Lu, Zhi-xiang; Bahrami-Samani, Emad; Park, Juw Won; Xing, Yi

    2015-01-01

    RNA-seq has become a popular technology for studying genetic variation of pre-mRNA alternative splicing. Commonly used RNA-seq aligners rely on the consensus splice site dinucleotide motifs to map reads across splice junctions. Consequently, genomic variants that create novel splice site dinucleotides may produce splice junction RNA-seq reads that cannot be mapped to the reference genome. We developed and evaluated an approach to identify ‘hidden’ splicing variations in personal transcriptomes, by mapping personal RNA-seq data to personal genomes. Computational analysis and experimental validation indicate that this approach identifies personal specific splice junctions at a low false positive rate. Applying this approach to an RNA-seq data set of 75 individuals, we identified 506 personal specific splice junctions, among which 437 were novel splice junctions not documented in current human transcript annotations. 94 splice junctions had splice site SNPs associated with GWAS signals of human traits and diseases. These involve genes whose splicing variations have been implicated in diseases (such as OAS1), as well as novel associations between alternative splicing and diseases (such as ICA1). Collectively, our work demonstrates that the personal genome approach to RNA-seq read alignment enables the discovery of a large but previously unknown catalog of splicing variations in human populations. PMID:26578562

  2. Extensive Transcriptomic and Genomic Analysis Provides New Insights about Luminal Breast Cancers

    PubMed Central

    Tishchenko, Inna; Milioli, Heloisa Helena; Riveros, Carlos; Moscato, Pablo

    2016-01-01

    Despite constituting approximately two thirds of all breast cancers, the luminal A and B tumours are poorly classified at both clinical and molecular levels. There are contradictory reports on the nature of these subtypes: some define them as intrinsic entities, others as a continuum. With the aim of addressing these uncertainties and identifying molecular signatures of patients at risk, we conducted a comprehensive transcriptomic and genomic analysis of 2,425 luminal breast cancer samples. Our results indicate that the separation between the molecular luminal A and B subtypes—per definition—is not associated with intrinsic characteristics evident in the differentiation between other subtypes. Moreover, t-SNE and MST-kNN clustering approaches based on 10,000 probes, associated with luminal tumour initiation and/or development, revealed the close connections between luminal A and B tumours, with no evidence of a clear boundary between them. Thus, we considered all luminal tumours as a single heterogeneous group for analysis purposes. We first stratified luminal tumours into two distinct groups by their HER2 gene cluster co-expression: HER2-amplified luminal and ordinary-luminal. The former group is associated with distinct transcriptomic and genomic profiles, and poor prognosis; it comprises approximately 8% of all luminal cases. For the remaining ordinary-luminal tumours we further identified the molecular signature correlated with disease outcomes, exhibiting an approximately continuous gene expression range from low to high risk. Thus, we employed four virtual quantiles to segregate the groups of patients. The clinico-pathological characteristics and ratios of genomic aberrations are concordant with the variations in gene expression profiles, hinting at a progressive staging. The comparison with the current separation into luminal A and B subtypes revealed a substantially improved survival stratification. Concluding, we suggest a review of the definition of

  3. Extensive Transcriptomic and Genomic Analysis Provides New Insights about Luminal Breast Cancers.

    PubMed

    Tishchenko, Inna; Milioli, Heloisa Helena; Riveros, Carlos; Moscato, Pablo

    2016-01-01

    Despite constituting approximately two thirds of all breast cancers, the luminal A and B tumours are poorly classified at both clinical and molecular levels. There are contradictory reports on the nature of these subtypes: some define them as intrinsic entities, others as a continuum. With the aim of addressing these uncertainties and identifying molecular signatures of patients at risk, we conducted a comprehensive transcriptomic and genomic analysis of 2,425 luminal breast cancer samples. Our results indicate that the separation between the molecular luminal A and B subtypes, per definition, is not associated with intrinsic characteristics evident in the differentiation between other subtypes. Moreover, t-SNE and MST-kNN clustering approaches based on 10,000 probes, associated with luminal tumour initiation and/or development, revealed the close connections between luminal A and B tumours, with no evidence of a clear boundary between them. Thus, we considered all luminal tumours as a single heterogeneous group for analysis purposes. We first stratified luminal tumours into two distinct groups by their HER2 gene cluster co-expression: HER2-amplified luminal and ordinary-luminal. The former group is associated with distinct transcriptomic and genomic profiles, and poor prognosis; it comprises approximately 8% of all luminal cases. For the remaining ordinary-luminal tumours we further identified the molecular signature correlated with disease outcomes, exhibiting an approximately continuous gene expression range from low to high risk. Thus, we employed four virtual quantiles to segregate the groups of patients. The clinico-pathological characteristics and ratios of genomic aberrations are concordant with the variations in gene expression profiles, hinting at a progressive staging. The comparison with the current separation into luminal A and B subtypes revealed a substantially improved survival stratification. Concluding, we suggest a review of the definition of

  4. Genome and transcriptome sequencing of the halophilic fungus Wallemia ichthyophaga: haloadaptations present and absent

    PubMed Central

    2013-01-01

    Background: The basidiomycete Wallemia ichthyophaga from the phylogenetically distinct class Wallemiomycetes is the most halophilic fungus known to date. It requires at least 10% NaCl and thrives in saturated salt solution. To investigate the genomic basis of this exceptional phenotype, we obtained a de novo genome sequence of the species type-strain and analysed its transcriptomic response to conditions close to the limits of its lower and upper salinity range. Results: The unusually compact genome is 9.6 Mb in size and contains 1.67% repetitive sequences. Only 4884 predicted protein coding genes cover almost three quarters of the sequence. Of 639 differentially expressed genes, two thirds are more expressed at lower salinity. Phylogenomic analysis based on the largest dataset used to date (whole proteomes) positions Wallemiomycetes as a 250-million-year-old sister group of Agaricomycotina. Contrary to the closely related species Wallemia sebi, W. ichthyophaga appears to have lost the ability for sexual reproduction. Several protein families are significantly expanded or contracted in the genome. Among these, there are the P-type ATPase cation transporters, but not the sodium/hydrogen exchanger family. Transcription of all but three cation transporters is not salt dependent. The analysis also reveals a significant enrichment in hydrophobins, which are cell-wall proteins with multiple cellular functions. Half of these are differentially expressed, and most contain an unusually large number of acidic amino acids. This discovery is of particular interest due to the numerous applications of hydrophobins from other fungi in industry, pharmaceutics and medicine. Conclusions: W. ichthyophaga is an extremophilic specialist that shows only low levels of adaptability and genetic recombination. This is reflected in the characteristics of its genome and its transcriptomic response to salt. No unusual traits were observed in common salt-tolerance mechanisms, such as transport of

  5. Analysis of the Legionella longbeachae genome and transcriptome uncovers unique strategies to cause Legionnaires' disease.

    PubMed

    Cazalet, Christel; Gomez-Valero, Laura; Rusniok, Christophe; Lomma, Mariella; Dervins-Ravault, Delphine; Newton, Hayley J; Sansom, Fiona M; Jarraud, Sophie; Zidane, Nora; Ma, Laurence; Bouchier, Christiane; Etienne, Jerôme; Hartland, Elizabeth L; Buchrieser, Carmen

    2010-02-01

    Legionella pneumophila and L. longbeachae are two species of a large genus of bacteria that are ubiquitous in nature. L. pneumophila is mainly found in natural and artificial water circuits while L. longbeachae is mainly present in soil. Under the appropriate conditions both species are human pathogens, capable of causing a severe form of pneumonia termed Legionnaires' disease. Here we report the sequencing and analysis of four L. longbeachae genomes: one complete genome sequence of L. longbeachae strain NSW150 serogroup (Sg) 1, and three draft genome sequences, another belonging to Sg1 and two to Sg2. The genome organization and gene content of the four L. longbeachae genomes are highly conserved, indicating strong pressure for niche adaptation. Analysis and comparison of L. longbeachae strain NSW150 with L. pneumophila revealed common but also unexpected features specific to this pathogen. The interaction with host cells shows distinct features from L. pneumophila, as L. longbeachae possesses a unique repertoire of putative Dot/Icm type IV secretion system substrates, eukaryotic-like and eukaryotic domain proteins, and encodes additional secretion systems. However, analysis of the ability of a dotA mutant of L. longbeachae NSW150 to replicate in Acanthamoeba castellanii and in a mouse lung infection model showed that the Dot/Icm type IV secretion system is also essential for the virulence of L. longbeachae. In contrast to L. pneumophila, L. longbeachae does not encode flagella, thereby providing a possible explanation for differences in mouse susceptibility to infection between the two pathogens. Furthermore, transcriptome analysis revealed that L. longbeachae has a less pronounced biphasic life cycle as compared to L. pneumophila, and genome analysis and electron microscopy suggested that L. longbeachae is encapsulated. These species-specific differences may account for the different environmental niches and disease epidemiology of these two Legionella

  6. Genomic, Transcriptomic, and Phenomic Variation Reveals the Complex Adaptation of Modern Maize Breeding.

    PubMed

    Liu, Haijun; Wang, Xiaqing; Warburton, Marilyn L; Wen, Weiwei; Jin, Minliang; Deng, Min; Liu, Jie; Tong, Hao; Pan, Qingchun; Yang, Xiaohong; Yan, Jianbing

    2015-06-01

    The temperate-tropical division of early maize germplasms to different agricultural environments was arguably the greatest adaptation process associated with the success and near ubiquitous importance of global maize production. Deciphering this history is challenging, but new insight has been gained from examining 558 529 single nucleotide polymorphisms, expression data of 28 769 genes, and 662 traits collected from 368 diverse temperate and tropical maize inbred lines in this study. This is a new attempt to systematically exploit the mechanisms of the adaptation process in maize. Our results indicate that divergence between tropical and temperate lines apparently occurred 3400-6700 years ago. Seven hundred and one genomic selection signals and transcriptomic variants including 2700 differentially expressed individual genes and 389 rewired co-expression network genes were identified. These candidate signals were found to be functionally related to stress responses, and most were associated with directionally selected traits, which may have been an advantage under widely varying environmental conditions faced by maize as it was migrated away from its domestication center. Our study also clearly indicates that such stress adaptation could involve evolution of protein-coding sequences as well as transcriptome-level regulatory changes. The latter process may be a more flexible and dynamic way for maize to adapt to environmental changes along its short evolutionary history. PMID:25620769

  7. The expanding transcriptome: the genome as the ‘Book of Sand’

    PubMed Central

    Mendes Soares, Luis M; Valcárcel, Juan

    2006-01-01

    The central dogma of molecular biology inspired by classical work in prokaryotic organisms accounts for only part of the genetic agenda of complex eukaryotes. First, post-transcriptional events lead to the generation of multiple mRNAs, proteins and functions from a single primary transcript, revealing regulatory networks distinct in mechanism and biological function from those controlling RNA transcription. Second, a variety of populous families of small RNAs (small nuclear RNAs, small nucleolar RNAs, microRNAs, siRNAs and shRNAs) assemble on ribonucleoprotein complexes and regulate virtually all aspects of the gene expression pathway, with profound biological consequences. Third, high-throughput methods of genomic analysis reveal that non-protein-coding RNAs (ncRNAs) represent a major component of the transcriptome that may perform novel functions in gene regulation and beyond. Post-transcriptional regulation, small RNAs and ncRNAs provide an expanding picture of the transcriptome that enriches our views of what genes are, how they operate, evolve and are regulated. PMID:16511566

  8. Genome-wide transcriptome profiling reveals novel insights into Luffa cylindrica browning.

    PubMed

    Chen, Xia; Tan, Taiming; Xu, Changcheng; Huang, Shuping; Tan, Jie; Zhang, Min; Wang, Chunli; Xie, Conghua

    2015-08-01

    Luffa cylindrica (sponge gourd) is one of the most popular vegetables in China. Production and consumption of L. cylindrica are limited due to postharvest browning; however, little is known about the genetic regulation of the browning process. In the present study, transcriptome profiles of L. cylindrica cultivars, YLB05 (browning resistant) and XTR05 (browning sensitive), were analyzed using next-generation sequencing to clarify the genes and mechanisms associated with browning. A total of 9.1 Gb of valid data including 116,703 unigenes (>200 bp) were obtained and 39,473 sequences were annotated by alignment against five public databases. Of these, there were 27,407 genes assigned to 747 Gene Ontology functional categories; and 12,350 genes were annotated with 25 Eukaryotic Orthologous Groups (KOG) categories with 343 KOG functional terms. Additionally, by searching against the Kyoto Encyclopedia of Genes and Genomes database, 8689 unigenes were mapped to 189 pathways. Furthermore, there were 24,556 sequences found to be differentially regulated, including 4344 annotated unigenes. Several genes potentially associated with phenolic oxidation, carbohydrate and hormone metabolism were found differentially regulated between the cultivars of different browning sensitivities. Our results suggest that elements involved in enzymatic processes and other pathways might be responsible for L. cylindrica browning. The present study provides a comprehensive transcriptome sequence resource, which will facilitate further studies on gene discovery and exploiting the fruit browning mechanism of L. cylindrica. PMID:26086104

  9. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi

    PubMed Central

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  10. Genome-wide RNA-seq analysis of human and mouse platelet transcriptomes

    PubMed Central

    Rowley, Jesse W.; Oler, Andrew J.; Tolley, Neal D.; Hunter, Benjamin N.; Low, Elizabeth N.; Nix, David A.; Yost, Christian C.; Zimmerman, Guy A.

    2011-01-01

    Inbred mice are a useful tool for studying the in vivo functions of platelets. Nonetheless, the mRNA signature of mouse platelets is not known. Here, we use paired-end next-generation RNA sequencing (RNA-seq) to characterize the polyadenylated transcriptomes of human and mouse platelets. We report that RNA-seq provides unprecedented resolution of mRNAs that are expressed across the entire human and mouse genomes. Transcript expression and abundance are often conserved between the 2 species. Several mRNAs, however, are differentially expressed in human and mouse platelets. Moreover, previously described functional disparities between mouse and human platelets are reflected in differences at the transcript level, including protease activated receptor-1, protease activated receptor-3, platelet activating factor receptor, and factor V. This suggests that RNA-seq is a useful tool for predicting differences in platelet function between mice and humans. Our next-generation sequencing analysis provides new insights into the human and murine platelet transcriptomes. The sequencing dataset will be useful in the design of mouse models of hemostasis and a catalyst for discovery of new functions of platelets. Access to the dataset is found in the “Introduction.” PMID:21596849

  11. Comparative transcriptome analysis reveals insights into the streamlined genomes of haplosclerid demosponges

    PubMed Central

    Guzman, Christine; Conaco, Cecilia

    2016-01-01

    Sponges (Porifera) are one of the most ancestral metazoan groups. They are characterized by a simple body plan lacking the true tissues and organ systems found in other animals. Members of this phylum display a remarkable diversity of form and function and yet little is known about the composition and complexity of their genomes. In this study, we sequenced the transcriptomes of two marine haplosclerid sponges belonging to Demospongiae, the largest and most diverse class within phylum Porifera, and compared their gene content with members of other sponge classes. We recovered 44,693 and 50,067 transcripts expressed in adult tissues of Haliclona amboinensis and Haliclona tubifera, respectively. These transcripts translate into 20,280 peptides in H. amboinensis and 18,000 peptides in H. tubifera. Genes associated with important signaling and metabolic pathways, regulatory networks, as well as genes that may be important in the organismal stress response, were identified in the transcriptomes. Furthermore, lineage-specific innovations were identified that may be correlated with observed sponge characters and ecological adaptations. The core gene complement expressed within the tissues of adult haplosclerid demosponges may represent a streamlined and flexible genetic toolkit that underlies the ecological success and resilience of sponges to environmental stress. PMID:26738846

  12. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups.

    PubMed

    Curtis, Christina; Shah, Sohrab P; Chin, Suet-Feung; Turashvili, Gulisa; Rueda, Oscar M; Dunning, Mark J; Speed, Doug; Lynch, Andy G; Samarajiwa, Shamith; Yuan, Yinyin; Gräf, Stefan; Ha, Gavin; Haffari, Gholamreza; Bashashati, Ali; Russell, Roslin; McKinney, Steven; Langerød, Anita; Green, Andrew; Provenzano, Elena; Wishart, Gordon; Pinder, Sarah; Watson, Peter; Markowetz, Florian; Murphy, Leigh; Ellis, Ian; Purushotham, Arnie; Børresen-Dale, Anne-Lise; Brenton, James D; Tavaré, Simon; Caldas, Carlos; Aparicio, Samuel

    2012-04-18

    The elucidation of breast cancer subgroups and their molecular drivers requires integrated views of the genome and transcriptome from representative numbers of patients. We present an integrated analysis of copy number and gene expression in a discovery and validation set of 997 and 995 primary breast tumours, respectively, with long-term clinical follow-up. Inherited variants (copy number variants and single nucleotide polymorphisms) and acquired somatic copy number aberrations (CNAs) were associated with expression in ~40% of genes, with the landscape dominated by cis- and trans-acting CNAs. By delineating expression outlier genes driven in cis by CNAs, we identified putative cancer genes, including deletions in PPP2R2A, MTAP and MAP2K4. Unsupervised analysis of paired DNA–RNA profiles revealed novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort. These include a high-risk, oestrogen-receptor-positive 11q13/14 cis-acting subgroup and a favourable prognosis subgroup devoid of CNAs. Trans-acting aberration hotspots were found to modulate subgroup-specific gene networks, including a TCR deletion-mediated adaptive immune response in the ‘CNA-devoid’ subgroup and a basal-specific chromosome 5 deletion-associated mitotic network. Our results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome.
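
    The cis association between copy number and expression described in this abstract can be illustrated with a small sketch. It is not the authors' method (the abstract does not state which statistical model they used); plain Pearson correlation is swapped in as a stand-in, and the gene names, values, the 0.8 threshold, and the helper `cis_candidates` are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no association signal
    return sxy / sqrt(sxx * syy)


def cis_candidates(copy_number, expression, threshold=0.8):
    """Genes whose own copy number tracks their own expression across samples.

    The threshold is an illustrative assumption, not a value from the study.
    """
    return sorted(
        gene
        for gene in copy_number
        if gene in expression
        and pearson(copy_number[gene], expression[gene]) >= threshold
    )


# Hypothetical per-sample profiles (five tumours), for illustration only.
copy_number = {
    "GENE_X": [1, 2, 2, 3, 4],
    "GENE_Y": [2, 2, 2, 2, 2],  # no copy number change
    "GENE_Z": [1, 2, 3, 4, 5],
}
expression = {
    "GENE_X": [0.9, 2.1, 1.8, 3.2, 4.1],
    "GENE_Y": [1.0, 3.0, 2.0, 5.0, 4.0],
    "GENE_Z": [5.0, 1.0, 4.0, 2.0, 3.0],
}

candidates = cis_candidates(copy_number, expression)
print(candidates)
```

    In this toy input only GENE_X is returned, because its expression rises with its copy number; a real analysis would also need multiple-testing correction and a separate treatment of trans effects.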

  14. Comparative transcriptome analysis reveals insights into the streamlined genomes of haplosclerid demosponges.

    PubMed

    Guzman, Christine; Conaco, Cecilia

    2016-01-01

    Sponges (Porifera) are one of the most ancestral metazoan groups. They are characterized by a simple body plan lacking the true tissues and organ systems found in other animals. Members of this phylum display a remarkable diversity of form and function and yet little is known about the composition and complexity of their genomes. In this study, we sequenced the transcriptomes of two marine haplosclerid sponges belonging to Demospongiae, the largest and most diverse class within phylum Porifera, and compared their gene content with members of other sponge classes. We recovered 44,693 and 50,067 transcripts expressed in adult tissues of Haliclona amboinensis and Haliclona tubifera, respectively. These transcripts translate into 20,280 peptides in H. amboinensis and 18,000 peptides in H. tubifera. Genes associated with important signaling and metabolic pathways, regulatory networks, as well as genes that may be important in the organismal stress response, were identified in the transcriptomes. Furthermore, lineage-specific innovations were identified that may be correlated with observed sponge characters and ecological adaptations. The core gene complement expressed within the tissues of adult haplosclerid demosponges may represent a streamlined and flexible genetic toolkit that underlies the ecological success and resilience of sponges to environmental stress. PMID:26738846

  15. Genomic and Transcriptomic Features of Response to Anti-PD-1 Therapy in Metastatic Melanoma.

    PubMed

    Hugo, Willy; Zaretsky, Jesse M; Sun, Lu; Song, Chunying; Moreno, Blanca Homet; Hu-Lieskovan, Siwen; Berent-Maoz, Beata; Pang, Jia; Chmielowski, Bartosz; Cherry, Grace; Seja, Elizabeth; Lomeli, Shirley; Kong, Xiangju; Kelley, Mark C; Sosman, Jeffrey A; Johnson, Douglas B; Ribas, Antoni; Lo, Roger S

    2016-03-24

    PD-1 immune checkpoint blockade provides significant clinical benefits for melanoma patients. We analyzed the somatic mutanomes and transcriptomes of pretreatment melanoma biopsies to identify factors that may influence innate sensitivity or resistance to anti-PD-1 therapy. We find that overall high mutational loads associate with improved survival, and tumors from responding patients are enriched for mutations in the DNA repair gene BRCA2. Innately resistant tumors display a transcriptional signature (referred to as the IPRES, or innate anti-PD-1 resistance), indicating concurrent up-expression of genes involved in the regulation of mesenchymal transition, cell adhesion, extracellular matrix remodeling, angiogenesis, and wound healing. Notably, mitogen-activated protein kinase (MAPK)-targeted therapy (MAPK inhibitor) induces similar signatures in melanoma, suggesting that a non-genomic form of MAPK inhibitor resistance mediates cross-resistance to anti-PD-1 therapy. Validation of the IPRES in other independent tumor cohorts defines a transcriptomic subset across distinct types of advanced cancer. These findings suggest that attenuating the biological processes that underlie IPRES may improve anti-PD-1 response in melanoma and other cancer types. PMID:26997480

  16. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi.

    PubMed

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  17. An evaluation of transcriptome-based exon capture for frog phylogenomics across multiple scales of divergence (Class: Amphibia, Order: Anura).

    PubMed

    Portik, Daniel M; Smith, Lydia L; Bi, Ke

    2016-09-01

    Custom sequence capture experiments are becoming an efficient approach for gathering large sets of orthologous markers in nonmodel organisms. Transcriptome-based exon capture utilizes transcript sequences to design capture probes, typically using a reference genome to identify intron-exon boundaries to exclude shorter exons (<200 bp). Here, we test directly using transcript sequences for probe design, which are often composed of multiple exons of varying lengths. Using 1260 orthologous transcripts, we conducted sequence captures across multiple phylogenetic scales for frogs, including outgroups ~100 Myr divergent from the ingroup. We recovered a large phylogenomic data set consisting of sequence alignments for 1047 of the 1260 transcriptome-based loci (~561 000 bp) and a large quantity of highly variable regions flanking the exons in transcripts (~70 000 bp), the latter improving substantially by only including ingroup species (~797 000 bp). We recovered both shorter (<100 bp) and longer exons (>200 bp), with no major reduction in coverage towards the ends of exons. We observed significant differences in the performance of blocking oligos for target enrichment and nontarget depletion during captures, and differences in PCR duplication rates resulting from the number of individuals pooled for capture reactions. We explicitly tested the effects of phylogenetic distance on capture sensitivity, specificity, and missing data, and provide a baseline estimate of expectations for these metrics based on a priori knowledge of nuclear pairwise differences among samples. We provide recommendations for transcriptome-based exon capture design based on our results, cost estimates and offer multiple pipelines for data assembly and analysis. PMID:27241806

  19. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels

    PubMed Central

    Petit, Elsa; Coppi, Maddalena V.; Hayes, James C.; Tolonen, Andrew C.; Warnick, Thomas; Latouf, William G.; Amisano, Danielle; Biddle, Amy; Mukherjee, Supratim; Ivanova, Natalia; Lykidis, Athanassios; Land, Miriam; Hauser, Loren; Kyrpides, Nikos; Henrissat, Bernard; Lau, Joanne; Schnell, Danny J.; Church, George M.; Leschine, Susan B.; Blanchard, Jeffrey L.

    2015-01-01

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of the present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. These characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels. PMID:26035711

  20. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels

    SciTech Connect

    Petit, Elsa; Coppi, Maddalena V.; Hayes, James C.; Tolonen, Andrew C.; Warnick, Thomas; Latouf, William G.; Amisano, Danielle; Biddle, Amy; Mukherjee, Supratim; Ivanova, Natalia; Lykidis, Athanassios; Land, Miriam; Hauser, Loren; Kyrpides, Nikos; Henrissat, Bernard; Lau, Joanne; Schnell, Danny J.; Church, George M.; Leschine, Susan B.; Blanchard, Jeffrey L.

    2015-06-02

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of our present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. Lastly, these characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.

  1. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels

    DOE PAGES

    Petit, Elsa; Coppi, Maddalena V.; Hayes, James C.; Tolonen, Andrew C.; Warnick, Thomas; Latouf, William G.; Amisano, Danielle; Biddle, Amy; Mukherjee, Supratim; Ivanova, Natalia; et al

    2015-06-02

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of our present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. Lastly, these characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.

  2. Genome, Transcriptome, and Functional Analyses of Penicillium expansum Provide New Insights Into Secondary Metabolism and Pathogenicity.

    PubMed

    Ballester, Ana-Rosa; Marcet-Houben, Marina; Levin, Elena; Sela, Noa; Selma-Lázaro, Cristina; Carmona, Lourdes; Wisniewski, Michael; Droby, Samir; González-Candelas, Luis; Gabaldón, Toni

    2015-03-01

    The relationship between secondary metabolism and infection in pathogenic fungi has remained largely elusive. The genus Penicillium comprises a group of plant pathogens with varying host specificities and with the ability to produce a wide array of secondary metabolites. The genomes of three Penicillium expansum strains, the main postharvest pathogen of pome fruit, and one Penicillium italicum strain, a postharvest pathogen of citrus fruit, were sequenced and compared with 24 other fungal species. A genomic analysis of gene clusters responsible for the production of secondary metabolites was performed. Putative virulence factors in P. expansum were identified by means of a transcriptomic analysis of apple fruits during the course of infection. Despite a major genome contraction, P. expansum is the Penicillium species with the largest potential for the production of secondary metabolites. Results using knockout mutants clearly demonstrated that neither patulin nor citrinin is required by P. expansum to successfully infect apples. Li et al. (MPMI-12-14-0398-FI) reported similar results and conclusions in their recently accepted paper.

  3. Identification of metastasis-associated genes in colorectal cancer through an integrated genomic and transcriptomic analysis

    PubMed Central

    Peng, Sihua

    2013-01-01

    Objective: Identification of colorectal cancer (CRC) metastasis genes is one of the most important issues in CRC research. To mine CRC metastasis-associated genes, an integrated analysis of microarray data is presented, combined with evidence acquired from comparative genomic hybridization (CGH) data. Methods: Gene expression profile data of CRC samples were obtained from the Gene Expression Omnibus (GEO) website. The 15 important chromosomal aberration sites detected using CGH technology were used for integrated genomic and transcriptomic analysis. Significance Analysis of Microarrays (SAM) was used to detect significantly differentially expressed genes across the whole genome. The overlapping genes were selected in their corresponding chromosomal aberration regions and analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). Finally, the SVM-T-RFE gene selection algorithm was applied to identify metastasis-associated genes in CRC. Results: A minimum gene set was obtained with the smallest number of genes (14) and the highest classification accuracy (100%) in both the PRI and META datasets. A fraction of the selected genes are associated with CRC or its metastasis. Conclusions: Our results demonstrate that integrated analysis is an effective strategy for mining cancer-associated genes. PMID:24385689
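
    The overlap step in the Methods (keeping differentially expressed genes that fall inside chromosomal aberration regions) can be sketched as a simple interval intersection. This is a minimal illustration under stated assumptions, not the author's pipeline: the `Region` and `Gene` types, the coordinates, and the gene names are all hypothetical, and the SAM and SVM-T-RFE steps are not reproduced.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A chromosomal aberration region (closed interval, hypothetical coordinates)."""
    chrom: str
    start: int
    end: int


class Gene(NamedTuple):
    """A differentially expressed gene with its genomic position."""
    name: str
    chrom: str
    start: int
    end: int


def overlaps(gene: Gene, region: Region) -> bool:
    """True when gene and region share at least one base on the same chromosome."""
    return (
        gene.chrom == region.chrom
        and gene.start <= region.end
        and region.start <= gene.end
    )


def genes_in_aberrations(de_genes, regions):
    """Names of differentially expressed genes lying in any aberration region."""
    return [g.name for g in de_genes if any(overlaps(g, r) for r in regions)]


# Hypothetical inputs, for illustration only.
regions = [Region("chr8", 100, 500), Region("chr20", 1000, 2000)]
de_genes = [
    Gene("GENE_A", "chr8", 450, 600),     # partially overlaps the chr8 region
    Gene("GENE_B", "chr8", 501, 700),     # starts one base past the chr8 region
    Gene("GENE_C", "chr20", 1200, 1300),  # fully inside the chr20 region
    Gene("GENE_D", "chr1", 100, 500),     # same coordinates, different chromosome
]

selected = genes_in_aberrations(de_genes, regions)
print(selected)
```

    The linear scan is adequate for a handful of regions such as the 15 sites mentioned in the abstract; for genome-wide region sets an interval tree or sorted sweep would be the usual choice.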

  5. Comprehensive Comparative Genomic and Transcriptomic Analyses of the Legume Genes Controlling the Nodulation Process.

    PubMed

    Qiao, Zhenzhen; Pingault, Lise; Nourbakhsh-Rey, Mehrnoush; Libault, Marc

    2016-01-01

    Nitrogen is one of the most essential plant nutrients and one of the major factors limiting crop productivity. To achieve more sustainable agriculture, there is a need to maximize biological nitrogen fixation, a feature of legumes. To enhance our understanding of the molecular mechanisms controlling the interaction between legumes and rhizobia, the symbiotic partner fixing and assimilating the atmospheric nitrogen for the plant, researchers took advantage of genetic and genomic resources developed across different legume models (e.g., Medicago truncatula, Lotus japonicus, Glycine max, and Phaseolus vulgaris) to identify key regulatory protein-coding genes of the nodulation process. In this study, we present the results of a comprehensive comparative genomic analysis to highlight orthologous and paralogous relationships between the legume genes controlling nodulation. Mining large transcriptomic datasets, we also identified several orthologous and paralogous genes characterized by the induction of their expression during nodulation across legume plant species. This comprehensive study provides new insights into the evolution of the nodulation process in legume plants and will benefit the scientific community interested in the transfer of functional genomic information between species. PMID:26858743

  6. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels.

    PubMed

    Petit, Elsa; Coppi, Maddalena V; Hayes, James C; Tolonen, Andrew C; Warnick, Thomas; Latouf, William G; Amisano, Danielle; Biddle, Amy; Mukherjee, Supratim; Ivanova, Natalia; Lykidis, Athanassios; Land, Miriam; Hauser, Loren; Kyrpides, Nikos; Henrissat, Bernard; Lau, Joanne; Schnell, Danny J; Church, George M; Leschine, Susan B; Blanchard, Jeffrey L

    2015-01-01

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of the present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. These characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.

  7. Genomic and Transcriptomic Approaches to Study Cancer in Small Aquarium Fish Models.

    PubMed

    Regneri, J; Klotz, B; Schartl, M

    2016-01-01

    Zebrafish and medaka that develop tumors have become valuable tools for experimental cancer research. With the advent of microarrays and new sequencing technologies it has become feasible to perform whole genome, exome, and transcriptome analyses in these fish models. Analyses that compare the two fish models with each other and with data from human tumors have revealed a plethora of important insights. An unexpectedly high degree of comparability of molecular features of fish and human tumors has been detected. Furthermore, analyses of the fish model data have uncovered molecules that have not received appropriate attention in studies on their human tumor counterparts and thus have provided valuable candidates for novel biomarkers and therapeutic targets.

  8. An integrated genomic and transcriptomic survey of mucormycosis-causing fungi

    PubMed Central

    Chibucos, Marcus C.; Soliman, Sameh; Gebremariam, Teclegiorgis; Lee, Hongkyu; Daugherty, Sean; Orvis, Joshua; Shetty, Amol C.; Crabtree, Jonathan; Hazen, Tracy H.; Etienne, Kizee A.; Kumari, Priti; O'Connor, Timothy D.; Rasko, David A.; Filler, Scott G.; Fraser, Claire M.; Lockhart, Shawn R.; Skory, Christopher D.; Ibrahim, Ashraf S.; Bruno, Vincent M.

    2016-01-01

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. Here we sequence 30 fungal genomes, and perform transcriptomics with three representative Rhizopus and Mucor strains and with human airway epithelial cells during fungal invasion, to reveal key host and fungal determinants contributing to pathogenesis. Analysis of the host transcriptional response to Mucorales reveals platelet-derived growth factor receptor B (PDGFRB) signaling as part of a core response to divergent pathogenic fungi; inhibition of PDGFRB reduces Mucorales-induced damage to host cells. The unique presence of CotH invasins in all invasive Mucorales, and the correlation between CotH gene copy number and clinical prevalence, are consistent with an important role for these proteins in mucormycosis pathogenesis. Our work provides insight into the evolution of this medically and economically important group of fungi, and identifies several molecular pathways that might be exploited as potential therapeutic targets. PMID:27447865

  9. An integrated genomic and transcriptomic survey of mucormycosis-causing fungi.

    PubMed

    Chibucos, Marcus C; Soliman, Sameh; Gebremariam, Teclegiorgis; Lee, Hongkyu; Daugherty, Sean; Orvis, Joshua; Shetty, Amol C; Crabtree, Jonathan; Hazen, Tracy H; Etienne, Kizee A; Kumari, Priti; O'Connor, Timothy D; Rasko, David A; Filler, Scott G; Fraser, Claire M; Lockhart, Shawn R; Skory, Christopher D; Ibrahim, Ashraf S; Bruno, Vincent M

    2016-01-01

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. Here we sequence 30 fungal genomes, and perform transcriptomics with three representative Rhizopus and Mucor strains and with human airway epithelial cells during fungal invasion, to reveal key host and fungal determinants contributing to pathogenesis. Analysis of the host transcriptional response to Mucorales reveals platelet-derived growth factor receptor B (PDGFRB) signaling as part of a core response to divergent pathogenic fungi; inhibition of PDGFRB reduces Mucorales-induced damage to host cells. The unique presence of CotH invasins in all invasive Mucorales, and the correlation between CotH gene copy number and clinical prevalence, are consistent with an important role for these proteins in mucormycosis pathogenesis. Our work provides insight into the evolution of this medically and economically important group of fungi, and identifies several molecular pathways that might be exploited as potential therapeutic targets. PMID:27447865

  10. Transcriptomic and genomic evolution under constant cold in Antarctic notothenioid fish

    PubMed Central

    Chen, Zuozhou; Cheng, C.-H. Christina; Zhang, Junfang; Cao, Lixue; Chen, Lei; Zhou, Longhai; Jin, Yudong; Ye, Hua; Deng, Cheng; Dai, Zhonghua; Xu, Qianghua; Hu, Peng; Sun, Shouhong; Shen, Yu; Chen, Liangbiao

    2008-01-01

    The antifreeze glycoprotein-fortified Antarctic notothenioid fishes comprise the predominant fish suborder in the isolated frigid Southern Ocean. Their ecological success undoubtedly entailed evolutionary acquisition of a full suite of cold-stable functions besides antifreeze protection. Prior studies of adaptive changes in these teleost fishes generally examined a single genotype or phenotype. We report here the genome-wide investigations of transcriptional and genomic changes associated with Antarctic notothenioid cold adaptation. We sequenced and characterized 33,560 ESTs from four tissues of the Antarctic notothenioid Dissostichus mawsoni and derived 3,114 nonredundant protein gene families and their expression profiles. Through comparative analyses of same-tissue transcriptome profiles of D. mawsoni and temperate/tropical teleost fishes, we identified 177 notothenioid protein families that were expressed many fold over the latter, indicating cold-related up-regulation. These up-regulated gene families operate in protein biosynthesis, protein folding and degradation, lipid metabolism, antioxidation, antiapoptosis, innate immunity, choriongenesis, and others, all of recognizable functional importance in mitigating stresses in freezing temperatures during notothenioid life histories. We further examined the genomic and evolutionary bases for this expressional up-regulation by comparative genomic hybridization of DNA from four pairs of Antarctic and basal non-Antarctic notothenioids to 10,700 D. mawsoni cDNA probes and discovered significant to astounding (3- to >300-fold, P < 0.05) Antarctic-specific duplications of 118 protein-coding genes, many of which correspond to the up-regulated gene families. Results of our integrative tripartite study strongly suggest that evolution under constant cold has resulted in dramatic genomic expansions of specific protein gene families, augmenting gene expression and gene functions contributing to physiological fitness of

  11. Genome-Wide Transcriptome and Proteome Analysis on Different Developmental Stages of Cordyceps militaris

    PubMed Central

    Yin, Yalin; Yu, Guojun; Chen, Yijie; Jiang, Shuai; Wang, Man; Jin, Yanxia; Lan, Xianqing; Liang, Yi; Sun, Hui

    2012-01-01

    Background Cordyceps militaris, an ascomycete caterpillar fungus, has been used as a traditional Chinese medicine for many years owing to its anticancer and immunomodulatory activities. Currently, artificial culturing of this beneficial fungus has been widely used and can meet the market, but systematic molecular studies on the developmental stages of cultured C. militaris at transcriptional and translational levels have not been determined. Methodology/Principal Findings We utilized high-throughput Illumina sequencing to obtain the transcriptomes of C. militaris mycelium and fruiting body. All clean reads were mapped to C. militaris genome and most of the reads showed perfect coverage. Alternative splicing and novel transcripts were predicted to enrich the database. Gene expression analysis revealed that 2,113 genes were up-regulated in mycelium and 599 in fruiting body. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were performed to analyze the genes with expression differences. Moreover, the putative cordycepin metabolism difference between different developmental stages was studied. In addition, the proteome data of mycelium and fruiting body were obtained by one-dimensional gel electrophoresis (1-DGE) coupled with nano-electrospray ionization liquid chromatography tandem mass spectrometry (nESI-LC-MS/MS). 359 and 214 proteins were detected from mycelium and fruiting body respectively. GO, KEGG and Cluster of Orthologous Groups (COG) analysis were further conducted to better understand their difference. We analyzed the amounts of some noteworthy proteins in these two samples including lectin, superoxide dismutase, glycoside hydrolase and proteins involved in cordycepin metabolism, providing important information for further protein studies. Conclusions/Significance The results reveal the difference in gene expression between the mycelium and fruiting body of artificially cultivated C. militaris by transcriptome and proteome

  12. An Integrative Genomic and Transcriptomic Analysis Reveals Potential Targets Associated with Cell Proliferation in Uterine Leiomyomas

    PubMed Central

    Cirilo, Priscila Daniele Ramos; Marchi, Fábio Albuquerque; Barros Filho, Mateus de Camargo; Rocha, Rafael Malagoli; Domingues, Maria Aparecida Custódio; Jurisica, Igor; Pontes, Anagloria; Rogatto, Silvia Regina

    2013-01-01

    Background Uterine Leiomyomas (ULs) are the most common benign tumours affecting women of reproductive age. ULs represent a major problem in public health, as they are the main indication for hysterectomy. Approximately 40–50% of ULs have non-random cytogenetic abnormalities, and half of ULs may have copy number alterations (CNAs). Gene expression microarray studies have demonstrated that cell proliferation genes act in response to growth factors and steroids. However, only a few genes mapping to CNA regions were found to be associated with ULs. Methodology We applied an integrative analysis using genomic and transcriptomic data to identify the pathways and molecular markers associated with ULs. Fifty-one fresh frozen specimens were evaluated by array CGH (JISTIC) and gene expression microarrays (SAM). The CONEXIC algorithm was applied to integrate the data. Principal Findings The integrated analysis identified the top 30 significant genes (P<0.01), which comprised genes associated with cancer, whereas the protein-protein interaction analysis indicated a strong association between FANCA and BRCA1. Functional in silico analysis revealed target molecules for drugs involved in cell proliferation, including FGFR1 and IGFBP5. Transcriptional and protein analyses showed that FGFR1 (P = 0.006 and P<0.01, respectively) and IGFBP5 (P = 0.0002 and P = 0.006, respectively) were up-regulated in the tumours when compared with the adjacent normal myometrium. Conclusions The integrative genomic and transcriptomic approach indicated that FGFR1 and IGFBP5 amplification, as well as the consequent up-regulation of the protein products, plays an important role in the aetiology of ULs and thus provides data for the development of potential drug therapies targeting genes associated with cellular proliferation in ULs. PMID:23483937

  13. Comparative study of de novo assembly and genome-guided assembly strategies for transcriptome reconstruction based on RNA-Seq.

    PubMed

    Lu, Bingxin; Zeng, Zhenbing; Shi, Tieliu

    2013-02-01

    Transcriptome reconstruction is an important application of RNA-Seq, providing critical information for further analysis of the transcriptome. Although RNA-Seq offers the potential to identify the whole picture of the transcriptome, it still presents special challenges. To handle these difficulties and reconstruct the transcriptome as completely as possible, current computational approaches mainly employ two strategies: de novo assembly and genome-guided assembly. In order to find the similarities and differences between them, we first chose five representative assemblers belonging to the two classes, and then investigated and compared their algorithm features in theory and real performances in practice. We found that all the methods can be reduced to graph reduction problems, yet they have different conceptual and practical implementations; thus each assembly method has its specific advantages and disadvantages, performing worse than others in certain aspects while outperforming others in other aspects at the same time. Finally we merged assemblies of the five assemblers and obtained a much better assembly. Additionally we evaluated an assembler using a genome-guided de novo assembly approach, and achieved good performance. Based on these results, we suggest that to obtain a comprehensive set of recovered transcripts, it is better to use a combination of de novo assembly and genome-guided assembly. PMID:23393030

  14. Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

    SciTech Connect

    Seaver, Samuel M.D.; Bradbury, Louis M.T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

    2015-03-10

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable to the analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  15. Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

    DOE PAGESBeta

    Seaver, Samuel M.D.; Bradbury, Louis M.T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

    2015-03-10

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable to the analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  16. Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

    PubMed Central

    Seaver, Samuel M. D.; Bradbury, Louis M. T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

    2015-01-01

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable to the analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes. PMID:25806041

  17. Transcriptome sequencing of two phenotypic mosaic Eucalyptus trees reveals large scale transcriptome re-modelling.

    PubMed

    Padovan, Amanda; Patel, Hardip R; Chuah, Aaron; Huttley, Gavin A; Krause, Sandra T; Degenhardt, Jörg; Foley, William J; Külheim, Carsten

    2015-01-01

    Phenotypic mosaic trees offer an ideal system for studying differential gene expression. We have investigated two mosaic eucalypt trees from two closely related species (Eucalyptus melliodora and E. sideroxylon), which each support two types of leaves: one part of the canopy is resistant to insect herbivory and the remaining leaves are susceptible. Driving this ecological distinction are differences in plant secondary metabolites. We used these phenotypic mosaics to investigate genome wide patterns of foliar gene expression with the aim of identifying patterns of differential gene expression and the somatic mutation(s) that lead to this phenotypic mosaicism. We sequenced the mRNA pool from leaves of the resistant and susceptible ecotypes from both mosaic eucalypts using the Illumina HiSeq 2000 platform. We found large differences in pathway regulation and gene expression between the ecotypes of each mosaic. The expression of the genes in the MVA and MEP pathways is reflected by variation in leaf chemistry, however this is not the case for the terpene synthases. Apart from the terpene biosynthetic pathway, there are several other metabolic pathways that are differentially regulated between the two ecotypes, suggesting there is much more phenotypic diversity than has been described. Despite the close relationship between the two species, they show large differences in the global patterns of gene and pathway regulation.

  18. Transcriptome Sequencing of Two Phenotypic Mosaic Eucalyptus Trees Reveals Large Scale Transcriptome Re-Modelling

    PubMed Central

    Padovan, Amanda; Patel, Hardip R.; Chuah, Aaron; Huttley, Gavin A.; Krause, Sandra T.; Degenhardt, Jörg; Foley, William J.; Külheim, Carsten

    2015-01-01

    Phenotypic mosaic trees offer an ideal system for studying differential gene expression. We have investigated two mosaic eucalypt trees from two closely related species (Eucalyptus melliodora and E. sideroxylon), which each support two types of leaves: one part of the canopy is resistant to insect herbivory and the remaining leaves are susceptible. Driving this ecological distinction are differences in plant secondary metabolites. We used these phenotypic mosaics to investigate genome wide patterns of foliar gene expression with the aim of identifying patterns of differential gene expression and the somatic mutation(s) that lead to this phenotypic mosaicism. We sequenced the mRNA pool from leaves of the resistant and susceptible ecotypes from both mosaic eucalypts using the Illumina HiSeq 2000 platform. We found large differences in pathway regulation and gene expression between the ecotypes of each mosaic. The expression of the genes in the MVA and MEP pathways is reflected by variation in leaf chemistry, however this is not the case for the terpene synthases. Apart from the terpene biosynthetic pathway, there are several other metabolic pathways that are differentially regulated between the two ecotypes, suggesting there is much more phenotypic diversity than has been described. Despite the close relationship between the two species, they show large differences in the global patterns of gene and pathway regulation. PMID:25978451

  19. Genome-Wide SNP Discovery from Transcriptome of Four Common Carp Strains

    PubMed Central

    Xu, Jian; Ji, Peifeng; Zhao, Zixia; Zhang, Yan; Feng, Jianxin; Wang, Jian; Li, Jiongtang; Zhang, Xiaofeng; Zhao, Lan; Liu, Guangzan; Xu, Peng; Sun, Xiaowen

    2012-01-01

    Background Single nucleotide polymorphisms (SNPs) have been used as genetic markers for genome-wide association studies in many species. Gene-associated SNPs could offer sufficient coverage in trait related research and furthermore could themselves be causative SNPs for traits. Common carp (Cyprinus carpio) is one of the most important aquaculture species in the world accounting for nearly 14% of freshwater aquaculture production. There are various strains of common carp with different economic traits; however, the genetic mechanisms underlying the different traits have not yet been elucidated. In this project, we identified a large number of gene-associated SNPs from four strains of common carp using next-generation sequencing. Results Transcriptome sequencing of four strains of common carp (mirror carp, purse red carp, Xingguo red carp, Yellow River carp) was performed with Solexa HiSeq2000 platform. De novo assembled transcriptome was used as reference for alignments, and SNP calling was done through BWA and SAMtools. A total of 712,042 Intra-strain SNPs were discovered in four strains, of which 483,276 SNPs for mirror carp, 486,629 SNPs for purse red carp, 478,028 SNPs for Xingguo red carp and 488,281 SNPs for Yellow River carp were discovered, respectively. Besides, 53,893 inter-SNPs were identified. Strain-specific SNPs of four strains were 53,938, 53,866, 48,701, 40,131 in mirror carp, purse red carp, Xingguo red carp and Yellow River carp, respectively. GO and KEGG pathway analysis were done to reveal strain-specific genes affected by strain-specific non-synonymous SNPs. Validation of selected SNPs revealed that 48% of SNPs (12 of 25) were confirmed to be true SNPs. Conclusions Transcriptome analysis of common carp using RNA-Seq is a cost-effective way of generating numerous reads for SNP discovery. After validation of identified SNPs, these data will provide a solid base for SNP array designing and genome-wide association studies. PMID:23110192

  20. Genome and Transcriptome Analysis of the Basidiomycetous Yeast Pseudozyma antarctica Producing Extracellular Glycolipids, Mannosylerythritol Lipids

    PubMed Central

    Morita, Tomotake; Koike, Hideaki; Hagiwara, Hiroko; Ito, Emi; Machida, Masayuki; Sato, Shun; Habe, Hiroshi; Kitamoto, Dai

    2014-01-01

    Pseudozyma antarctica is a non-pathogenic phyllosphere yeast known as an excellent producer of mannosylerythritol lipids (MELs), multi-functional extracellular glycolipids, from vegetable oils. To clarify the genetic characteristics of P. antarctica, we analyzed the 18 Mb genome of P. antarctica T-34. On the basis of KOG analysis, the number of genes (219 genes) categorized into lipid transport and metabolism classification in P. antarctica was one and a half times larger than that of yeast Saccharomyces cerevisiae (140 genes). The gene encoding an ATP/citrate lyase (ACL) related to acetyl-CoA synthesis conserved in oleaginous strains was found in P. antarctica genome: the single ACL gene possesses the four domains identical to that of the human gene, whereas the other oleaginous ascomycetous species have the two genes covering the four domains. P. antarctica genome exhibited a remarkable degree of synteny to U. maydis genome, however, the comparison of the gene expression profiles under the culture on the two carbon sources, glucose and soybean oil, by the DNA microarray method revealed that transcriptomes between the two species were significantly different. In P. antarctica, expression of the gene sets relating fatty acid metabolism were markedly up-regulated under the oily conditions compared with glucose. Additionally, MEL biosynthesis cluster of P. antarctica was highly expressed regardless of the carbon source as compared to U. maydis. These results strongly indicate that P. antarctica has an oleaginous nature which is relevant to its non-pathogenic and MEL-overproducing characteristics. The analysis and dataset contribute to stimulate the development of improved strains with customized properties for high yield production of functional bio-based materials. PMID:24586250

  1. Genome and transcriptome analysis of the basidiomycetous yeast Pseudozyma antarctica producing extracellular glycolipids, mannosylerythritol lipids.

    PubMed

    Morita, Tomotake; Koike, Hideaki; Hagiwara, Hiroko; Ito, Emi; Machida, Masayuki; Sato, Shun; Habe, Hiroshi; Kitamoto, Dai

    2014-01-01

    Pseudozyma antarctica is a non-pathogenic phyllosphere yeast known as an excellent producer of mannosylerythritol lipids (MELs), multi-functional extracellular glycolipids, from vegetable oils. To clarify the genetic characteristics of P. antarctica, we analyzed the 18 Mb genome of P. antarctica T-34. On the basis of KOG analysis, the number of genes (219 genes) categorized into lipid transport and metabolism classification in P. antarctica was one and a half times larger than that of yeast Saccharomyces cerevisiae (140 genes). The gene encoding an ATP/citrate lyase (ACL) related to acetyl-CoA synthesis conserved in oleaginous strains was found in P. antarctica genome: the single ACL gene possesses the four domains identical to that of the human gene, whereas the other oleaginous ascomycetous species have the two genes covering the four domains. P. antarctica genome exhibited a remarkable degree of synteny to U. maydis genome, however, the comparison of the gene expression profiles under the culture on the two carbon sources, glucose and soybean oil, by the DNA microarray method revealed that transcriptomes between the two species were significantly different. In P. antarctica, expression of the gene sets relating fatty acid metabolism were markedly up-regulated under the oily conditions compared with glucose. Additionally, MEL biosynthesis cluster of P. antarctica was highly expressed regardless of the carbon source as compared to U. maydis. These results strongly indicate that P. antarctica has an oleaginous nature which is relevant to its non-pathogenic and MEL-overproducing characteristics. The analysis and dataset contribute to stimulate the development of improved strains with customized properties for high yield production of functional bio-based materials. PMID:24586250

  2. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    SciTech Connect

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  3. The draft genome and transcriptome of Amaranthus hypochondriacus: a C4 dicot producing high-lysine edible pseudo-cereal.

    PubMed

    Sunil, Meeta; Hariharan, Arun K; Nayak, Soumya; Gupta, Saurabh; Nambisan, Suran R; Gupta, Ravi P; Panda, Binay; Choudhary, Bibha; Srinivasan, Subhashini

    2014-12-01

    Grain amaranths, edible C4 dicots, produce pseudo-cereals high in lysine. Lysine being one of the most limiting essential amino acids in cereals and C4 photosynthesis being one of the most sought-after phenotypes in protein-rich legume crops, the genome of one of the grain amaranths is likely to play a critical role in crop research. We have sequenced the genome and transcriptome of Amaranthus hypochondriacus, a diploid (2n = 32) belonging to the order Caryophyllales with an estimated genome size of 466 Mb. Of the 411 linkage single-nucleotide polymorphisms (SNPs) reported for grain amaranths, 355 SNPs (86%) are represented in the scaffolds and 74% of the 8.6 billion bases of the sequenced transcriptome map to the genomic scaffolds. The genome of A. hypochondriacus codes for at least 24,829 proteins, shares the paleohexaploidy event with species under the superorders Rosids and Asterids, harbours 1 SNP in 1,000 bases, and contains 13.76% of repeat elements. Annotation of all the genes in the lysine biosynthetic pathway using comparative genomics and expression analysis offers insights into the high-lysine phenotype. As the first grain species under Caryophyllales and the first C4 dicot genome reported, the work presented here will be beneficial in improving crops and in expanding our understanding of angiosperm evolution.

  4. The Draft Genome and Transcriptome of Amaranthus hypochondriacus: A C4 Dicot Producing High-Lysine Edible Pseudo-Cereal

    PubMed Central

    Sunil, Meeta; Hariharan, Arun K.; Nayak, Soumya; Gupta, Saurabh; Nambisan, Suran R.; Gupta, Ravi P.; Panda, Binay; Choudhary, Bibha; Srinivasan, Subhashini

    2014-01-01

    Grain amaranths, edible C4 dicots, produce pseudo-cereals high in lysine. Lysine being one of the most limiting essential amino acids in cereals and C4 photosynthesis being one of the most sought-after phenotypes in protein-rich legume crops, the genome of one of the grain amaranths is likely to play a critical role in crop research. We have sequenced the genome and transcriptome of Amaranthus hypochondriacus, a diploid (2n = 32) belonging to the order Caryophyllales with an estimated genome size of 466 Mb. Of the 411 linkage single-nucleotide polymorphisms (SNPs) reported for grain amaranths, 355 SNPs (86%) are represented in the scaffolds and 74% of the 8.6 billion bases of the sequenced transcriptome map to the genomic scaffolds. The genome of A. hypochondriacus codes for at least 24,829 proteins, shares the paleohexaploidy event with species under the superorders Rosids and Asterids, harbours 1 SNP in 1,000 bases, and contains 13.76% of repeat elements. Annotation of all the genes in the lysine biosynthetic pathway using comparative genomics and expression analysis offers insights into the high-lysine phenotype. As the first grain species under Caryophyllales and the first C4 dicot genome reported, the work presented here will be beneficial in improving crops and in expanding our understanding of angiosperm evolution. PMID:25071079

  5. The genome and transcriptome of the pine saprophyte Ophiostoma piceae, and a comparison with the bark beetle-associated pine pathogen Grosmannia clavigera

    PubMed Central

    2013-01-01

    Background Ophiostoma piceae is a wood-staining fungus that grows in the sapwood of conifer logs and lumber. We sequenced its genome and analyzed its transcriptomes under a range of growth conditions. A comparison with the genome and transcriptomes of the mountain pine beetle-associated pathogen Grosmannia clavigera highlights differences between a pathogen that colonizes and kills living pine trees and a saprophyte that colonizes wood and the inner bark of dead trees. Results We assembled a 33 Mbp genome in 45 scaffolds, and predicted approximately 8,884 genes. The genome size and gene content were similar to those of other ascomycetes. Despite having similar ecological niches, O. piceae and G. clavigera showed no large-scale synteny. We identified O. piceae genes involved in the biosynthesis of melanin, which causes wood discoloration and reduces the commercial value of wood products. We also identified genes and pathways involved in growth on simple carbon sources and in sapwood, O. piceae’s natural substrate. Like the pathogen, the saprophyte is able to tolerate terpenes, which are a major class of pine tree defense compounds; unlike the pathogen, it cannot utilize monoterpenes as a carbon source. Conclusions This work makes available the second annotated genome of a softwood ophiostomatoid fungus, and suggests that O. piceae’s tolerance to terpenes may be due in part to these chemicals being removed from the cells by an ABC transporter that is highly induced by terpenes. The data generated will provide the research community with resources for work on host-vector-fungus interactions for wood-inhabiting, beetle-associated saprophytes and pathogens. PMID:23725015

  6. Paired-end diTagging for transcriptome and genome analysis.

    PubMed

    Ng, Patrick; Wei, Chia-Lin; Ruan, Yijun

    2007-07-01

    The Paired-End diTagging (PET) procedure enables one to obtain sequence information from both termini of any contiguous DNA fragment. This is achieved by a series of enzymatic manipulations that introduce MmeI sites directly flanking each DNA insert during the construction of a plasmid library. Subsequent MmeI digestion and self-ligation results in the production of covalently-linked paired-end ditags (PETs) that can be extracted and then concatenated for efficient sequencing. By mapping the PET sequences to assembled genomes, the original DNA fragments from which the PETs were derived can be precisely localized. This unit details two applications of PET technology. In GIS-PET, ditagging of mRNA converted to full-length cDNA enables whole-transcriptome analysis, including novel gene identification, gene prediction validation, and gene expression studies. In ChIP-PET, ditagging of chromatin immunoprecipitation-enriched genomic DNA fragments enables the global mapping of transcription factor binding sites. A recent innovation (Multiplex Sequencing of Paired-End ditags; MS-PET) enables PETs to be sequenced using high-throughput 454 sequencing, greatly increasing the amount of data that can be collected in each run.

  7. Genomic heterogeneity of historical gene flow between two species of newts inferred from transcriptome data.

    PubMed

    Stuglik, Michał T; Babik, Wiesław

    2016-07-01

    The role of gene flow in species formation is a major unresolved issue in speciation biology. Progress in this area requires information on the long-term patterns of gene flow between diverging species. Here, we used thousands of single-nucleotide polymorphisms derived from transcriptome resequencing and a method modeling the joint frequency spectrum of these polymorphisms to reconstruct patterns of historical gene flow between two Lissotriton newts: L. vulgaris (Lv) and L. montandoni (Lm). We tested several models of divergence including complete isolation and various scenarios of historical gene flow. The model of secondary contact received the highest support. According to this model, the species split from their common ancestor ca. 5.5 million years (MY) ago, evolved in isolation for ca. 2 MY, and have been exchanging genes for the last 3.5 MY. Demographic changes have been inferred in both species, with the current effective population size of ca. 0.7 million in Lv and 0.2 million in Lm. The postdivergence gene flow resulted in two-directional introgression which affected the genomes of both species, but was more pronounced from Lv to Lm. Interestingly, we found evidence for genomic heterogeneity of interspecific gene flow. This study demonstrates the complexity of long-term gene flow between distinct but incompletely reproductively isolated taxa whose divergence was initiated millions of years ago. PMID:27386093

  8. Whole genome and transcriptome sequencing of matched primary and peritoneal metastatic gastric carcinoma

    PubMed Central

    Zhang, J.; Huang, J. Y.; Chen, Y. N.; Yuan, F.; Zhang, H.; Yan, F. H.; Wang, M. J.; Wang, G.; Su, M.; Lu, G; Huang, Y.; Dai, H.; Ji, J.; Zhang, J.; Zhang, J. N.; Jiang, Y. N.; Chen, S. J.; Zhu, Z. G.; Yu, Y. Y.

    2015-01-01

    Gastric cancer is one of the most aggressive cancers and is the second leading cause of cancer death worldwide. Approximately 40% of global gastric cancer cases occur in China, with peritoneal metastasis being the prevalent form of recurrence and metastasis in advanced disease. Currently, there are limited clinical approaches for predicting and treating peritoneal metastasis, resulting in a 6-month average survival time. Comprehensive genome analysis could help uncover the pathogenesis of peritoneal metastasis. Here we describe a comprehensive whole-genome and transcriptome sequencing analysis of one advanced gastric cancer case, including non-cancerous mucosa, primary cancer and matched peritoneal metastatic cancer. Peripheral blood is used as the normal control. We identified 27 mutated genes, of which 19 genes are reported in the COSMIC database (ZNF208, CRNN, ATXN3, DCTN1, RP1L1, PRB4, PRB1, MUC4, HS6ST3, MUC17, JAM2, ITGAD, IREB2, IQUB, CORO1B, CCDC121, AKAP2, ACAN and ACADL), and eight genes have not previously been described in gastric cancer (CCDC178, ARMC4, TUBB6, PLIN4, PKLR, PDZD2, DMBT1 and DAB1). Additionally, GPX4 and MPND, in the 19q13.3-13.4 region, are characterized as a novel fusion gene. This study disclosed novel biological markers and tumorigenic pathways that would predict the occurrence of peritoneal metastasis in gastric cancer. PMID:26330360

  9. The genome and transcriptome of the zoonotic hookworm Ancylostoma ceylanicum identify infection-specific gene families

    PubMed Central

    Schwarz, Erich M; Hu, Yan; Antoshechkin, Igor; Miller, Melanie M; Sternberg, Paul W; Aroian, Raffi V

    2015-01-01

    Hookworms infect over 400 million people, stunting and impoverishing them [1–3]. Sequencing hookworm genomes and finding which genes they express during infection should help in devising new drugs or vaccines against hookworms [4,5]. Unlike other hookworms, Ancylostoma ceylanicum infects both humans and other mammals, providing a laboratory model for hookworm disease [6,7]. We determined an A. ceylanicum genome sequence of 313 Mb, with transcriptomic data throughout infection showing expression of 30,738 genes. Approximately 900 genes were upregulated during early infection in vivo, including ASPRs, a cryptic subfamily of activation-associated secreted proteins (ASPs) [8]. Genes downregulated during early infection included ion channels and G protein–coupled receptors; this downregulation was observed in both parasitic and free-living nematodes. Later, at the onset of heavy blood feeding, C-lectin genes were upregulated along with genes for secreted clade V proteins (SCVPs), encoding a previously undescribed protein family. These findings provide new drug and vaccine targets and should help elucidate hookworm pathogenesis. PMID:25730766

  10. Heavy metals induce oxidative stress and genome-wide modulation in transcriptome of rice root.

    PubMed

    Dubey, Sonali; Shri, Manju; Misra, Prashant; Lakhwani, Deepika; Bag, Sumit Kumar; Asif, Mehar H; Trivedi, Prabodh Kumar; Tripathi, Rudro Deo; Chakrabarty, Debasis

    2014-06-01

    Industrial growth, ecological disturbances and agricultural practices have contaminated the soil and water with many harmful compounds, including heavy metals. These heavy metals affect growth and development of plants as well as cause severe human health hazards through food chain contamination. In the past, studies have been carried out to identify biochemical and molecular networks associated with heavy metal toxicity and uptake in plants. These studies suggested that most of the physiological and molecular processes affected by different heavy metals are similar to those affected by other abiotic stresses. To identify common and unique responses to different metals, we have studied biochemical and genome-wide modulation in the transcriptome of rice (IR-64 cultivar) root after exposure to cadmium (Cd), arsenate [As(V)], lead (Pb) and chromium [Cr(VI)] under hydroponic conditions. We observed that root tissue shows variable responses of the antioxidant enzyme system to different heavy metals. Genome-wide expression analysis suggests that a variable number of genes are differentially expressed in root in response to As(V), Cd, Pb and Cr(VI) stresses. In addition to unique genes, each heavy metal modulated expression of a large number of common genes. The study also identified cis-acting regions of the promoters which can be determinants for the modulated expression of the genes in response to different heavy metals. Our study advances understanding related to various processes and networks which might be responsible for heavy metal stresses, accumulation and detoxification. PMID:24553786

  11. Genomic and transcriptomic analysis of NDM-1 Klebsiella pneumoniae in spaceflight reveal mechanisms underlying environmental adaptability.

    PubMed

    Li, Jia; Liu, Fei; Wang, Qi; Ge, Pupu; Woo, Patrick C Y; Yan, Jinghua; Zhao, Yanlin; Gao, George F; Liu, Cui Hua; Liu, Changting

    2014-01-01

    The emergence and rapid spread of New Delhi Metallo-beta-lactamase-1 (NDM-1)-producing Klebsiella pneumoniae strains has caused great concern worldwide. To better understand the mechanisms underlying environmental adaptation of those highly drug-resistant K. pneumoniae strains, we took advantage of China's Shenzhou 10 spacecraft mission to conduct comparative genomic and transcriptomic analysis of an NDM-1 K. pneumoniae strain (ATCC BAA-2146) being cultivated under different conditions. The samples were recovered from semisolid medium placed on the ground (D strain), under simulated space conditions (M strain), or in the Shenzhou 10 spacecraft (T strain) for analysis. Our data revealed multiple variations underlying pathogen adaptation to different environments in terms of changes in morphology, H2O2 tolerance and biofilm formation ability, genomic stability and regulation of metabolic pathways. Additionally, we found a few non-coding RNAs to be differentially regulated. The results are helpful for better understanding the adaptive mechanisms of drug-resistant bacterial pathogens. PMID:25163721

  12. Genomic and transcriptomic analysis of NDM-1 Klebsiella pneumoniae in spaceflight reveal mechanisms underlying environmental adaptability

    PubMed Central

    Li, Jia; Liu, Fei; Wang, Qi; Ge, Pupu; Woo, Patrick C. Y.; Yan, Jinghua; Zhao, Yanlin; Gao, George F.; Liu, Cui Hua; Liu, Changting

    2014-01-01

    The emergence and rapid spread of New Delhi Metallo-beta-lactamase-1 (NDM-1)-producing Klebsiella pneumoniae strains has caused great concern worldwide. To better understand the mechanisms underlying environmental adaptation of those highly drug-resistant K. pneumoniae strains, we took advantage of China's Shenzhou 10 spacecraft mission to conduct comparative genomic and transcriptomic analysis of an NDM-1 K. pneumoniae strain (ATCC BAA-2146) being cultivated under different conditions. The samples were recovered from semisolid medium placed on the ground (D strain), under simulated space conditions (M strain), or in the Shenzhou 10 spacecraft (T strain) for analysis. Our data revealed multiple variations underlying pathogen adaptation to different environments in terms of changes in morphology, H2O2 tolerance and biofilm formation ability, genomic stability and regulation of metabolic pathways. Additionally, we found a few non-coding RNAs to be differentially regulated. The results are helpful for better understanding the adaptive mechanisms of drug-resistant bacterial pathogens. PMID:25163721

  13. Genomic, proteomic, and transcriptomic analysis of virulent and avirulent Rickettsia prowazekii reveals its adaptive mutation capabilities.

    PubMed

    Bechah, Yassina; El Karkouri, Khalid; Mediannikov, Oleg; Leroy, Quentin; Pelletier, Nicolas; Robert, Catherine; Médigue, Claudine; Mege, Jean-Louis; Raoult, Didier

    2010-05-01

    Rickettsia prowazekii, the agent of epidemic typhus, is an obligate intracellular bacterium that is transmitted to human beings by the body louse. Several strains that differ considerably in virulence are recognized, but the genetic basis for these variations has remained unknown since the initial description of the avirulent vaccine strain nearly 70 yr ago. We use a recently developed murine model of epidemic typhus and transcriptomic, proteomic, and genetic techniques to identify the factors associated with virulence. We identified four phenotypes of R. prowazekii that differed in virulence, associated with the up-regulation of antiapoptotic genes or the interferon I pathway in the host cells. Transcriptional and proteomic analyses of R. prowazekii surface protein expression and protein methylation varied with virulence. By sequencing a virulent strain and using comparative genomics, we found hotspots of mutations in homopolymeric tracts of poly(A) and poly(T) in eight genes in an avirulent strain that split and inactivated these genes. These included recO, putative methyltransferase, and exported protein. Passage of the avirulent Madrid E strain in cells or in experimental animals was associated with a cascade of gene reactivations, beginning with recO, that restored the virulent phenotype. An area of genomic plasticity appears to determine virulence in R. prowazekii and represents an example of adaptive mutation for this pathogen. PMID:20368341

  14. Genomic heterogeneity of historical gene flow between two species of newts inferred from transcriptome data.

    PubMed

    Stuglik, Michał T; Babik, Wiesław

    2016-07-01

    The role of gene flow in species formation is a major unresolved issue in speciation biology. Progress in this area requires information on the long-term patterns of gene flow between diverging species. Here, we used thousands of single-nucleotide polymorphisms derived from transcriptome resequencing and a method modeling the joint frequency spectrum of these polymorphisms to reconstruct patterns of historical gene flow between two Lissotriton newts: L. vulgaris (Lv) and L. montandoni (Lm). We tested several models of divergence including complete isolation and various scenarios of historical gene flow. The model of secondary contact received the highest support. According to this model, the species split from their common ancestor ca. 5.5 million years (MY) ago, evolved in isolation for ca. 2 MY, and have been exchanging genes for the last 3.5 MY. Demographic changes have been inferred in both species, with the current effective population size of ca. 0.7 million in Lv and 0.2 million in Lm. The postdivergence gene flow resulted in two-directional introgression which affected the genomes of both species, but was more pronounced from Lv to Lm. Interestingly, we found evidence for genomic heterogeneity of interspecific gene flow. This study demonstrates the complexity of long-term gene flow between distinct but incompletely reproductively isolated taxa whose divergence was initiated millions of years ago.

  15. Population genomic footprints of fine-scale differentiation between habitats in Mediterranean blue tits.

    PubMed

    Szulkin, M; Gagnaire, P-A; Bierne, N; Charmantier, A

    2016-01-01

    Linking population genetic variation to the spatial heterogeneity of the environment is of fundamental interest to evolutionary biology and ecology, in particular when phenotypic differences between populations are observed at biologically small spatial scales. Here, we applied restriction-site associated DNA sequencing (RAD-Seq) to test whether phenotypically differentiated populations of wild blue tits (Cyanistes caeruleus) breeding in a highly heterogeneous environment exhibit genetic structure related to habitat type. Using 12 106 SNPs in 197 individuals from deciduous and evergreen oak woodlands, we applied complementary population genomic analyses, which revealed that genetic variation is influenced by both geographical distance and habitat type. A fine-scale genetic differentiation supported by genome- and transcriptome-wide analyses was found within Corsica, between two adjacent habitats where blue tits exhibit marked differences in breeding time while nesting < 6 km apart. Using redundancy analysis (RDA), we show that genomic variation remains associated with habitat type when controlling for spatial and temporal effects. Finally, our results suggest that the observed patterns of genomic differentiation were not driven by a small proportion of highly differentiated loci, but rather emerged through a process such as habitat choice, which reduces gene flow between habitats across the entire genome. The pattern of genomic isolation-by-environment closely matches differentiation observed at the phenotypic level, thereby offering significant potential for future inference of phenotype-genotype associations in a heterogeneous environment.

  16. Transcriptomic signature to oxidative stress exposure at the time of embryonic genome activation in bovine blastocysts.

    PubMed

    Cagnone, Gael L M; Sirard, Marc-André

    2013-04-01

    In order to understand how in vitro culture affects embryonic quality, we analyzed survival and global gene expression in bovine blastocysts after exposure to increased oxidative stress conditions. Two pro-oxidant agents, one that acts extracellularly by promoting reactive oxygen species (ROS) production (0.01 mM 2,2'-azobis (2-amidinopropane) dihydrochloride [AAPH]) or another that acts intracellularly by inhibiting glutathione synthesis (0.4 mM buthionine sulfoximine [BSO]) were added separately to in vitro culture media from Day 3 (8-16-cell stage) onward. Transcriptomic analysis was then performed on resulting Day-7 blastocysts. In the literature, these two pro-oxidant conditions were shown to induce delayed degeneration in a proportion of Day-8 blastocysts. In our experiment, no morphological difference was visible, but AAPH tended to decrease the blastocyst rate while BSO significantly reduced it, indicating a differential impact on the surviving population. At the transcriptomic level, blastocysts that survived either pro-oxidant exposure showed oxidative stress and an inflammatory response (ARRB2), although AAPH induced higher disturbances in cellular homeostasis (SERPINE1). Functional genomics of the BSO profile, however, identified differential expression of genes related to glycine metabolism and energy metabolism (TPI1). These differential features might be indicative of pre-degenerative blastocysts (IGFBP7) in the AAPH population whereas BSO exposure would select the most viable individuals (TKDP1). Together, these results illustrate how oxidative disruption of pre-attachment development is associated with systematic up-regulation of several metabolic markers. Moreover, it indicates that a better capacity to survive anti-oxidant depletion may allow for the survival of blastocysts with a quieter metabolism after compaction.

  17. Genomics of Compositae crops: reference transcriptome assemblies and evidence of hybridization with wild relatives.

    PubMed

    Hodgins, Kathryn A; Lai, Zhao; Oliveira, Luiz O; Still, David W; Scascitelli, Moira; Barker, Michael S; Kane, Nolan C; Dempewolf, Hannes; Kozik, Alex; Kesseli, Richard V; Burke, John M; Michelmore, Richard W; Rieseberg, Loren H

    2014-01-01

    Although the Compositae harbours only two major food crops, sunflower and lettuce, many other species in this family are utilized by humans and have experienced various levels of domestication. Here, we have used next-generation sequencing technology to develop 15 reference transcriptome assemblies for Compositae crops or their wild relatives. These data allow us to gain insight into the evolutionary and genomic consequences of plant domestication. Specifically, we performed Illumina sequencing of Cichorium endivia, Cichorium intybus, Echinacea angustifolia, Iva annua, Helianthus tuberosus, Dahlia hybrida, Leontodon taraxacoides and Glebionis segetum, as well as 454 sequencing of Guizotia scabra, Stevia rebaudiana, Parthenium argentatum and Smallanthus sonchifolius. Illumina reads were assembled using Trinity, and 454 reads were assembled using MIRA and CAP3. We evaluated the coverage of the transcriptomes using BLASTX analysis of a set of ultra-conserved orthologs (UCOs) and recovered most of these genes (88-98%). We found a correlation between contig length and read length for the 454 assemblies, and greater contig lengths for the 454 compared with the Illumina assemblies. This suggests that longer reads can aid in the assembly of more complete transcripts. Finally, we compared the divergence of orthologs at synonymous sites (Ks) between Compositae crops and their wild relatives and found greater divergence when the progenitors were self-incompatible. We also found greater divergence between pairs of taxa that had some evidence of postzygotic isolation. For several more distantly related congeners, such as chicory and endive, we identified a signature of introgression in the distribution of Ks values. PMID:24103297

  18. Genome-wide transcriptome analysis of expression in rice seedling roots in response to supplemental nitrogen.

    PubMed

    Chandran, Anil Kumar Nalini; Priatama, Ryza A; Kumar, Vikranth; Xuan, Yuanhu; Je, Byoung Il; Kim, Chul Min; Jung, Ki-Hong; Han, Chang-Deok

    2016-08-01

    Nitrogen (N) is the most important macronutrient for plant growth and grain yields. For rice crops, nitrate and ammonium are the major N sources. To explore the genomic responses to ammonium supplements in rice roots, we used 17-day-old seedlings grown in the absence of external N that were then exposed to 0.5 mM (NH4)2SO4 for 3 h. Transcriptomic profiles were examined by microarray experiments. In all, 634 genes were up-regulated at least two-fold by the N-supplement when compared with expression in roots from untreated control plants. Gene Ontology (GO) enrichment analysis revealed that those upregulated genes are associated with 23 GO terms. Among them, metabolic processes for diverse amino acids (i.e., aspartate, threonine, tryptophan, glutamine, l-phenylalanine, and thiamin) as well as nitrogen compounds are highly over-represented, demonstrating that our selected genes are suitable for studying the N-response in roots. This enrichment analysis also indicated that nitrogen is closely linked to diverse transporter activities by primary metabolites, including proteins (amino acids), lipids, and carbohydrates, and is associated with carbohydrate catabolism and cell wall organization. Integration of results from omics analysis of metabolic pathways and transcriptome data using the MapMan tool suggested that the TCA cycle and pathway for mitochondrial electron transport are co-regulated when rice roots are exposed to ammonium. We also investigated the expression of N-responsive marker genes by performing a comparative analysis with root samples from plants grown under different NH4(+) treatments. The diverse responses to such treatment provide useful insight into the global changes related to the shift from an N-deficiency to an enhanced N-supply in rice, a model crop plant. PMID:27340859

  19. Genomics and transcriptomics characterization of genes expressed during postharvest at 4°C by the edible basidiomycete Pleurotus ostreatus.

    PubMed

    Ramírez, Lucía; Oguiza, José Antonio; Pérez, Gúmer; Lavín, José Luis; Omarini, Alejandra; Santoyo, Francisco; Alfaro, Manuel; Castanera, Raúl; Parenti, Alejandra; Muguerza, Elaia; Pisabarro, Antonio G

    2011-06-01

    Pleurotus ostreatus is an industrially cultivated basidiomycete with nutritional and environmental applications. Its genome, which was sequenced by the Joint Genome Institute, has become a model for lignin degradation and for fungal genomics and transcriptomics studies. The complete P. ostreatus genome contains 35 Mbp organized in 11 chromosomes, and two different haploid genomes have been individually sequenced. In this work, genomics and transcriptomics approaches were employed in the study of P. ostreatus under different physiological conditions. Specifically, we analyzed a collection of expressed sequence tags (EST) obtained from cut fruit bodies that had been stored at 4°C for 7 days (postharvest conditions). Studies of the 253 expressed clones that had been automatically and manually annotated provided a detailed picture of the life characteristics of the self-sustained fruit bodies. The results suggested a complex metabolism in which autophagy, RNA metabolism, and protein and carbohydrate turnover are increased. Genes involved in environment sensing and morphogenesis were expressed under these conditions. The data improve our understanding of the decay process in postharvest mushrooms and highlight the use of high-throughput techniques to construct models of living organisms subjected to different environmental conditions. PMID:22069155

  20. Coding SNPs as intrinsic markers for sample tracking in large-scale transcriptome studies

    PubMed Central

    Xu, Weihong; Gao, Hong; Seok, Junhee; Wilhelmy, Julie; Mindrinos, Michael N.; Davis, Ronald W.; Xiao, Wenzhong

    2014-01-01

    Large-scale transcriptome profiling in clinical studies often involves assaying multiple samples of a patient to monitor disease progression, treatment effect, and host response in multiple tissues. Such profiling is prone to human error, which often results in mislabeled samples. Here, we present a method to detect mislabeled sample outliers using coding single nucleotide polymorphisms (cSNPs) specifically designed on the microarray and demonstrate that the mislabeled samples can be efficiently identified by either simple clustering of allele-specific expression scores or Mahalanobis distance-based outlier detection method. Based on our results, we recommend the incorporation of cSNPs into future transcriptome array designs as intrinsic markers for sample tracking. PMID:22668418
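
    The Mahalanobis distance-based outlier detection mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional scores, the function names, the example values and the threshold of 3.0 are all hypothetical, chosen only to show how a sample whose cSNP-derived scores lie far from a patient's other samples could be flagged.

```python
from math import sqrt


def mean_and_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of a 2-D point from a distribution."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # Closed-form inverse of [[sxx, sxy], [sxy, syy]] applied to (dx, dy).
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(d2)


def flag_outliers(reference, candidates, threshold=3.0):
    """Indices of candidates farther than `threshold` from the reference set.

    `threshold` is an illustrative cutoff, not a value from the paper.
    """
    mean, cov = mean_and_cov_2d(reference)
    return [i for i, p in enumerate(candidates)
            if mahalanobis_2d(p, mean, cov) > threshold]


# Hypothetical allele-specific expression scores at two cSNPs.
reference = [(0.9, 0.1), (1.0, 0.0), (1.1, 0.2), (0.95, 0.15), (1.05, 0.05)]
candidates = [(1.0, 0.1), (0.1, 0.9)]
print(flag_outliers(reference, candidates))
```

    Unlike a plain Euclidean distance, the Mahalanobis distance scales each direction by the variance and correlation of the reference scores, so a deviation along a tightly constrained direction counts for more.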

  1. Coding SNPs as intrinsic markers for sample tracking in large-scale transcriptome studies.

    PubMed

    Xu, Weihong; Gao, Hong; Seok, Junhee; Wilhelmy, Julie; Mindrinos, Michael N; Davis, Ronald W; Xiao, Wenzhong

    2012-06-01

    Large-scale transcriptome profiling in clinical studies often involves assaying multiple samples of a patient to monitor disease progression, treatment effect, and host response in multiple tissues. Such profiling is prone to human error, which often results in mislabeled samples. Here, we present a method to detect mislabeled sample outliers using coding single nucleotide polymorphisms (cSNPs) specifically designed on the microarray and demonstrate that the mislabeled samples can be efficiently identified by either simple clustering of allele-specific expression scores or Mahalanobis distance-based outlier detection method. Based on our results, we recommend the incorporation of cSNPs into future transcriptome array designs as intrinsic markers for sample tracking.

  2. Genomic and transcriptome profiling identified both human and HBV genetic variations and their interactions in Chinese hepatocellular carcinoma.

    PubMed

    Dong, Hua; Qian, Ziliang; Zhang, Lan; Chen, Yunqin; Ren, Zhenggang; Ji, Qunsheng

    2015-12-01

    Interaction between HBV and host genome integrations in hepatocellular carcinoma (HCC) development is a complex process and the mechanism is still unclear. Here we describe in detail the quality controls and data mining of aCGH and transcriptome sequencing data on 50 HCC samples from Chinese patients, published by Dong et al. (2015) (GEO#: GSE65486). In addition to the HBV-MLL4 integration discovered, we also investigated the genetic aberrations of HBV and host genes as well as their genetic interactions. We reported human genome copy number changes and frequent transcriptome variations (e.g. TP53 and CTNNB1 mutations, and especially MLL family mutations) in this cohort of patients. For HBV genotype C, we identified a novel linkage disequilibrium region covering HBV replication regulatory elements, including basal core promoter, DR1, epsilon and poly-A regions, which is associated with HBV core antigen over-expression and almost exclusive to HBV-MLL4 integration.

  3. From genes to milk: genomic organization and epigenetic regulation of the mammary transcriptome.

    PubMed

    Lemay, Danielle G; Pollard, Katherine S; Martin, William F; Freeman Zadrowski, Courtneay; Hernandez, Joseph; Korf, Ian; German, J Bruce; Rijnkels, Monique

    2013-01-01

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin state contributes to the co-regulation of gene neighborhoods. The mammary gland represents a unique evolutionary model, due to its recent appearance, in the context of vertebrate genomes. An understanding of how the mammary gland is regulated to produce milk is also of biomedical and agricultural importance for human lactation and dairying. Here, we integrate epigenomic and transcriptomic data to develop a comprehensive regulatory model. Neighborhoods of mammary-expressed genes were determined using expression data derived from pregnant and lactating mice and a neighborhood scoring tool, G-NEST. Regions of open and closed chromatin were identified by ChIP-Seq of histone modifications H3K36me3, H3K4me2, and H3K27me3 in the mouse mammary gland and liver tissue during lactation. We found that neighborhoods of genes in regions of uniquely active chromatin in the lactating mammary gland, compared with liver tissue, were extremely rare. Rather, genes in most neighborhoods were suppressed during lactation as reflected in their expression levels and their location in regions of silenced chromatin. Chromatin silencing was largely shared between the liver and mammary gland during lactation, and what distinguished the mammary gland was mainly a small tissue-specific repertoire of isolated, expressed genes. These findings suggest that an advantage of the neighborhood organization is in the collective repression of groups of genes via a shared mechanism of chromatin repression. Genes essential to the mammary gland's uniqueness are isolated from neighbors, and likely have less tolerance for variation in expression, properties they share with genes responsible for an organism's survival.

  4. RUMINANT NUTRITION SYMPOSIUM: Use of genomics and transcriptomics to identify strategies to lower ruminal methanogenesis.

    PubMed

    McAllister, T A; Meale, S J; Valle, E; Guan, L L; Zhou, M; Kelly, W J; Henderson, G; Attwood, G T; Janssen, P H

    2015-04-01

    Globally, methane (CH4) emissions account for 40% to 45% of greenhouse gas emissions from ruminant livestock, with over 90% of these emissions arising from enteric fermentation. Reduction of carbon dioxide to CH4 is critical for efficient ruminal fermentation because it prevents the accumulation of reducing equivalents in the rumen. Methanogens exist in a symbiotic relationship with rumen protozoa and fungi and within biofilms associated with feed and the rumen wall. Genomics and transcriptomics are playing an increasingly important role in defining the ecology of ruminal methanogenesis and identifying avenues for its mitigation. Metagenomic approaches have provided information on changes in abundances as well as the species composition of the methanogen community among ruminants that vary naturally in their CH4 emissions, their feed efficiency, and their response to CH4 mitigators. Sequencing the genomes of rumen methanogens has provided insight into surface proteins that may prove useful in the development of vaccines and has allowed assembly of biochemical pathways for use in chemogenomic approaches to lowering ruminal CH4 emissions. Metagenomics and metatranscriptomic analysis of entire rumen microbial communities are providing new perspectives on how methanogens interact with other members of this ecosystem and how these relationships may be altered to reduce methanogenesis. Identification of community members that produce antimethanogen agents that either inhibit or kill methanogens could lead to the identification of new mitigation approaches. Discovery of a lytic archaeophage that specifically lyses methanogens is one such example. Efforts in using genomic data to alter methanogenesis have been hampered by a lack of sequence information that is specific to the microbial community of the rumen. Programs such as Hungate1000 and the Global Rumen Census are increasing the breadth and depth of our understanding of global ruminal microbial communities, steps that

  5. Genomic and transcriptomic insights into the thermo-regulated biosynthesis of validamycin in Streptomyces hygroscopicus 5008

    PubMed Central

    2012-01-01

    Background Streptomyces hygroscopicus 5008 has been used for the production of the antifungal validamycin/jinggangmycin for more than 40 years. A high yield of validamycin is achieved by culturing the strain at 37°C, rather than at 30°C for normal growth and sporulation. The mechanism(s) of its thermo-regulated biosynthesis was largely unknown. Results The 10,383,684-bp genome of strain 5008 was completely sequenced and composed of a linear chromosome, a 164.57-kb linear plasmid, and a 73.28-kb circular plasmid. Compared with other Streptomyces genomes, the chromosome of strain 5008 has a smaller core region and shorter terminal inverted repeats, encodes more α/β hydrolases, major facilitator superfamily transporters, and Mg2+/Mn2+-dependent regulatory phosphatases. Transcriptomic analysis revealed that the expression of 7.5% of coding sequences was increased at 37°C, including biosynthetic genes for validamycin and other three secondary metabolites. At 37°C, a glutamate dehydrogenase was transcriptionally up-regulated, and further proved its involvement in validamycin production by gene replacement. Moreover, efficient synthesis and utilization of intracellular glutamate were noticed in strain 5008 at 37°C, revealing glutamate as the nitrogen source for validamycin biosynthesis. Furthermore, a SARP-family regulatory gene with enhanced transcription at 37°C was identified and confirmed to be positively involved in the thermo-regulation of validamycin production by gene inactivation and transcriptional analysis. Conclusions Strain 5008 seemed to have evolved with specific genomic components to facilitate the thermo-regulated validamycin biosynthesis. The data obtained here will facilitate future studies for validamycin yield improvement and industrial bioprocess optimization. PMID:22827618

  7. Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane

    PubMed Central

    Benevenuto, Juliana; Peters, Leila P.; Carvalho, Giselle; Palhares, Alessandra; Quecine, Maria C.; Nunes, Filipe R. S.; Kmit, Maria C. P.; Wai, Alvan; Hausner, Georg; Aitken, Karen S.; Berkman, Paul J.; Fraser, James A.; Moolhuijzen, Paula M.; Coutinho, Luiz L.; Creste, Silvana; Vieira, Maria L. C.; Kitajima, João P.; Monteiro-Vitorello, Claudia B.

    2015-01-01

    Sporisorium scitamineum is a biotrophic fungus responsible for sugarcane smut, a disease found worldwide. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis with previously available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of an estimated 130 copies. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNAseq data was also used to reassure gene predictions. PMID:26065709

  8. Comparative Genomic and Transcriptomic Analyses Reveal Habitat Differentiation and Different Transcriptional Responses during Pectin Metabolism in Alishewanella Species

    PubMed Central

    Jung, Jaejoon

    2013-01-01

    Alishewanella species are expected to have high adaptability to diverse environments because they are isolated from different natural habitats. To investigate how the evolutionary history of Alishewanella species is reflected in their genomes, we performed comparative genomic and transcriptomic analyses of A. jeotgali, A. aestuarii, and A. agri, which were isolated from fermented seafood, tidal flat sediment, and soil, respectively. Genomic islands with variable GC contents indicated that invasion of prophage and transposition events occurred in A. jeotgali and A. agri but not in A. aestuarii. Habitat differentiation of A. agri from a marine environment to a terrestrial environment was proposed because the species-specific genes of A. agri were similar to those of soil bacteria, whereas those of A. jeotgali and A. aestuarii were more closely related to marine bacteria. Comparative transcriptomic analysis with pectin as a sole carbon source revealed different transcriptional responses in Alishewanella species, especially in oxidative stress-, methylglyoxal detoxification-, membrane maintenance-, and protease/chaperone activity-related genes. Transcriptomic and experimental data demonstrated that A. agri had a higher pectin degradation rate and more resistance to oxidative stress under pectin-amended conditions than the other 2 Alishewanella species. However, expression patterns of genes in the pectin metabolic pathway and of glyoxylate bypass genes were similar among all 3 Alishewanella species. Our comparative genomic and transcriptomic data revealed that Alishewanella species have evolved through horizontal gene transfer and habitat differentiation and that pectin degradation pathways in Alishewanella species are highly conserved, although stress responses of each Alishewanella species differed under pectin culture conditions. PMID:23934491

  9. Genome-Scale Variation of Tubeworm Symbionts

    NASA Astrophysics Data System (ADS)

    Robidart, J.; Felbeck, H.

    2005-12-01

    Hydrothermal vent tubeworms are completely dependent on their bacterial symbionts for nutrition. Despite this dependency, many studies have concluded that bacterial symbionts are acquired anew from the environment every generation, rather than through the more reliable mode of symbiont transmission directly from parent to offspring. Ribosomal 16S sequences have shown little variation of symbiont phylogeny from worm to worm, but higher-resolution genome-scale analyses have found that there is genomic heterogeneity between symbionts from worms in different environments. What genes can be "spared" while still resulting in an intact symbiosis? Have symbionts from one environment gained physiological capabilities that make them more fit in that environment? In order to answer these questions, subtractive hybridization was used on symbionts of Riftia pachyptila tubeworms from different environments to gain insight into which genes are present in one symbiont and absent in the other. Many genes were found to be unique to each symbiont and these results will be presented. This technique will be applied to answer many fundamental questions regarding microbial symbiont evolution to a specific physico-chemical environment, to a different host species, and more.

  10. Dynamic reorganization of the AC16 cardiomyocyte transcriptome in response to TNFα signaling revealed by integrated genomic analyses

    PubMed Central

    2014-01-01

    Background Defining cell type-specific transcriptomes in mammals can be challenging, especially for unannotated regions of the genome. We have developed an analytical pipeline called groHMM for annotating primary transcripts using global nuclear run-on sequencing (GRO-seq) data. Herein, we use this pipeline to characterize the transcriptome of an immortalized adult human ventricular cardiomyocyte cell line (AC16) in response to signaling by tumor necrosis factor alpha (TNFα), which is controlled in part by NF-κB, a key transcriptional regulator of inflammation. A unique aspect of this work is the use of the RNA polymerase II (Pol II) inhibitor α-amanitin, which we used to define a set of RNA polymerase I and III (Pol I and Pol III) transcripts. Results Using groHMM, we identified ~30,000 coding and non-coding transcribed regions in AC16 cells, which includes a set of unique Pol I and Pol III primary transcripts. Many of these transcripts have not been annotated previously, including enhancer RNAs originating from NF-κB binding sites. In addition, we observed that AC16 cells rapidly and dynamically reorganize their transcriptomes in response to TNFα stimulation in an NF-κB-dependent manner, switching from a basal state to a proinflammatory state affecting a spectrum of cardiac-associated protein-coding and non-coding genes. Moreover, we observed distinct Pol II dynamics for up- and downregulated genes, with a rapid release of Pol II into productive elongation for TNFα-stimulated genes. As expected, the TNFα-induced changes in the AC16 transcriptome resulted in corresponding changes in cognate mRNA and protein levels in a similar manner, but with delayed kinetics. Conclusions Our studies illustrate how computational genomics can be used to characterize the signal-regulated transcriptome in biologically relevant cell types, providing new information about how the human genome is organized, transcribed and regulated. In addition, they show how α-amanitin can

  11. Integrated genomic and transcriptomic analysis of human brain metastases identifies alterations of potential clinical significance.

    PubMed

    Saunus, Jodi M; Quinn, Michael C J; Patch, Ann-Marie; Pearson, John V; Bailey, Peter J; Nones, Katia; McCart Reed, Amy E; Miller, David; Wilson, Peter J; Al-Ejeh, Fares; Mariasegaram, Mythily; Lau, Queenie; Withers, Teresa; Jeffree, Rosalind L; Reid, Lynne E; Da Silva, Leonard; Matsika, Admire; Niland, Colleen M; Cummings, Margaret C; Bruxner, Timothy J C; Christ, Angelika N; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wani, Shivangi; Anderson, Matthew J; Fink, J Lynn; Holmes, Oliver; Kazakoff, Stephen; Leonard, Conrad; Newell, Felicity; Taylor, Darrin; Waddell, Nick; Wood, Scott; Xu, Qinying; Kassahn, Karin S; Narayanan, Vairavan; Taib, Nur Aishah; Teo, Soo-Hwang; Chow, Yock Ping; kConFab; Jat, Parmjit S; Brandner, Sebastian; Flanagan, Adrienne M; Khanna, Kum Kum; Chenevix-Trench, Georgia; Grimmond, Sean M; Simpson, Peter T; Waddell, Nicola; Lakhani, Sunil R

    2015-11-01

    Treatment options for patients with brain metastases (BMs) have limited efficacy and the mortality rate is virtually 100%. Targeted therapy is critically under-utilized, and our understanding of mechanisms underpinning metastatic outgrowth in the brain is limited. To address these deficiencies, we investigated the genomic and transcriptomic landscapes of 36 BMs from breast, lung, melanoma and oesophageal cancers, using DNA copy-number analysis and exome- and RNA-sequencing. The key findings were as follows. (a) Identification of novel candidates with possible roles in BM development, including the significantly mutated genes DSC2, ST7, PIK3R1 and SMC5, and the DNA repair, ERBB-HER signalling, axon guidance and protein kinase-A signalling pathways. (b) Mutational signature analysis was applied to successfully identify the primary cancer type for two BMs with unknown origins. (c) Actionable genomic alterations were identified in 31/36 BMs (86%); in one case we retrospectively identified ERBB2 amplification representing apparent HER2 status conversion, then confirmed progressive enrichment for HER2-positivity across four consecutive metastatic deposits by IHC and SISH, resulting in the deployment of HER2-targeted therapy for the patient. (d) In the ERBB/HER pathway, ERBB2 expression correlated with ERBB3 (r² = 0.496; p < 0.0001) and HER3 and HER4 were frequently activated in an independent cohort of 167 archival BM from seven primary cancer types: 57.6% and 52.6% of cases were phospho-HER3(Y1222) or phospho-HER4(Y1162) membrane-positive, respectively. The HER3 ligands NRG1/2 were barely detectable by RNAseq, with NRG1 (8p12) genomic loss in 63.6% breast cancer-BMs, suggesting a microenvironmental source of ligand. In summary, this is the first study to characterize the genomic landscapes of BM. The data revealed novel candidates, potential clinical applications for genomic profiling of resectable BMs, and highlighted the possibility of therapeutically targeting

  12. Identification of Candidate Adherent-Invasive E. coli Signature Transcripts by Genomic/Transcriptomic Analysis

    PubMed Central

    Zhang, Yuanhao; Rowehl, Leahana; Krumsiek, Julia M.; Orner, Erika P.; Shaikh, Nurmohammad; Tarr, Phillip I.; Sodergren, Erica; Weinstock, George M.; Boedeker, Edgar C.; Xiong, Xuejian; Parkinson, John; Frank, Daniel N.; Li, Ellen; Gathungu, Grace

    2015-01-01

    quantitative polymerase chain reaction assays for 6 genes were conducted on fecal and ileal RNA samples from 22 patients with inflammatory bowel disease (IBD) and 32 patients without IBD (non-IBD). The expression of Cas loci was detected in a higher proportion of CD than non-IBD fecal and ileal RNA samples (p < 0.05). These results support a comparative genomic/transcriptomic approach towards identifying candidate AIEC signature transcripts. PMID:26125937

  13. Analyses of transcriptome sequences reveal multiple ancient large-scale duplication events in the ancestor of Sphagnopsida (Bryophyta).

    PubMed

    Devos, Nicolas; Szövényi, Péter; Weston, David J; Rothfels, Carl J; Johnson, Matthew G; Shaw, A Jonathan

    2016-07-01

    The goal of this research was to investigate whether there has been a whole-genome duplication (WGD) in the ancestry of Sphagnum (peatmoss) or the class Sphagnopsida, and to determine if the timing of any such duplication(s) and patterns of paralog retention could help explain the rapid radiation and current ecological dominance of peatmosses. RNA sequencing (RNA-seq) data were generated for nine taxa in Sphagnopsida (Bryophyta). Analyses of frequency plots for synonymous substitutions per synonymous site (Ks) between paralogous gene pairs and reconciliation of 578 gene trees were conducted to assess evidence of large-scale or genome-wide duplication events in each transcriptome. Both Ks frequency plots and gene tree-based analyses indicate multiple duplication events in the history of the Sphagnopsida. The most recent WGD event predates divergence of Sphagnum from the two other genera of Sphagnopsida. Duplicate retention is highly variable across species, which might be best explained by local adaptation. Our analyses indicate that the last WGD could have been an important factor underlying the diversification of peatmosses and facilitated their rise to ecological dominance in peatlands. The timing of the duplication events and their significance in the evolutionary history of peat mosses are discussed. PMID:26900928

  14. Analyses of transcriptome sequences reveal multiple ancient large-scale duplication events in the ancestor of Sphagnopsida (Bryophyta).

    PubMed

    Devos, Nicolas; Szövényi, Péter; Weston, David J; Rothfels, Carl J; Johnson, Matthew G; Shaw, A Jonathan

    2016-07-01

    The goal of this research was to investigate whether there has been a whole-genome duplication (WGD) in the ancestry of Sphagnum (peatmoss) or the class Sphagnopsida, and to determine if the timing of any such duplication(s) and patterns of paralog retention could help explain the rapid radiation and current ecological dominance of peatmosses. RNA sequencing (RNA-seq) data were generated for nine taxa in Sphagnopsida (Bryophyta). Analyses of frequency plots for synonymous substitutions per synonymous site (Ks) between paralogous gene pairs and reconciliation of 578 gene trees were conducted to assess evidence of large-scale or genome-wide duplication events in each transcriptome. Both Ks frequency plots and gene tree-based analyses indicate multiple duplication events in the history of the Sphagnopsida. The most recent WGD event predates divergence of Sphagnum from the two other genera of Sphagnopsida. Duplicate retention is highly variable across species, which might be best explained by local adaptation. Our analyses indicate that the last WGD could have been an important factor underlying the diversification of peatmosses and facilitated their rise to ecological dominance in peatlands. The timing of the duplication events and their significance in the evolutionary history of peat mosses are discussed.

  15. Defining the RNA polymerase III transcriptome: Genome-wide localization of the RNA polymerase III transcription machinery in human cells

    PubMed Central

    Canella, Donatella; Praz, Viviane; Reina, Jaime H.; Cousin, Pascal; Hernandez, Nouria

    2010-01-01

    Our view of the RNA polymerase III (Pol III) transcription machinery in mammalian cells arises mostly from studies of the RN5S (5S) gene, the Ad2 VAI gene, and the RNU6 (U6) gene, as paradigms for genes with type 1, 2, and 3 promoters. Recruitment of Pol III onto these genes requires prior binding of well-characterized transcription factors. Technical limitations in dealing with repeated genomic units, typically found at mammalian Pol III genes, have so far hampered genome-wide studies of the Pol III transcription machinery and transcriptome. We have localized, genome-wide, Pol III and some of its transcription factors. Our results reveal broad usage of the known Pol III transcription machinery and define a minimal Pol III transcriptome in dividing IMR90hTert fibroblasts. This transcriptome consists of some 500 actively transcribed genes including a few dozen candidate novel genes, of which we confirmed nine as Pol III transcription units by additional methods. It does not contain any of the microRNA genes previously described as transcribed by Pol III, but reveals two other microRNA genes, MIR886 (hsa-mir-886) and MIR1975 (RNY5, hY5, hsa-mir-1975), which are genuine Pol III transcription units. PMID:20413673

  16. Comparative description of ten transcriptomes of newly sequenced invertebrates and efficiency estimation of genomic sampling in non-model taxa

    PubMed Central

    2012-01-01

    Introduction Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. Results cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature. Conclusions We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable

  17. Genomic analysis of host - Peste des petits ruminants vaccine viral transcriptome uncovers transcription factors modulating immune regulatory pathways.

    PubMed

    Manjunath, Siddappa; Kumar, Gandham Ravi; Mishra, Bishnu Prasad; Mishra, Bina; Sahoo, Aditya Prasad; Joshi, Chaitanya G; Tiwari, Ashok K; Rajak, Kaushal Kishore; Janga, Sarath Chandra

    2015-01-01

    Peste des petits ruminants (PPR) is an acute transboundary viral disease of economic importance, affecting goats and sheep. Mass vaccination programs around the world resulted in the decline of PPR outbreaks. Sungri 96 is a live attenuated vaccine, widely used in Northern India against PPR. This vaccine virus, isolated from goat, works efficiently in both sheep and goats. Global gene expression changes under PPR vaccine virus infection are not yet well defined. Therefore, in this study we investigated the host-vaccine virus interactions by infecting the peripheral blood mononuclear cells isolated from goat with PPRV (Sungri 96 vaccine virus), to quantify the global changes in the transcriptomic signature by RNA-sequencing. The viral genome of Sungri 96 vaccine virus was assembled from the PPRV-infected transcriptome, confirming the infection and demonstrating the feasibility of building a complete non-host genome from the blood transcriptome. Comparison of the infected transcriptome with the control transcriptome revealed 985 differentially expressed genes. Functional analysis showed enrichment of immune regulatory pathways under PPRV infection. Key genes involved in immune system regulation, spliceosomal and apoptotic pathways were identified to be dysregulated. Network analysis revealed that the protein-protein interaction network among differentially expressed genes is significantly disrupted in the infected state. Several genes encoding TFs that govern immune regulatory pathways were identified to co-regulate the differentially expressed genes. These data provide insights into the host-PPRV vaccine virus interactome for the first time. Our findings suggested dysregulation of immune regulatory pathways and genes encoding Transcription Factors (TFs) that govern these pathways in response to viral infection. PMID:25827022

  18. Large-Scale Transcriptome Analysis in Faba Bean (Vicia faba L.) under Ascochyta fabae Infection

    PubMed Central

    Ocaña, Sara; Seoane, Pedro; Bautista, Rocio; Palomino, Carmen; Claros, Gonzalo M.; Torres, Ana M.; Madrid, Eva

    2015-01-01

    Faba bean is an important food crop worldwide. However, progress in faba bean genomics lags far behind that of model systems due to limited availability of genetic and genomic information. Using the Illumina platform, the faba bean transcriptome from leaves of two lines (29H and Vf136) subjected to Ascochyta fabae infection has been characterized. De novo transcriptome assembly provided a total of 39,185 different transcripts that were functionally annotated, and among these, 13,266 were assigned to gene ontology against Arabidopsis. Quality of the assembly was validated by RT-qPCR amplification of selected transcripts differentially expressed. Comparison of faba bean transcripts with those of better-characterized plant genomes such as Arabidopsis thaliana, Medicago truncatula and Cicer arietinum revealed a sequence similarity of 68.3%, 72.8% and 81.27%, respectively. Moreover, 39,060 single nucleotide polymorphisms (SNPs) and 3,669 InDels were identified for genotyping applications. Mapping of the sequence reads generated onto the assembled transcripts showed that 393 and 457 transcripts were overexpressed in the resistant (29H) and susceptible genotype (Vf136), respectively. Transcripts involved in plant-pathogen interactions such as leucine rich proteins (LRR) or plant growth regulators involved in plant adaptation to abiotic and biotic stresses were found to be differently expressed in the resistant line. The results reported here represent the most comprehensive transcript database developed so far in faba bean, providing valuable information that could be used to gain insight into the pathways involved in the resistance mechanism against A. fabae and to identify potential resistance genes to be further used in marker assisted selection. PMID:26267359

  19. Large-Scale Transcriptome Analysis in Faba Bean (Vicia faba L.) under Ascochyta fabae Infection.

    PubMed

    Ocaña, Sara; Seoane, Pedro; Bautista, Rocio; Palomino, Carmen; Claros, Gonzalo M; Torres, Ana M; Madrid, Eva

    2015-01-01

    Faba bean is an important food crop worldwide. However, progress in faba bean genomics lags far behind that of model systems due to limited availability of genetic and genomic information. Using the Illumina platform, the faba bean transcriptome from leaves of two lines (29H and Vf136) subjected to Ascochyta fabae infection has been characterized. De novo transcriptome assembly provided a total of 39,185 different transcripts that were functionally annotated, and among these, 13,266 were assigned to gene ontology against Arabidopsis. Quality of the assembly was validated by RT-qPCR amplification of selected transcripts differentially expressed. Comparison of faba bean transcripts with those of better-characterized plant genomes such as Arabidopsis thaliana, Medicago truncatula and Cicer arietinum revealed a sequence similarity of 68.3%, 72.8% and 81.27%, respectively. Moreover, 39,060 single nucleotide polymorphisms (SNPs) and 3,669 InDels were identified for genotyping applications. Mapping of the sequence reads generated onto the assembled transcripts showed that 393 and 457 transcripts were overexpressed in the resistant (29H) and susceptible genotype (Vf136), respectively. Transcripts involved in plant-pathogen interactions such as leucine rich proteins (LRR) or plant growth regulators involved in plant adaptation to abiotic and biotic stresses were found to be differently expressed in the resistant line. The results reported here represent the most comprehensive transcript database developed so far in faba bean, providing valuable information that could be used to gain insight into the pathways involved in the resistance mechanism against A. fabae and to identify potential resistance genes to be further used in marker assisted selection.

  20. The Genomics, Epigenomics, and Transcriptomics of HPV-Associated Oropharyngeal Cancer--Understanding the Basis of a Rapidly Evolving Disease.

    PubMed

    Lechner, M; Fenton, T R

    2016-01-01

    Human papillomavirus (HPV) has been shown to represent a major independent risk factor for head and neck squamous cell cancer, in particular for oropharyngeal carcinoma. This type of cancer is rapidly evolving in the Western world, with rising trends particularly in the young, and represents a distinct epidemiological, clinical, and molecular entity. It is the aim of this review to give a detailed description of genomic, epigenomic, transcriptomic, and posttranscriptional changes that underlie the phenotype of this deadly disease. The review will also link these changes and examine what is known about the interactions between the host genome and viral genome, and investigate changes specific for the viral genome. These data are then integrated into an updated model of HPV-induced head and neck carcinogenesis.

  1. Comparative and Transcriptome Analyses Uncover Key Aspects of Coding- and Long Noncoding RNAs in Flatworm Mitochondrial Genomes

    PubMed Central

    Ross, Eric; Blair, David; Guerrero-Hernández, Carlos; Sánchez Alvarado, Alejandro

    2016-01-01

    Exploiting the conservation of various features of mitochondrial genomes has been instrumental in resolving phylogenetic relationships. Despite extensive sequence evidence, it has not previously been possible to conclusively resolve some key aspects of flatworm mitochondrial genomes, including generally conserved traits, such as start codons, noncoding regions, the full complement of tRNAs, and whether ATP8 is, or is not, encoded by this extranuclear genome. In an effort to address these difficulties, we sought to determine the mitochondrial transcriptomes and genomes of sexual and asexual taxa of freshwater triclads, a group previously poorly represented in flatworm mitogenomic studies. We have discovered evidence for an alternative start codon, an extended cox1 gene, a previously undescribed conserved open reading frame, long noncoding RNAs, and a highly conserved gene order across the large evolutionary distances represented within the triclads. Our findings contribute to the expansion and refinement of mitogenomics to address evolutionary issues in this diverse group of animals. PMID:26921295

  2. Comparative and Transcriptome Analyses Uncover Key Aspects of Coding- and Long Noncoding RNAs in Flatworm Mitochondrial Genomes.

    PubMed

    Ross, Eric; Blair, David; Guerrero-Hernández, Carlos; Sánchez Alvarado, Alejandro

    2016-01-01

    Exploiting the conservation of various features of mitochondrial genomes has been instrumental in resolving phylogenetic relationships. Despite extensive sequence evidence, it has not previously been possible to conclusively resolve some key aspects of flatworm mitochondrial genomes, including generally conserved traits, such as start codons, noncoding regions, the full complement of tRNAs, and whether ATP8 is, or is not, encoded by this extranuclear genome. In an effort to address these difficulties, we sought to determine the mitochondrial transcriptomes and genomes of sexual and asexual taxa of freshwater triclads, a group previously poorly represented in flatworm mitogenomic studies. We have discovered evidence for an alternative start codon, an extended cox1 gene, a previously undescribed conserved open reading frame, long noncoding RNAs, and a highly conserved gene order across the large evolutionary distances represented within the triclads. Our findings contribute to the expansion and refinement of mitogenomics to address evolutionary issues in this diverse group of animals.

  3. A systematic comparison of genome-scale clustering algorithms

    PubMed Central

    2012-01-01

    Background: A wealth of clustering algorithms has been applied to gene co-expression experiments. These algorithms cover a broad range of approaches, from conventional techniques such as k-means and hierarchical clustering, to graphical approaches such as k-clique communities, weighted gene co-expression networks (WGCNA) and paraclique. Comparison of these methods to evaluate their relative effectiveness provides guidance to algorithm selection, development and implementation. Most prior work on comparative clustering evaluation has focused on parametric methods. Graph theoretical methods are recent additions to the tool set for the global analysis and decomposition of microarray co-expression matrices that have not generally been included in earlier methodological comparisons. In the present study, a variety of parametric and graph theoretical clustering algorithms are compared using well-characterized transcriptomic data at a genome scale from Saccharomyces cerevisiae. Methods: For each clustering method under study, a variety of parameters were tested. Jaccard similarity was used to measure each cluster's agreement with every GO and KEGG annotation set, and the highest Jaccard score was assigned to the cluster. Clusters were grouped into small, medium, and large bins, and the Jaccard scores of the top five scoring clusters in each bin were averaged and reported as the best average top 5 (BAT5) score for the particular method. Results: Clusters produced by each method were evaluated based upon the positive match to known pathways. This produces a readily interpretable ranking of the relative effectiveness of clustering on the genes. Methods were also tested to determine whether they were able to identify clusters consistent with those identified by other clustering methods. Conclusions: Validation of clusters against known gene classifications demonstrates that for this data, graph-based techniques outperform conventional clustering approaches, suggesting that further
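
    The scoring scheme described in the Methods of this record can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the bin size thresholds (`small_max`, `medium_max`) and the toy gene sets are all assumptions made for the example, and the abstract does not say how the per-bin averages are combined or compared across parameter settings, so the sketch stops at one average per bin.

```python
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two gene sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_annotation_score(cluster, annotation_sets):
    """Highest Jaccard score of a cluster against any annotation set."""
    return max((jaccard(cluster, genes) for genes in annotation_sets.values()),
               default=0.0)


def size_bin(cluster, small_max=10, medium_max=100):
    """Assign a cluster to a size bin (thresholds are hypothetical)."""
    n = len(cluster)
    if n <= small_max:
        return "small"
    if n <= medium_max:
        return "medium"
    return "large"


def top5_by_bin(clusters, annotation_sets, top_n=5):
    """Average the top_n best-annotation scores within each size bin."""
    scores = {}
    for cluster in clusters:
        score = best_annotation_score(cluster, annotation_sets)
        scores.setdefault(size_bin(cluster), []).append(score)
    return {name: mean(sorted(values, reverse=True)[:top_n])
            for name, values in scores.items()}


# Toy data (invented for illustration only).
annotations = {"GO:a": {"g1", "g2", "g3"}, "KEGG:b": {"g4", "g5"}}
clusters = [{"g1", "g2"}, {"g4", "g5"}, {"g1", "g4"}]
per_bin = top5_by_bin(clusters, annotations)
```

    With real data, `annotations` would hold the GO and KEGG gene sets and `clusters` the output of one clustering method at one parameter setting.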

  4. Identification of the minimal connected network of transcription factors by transcriptomic and genomic data integration.

    PubMed

    Essaghir, Ahmed

    2014-01-01

    Thanks to high-throughput experiments, biological conditions can be investigated at both the entire genomic and transcriptomic levels. In addition, protein-protein interaction (PPI) data are widely available for well-studied organisms, such as human. In this chapter, we will present an integrative approach that makes use of these data to find the PPI module involving the key regulated transcription factors (TFs) shared by a number of given conditions. These conditions could be, for instance, different cancer types. Briefly, for the studied conditions, we need to identify commonly affected chromosomal regions subjected to copy number alterations, together with a list of differentially expressed genes in each condition. Transcription factor activity will be inferred from these regulated gene lists. Then, we will define TFs for which the activity could be explained by an associative effect of both loci copy number alteration and gene expression levels of their coding genes. PPI networks can then be mined using appropriate algorithms to find the significant module that connects those TFs together. This module could be viewed as the minimal connected network of TFs, the regulation of which is shared between the investigated conditions.

  5. Breast cancer genome and transcriptome integration implicates specific mutational signatures with immune cell infiltration

    PubMed Central

    Smid, Marcel; Rodríguez-González, F. Germán; Sieuwerts, Anieta M.; Salgado, Roberto; Prager-Van der Smissen, Wendy J. C.; Vlugt-Daane, Michelle van der; van Galen, Anne; Nik-Zainal, Serena; Staaf, Johan; Brinkman, Arie B.; van de Vijver, Marc J.; Richardson, Andrea L.; Fatima, Aquila; Berentsen, Kim; Butler, Adam; Martin, Sancha; Davies, Helen R.; Debets, Reno; Gelder, Marion E. Meijer-Van; van Deurzen, Carolien H. M.; MacGrogan, Gaëtan; Van den Eynden, Gert G. G. M.; Purdie, Colin; Thompson, Alastair M.; Caldas, Carlos; Span, Paul N.; Simpson, Peter T.; Lakhani, Sunil R.; Van Laere, Steven; Desmedt, Christine; Ringnér, Markus; Tommasi, Stefania; Eyford, Jorunn; Broeks, Annegien; Vincent-Salomon, Anne; Futreal, P. Andrew; Knappskog, Stian; King, Tari; Thomas, Gilles; Viari, Alain; Langerød, Anita; Børresen-Dale, Anne-Lise; Birney, Ewan; Stunnenberg, Hendrik G.; Stratton, Mike; Foekens, John A.; Martens, John W. M.

    2016-01-01

    A recent comprehensive whole genome analysis of a large breast cancer cohort was used to link known and novel drivers and substitution signatures to the transcriptome of 266 cases. Here, we validate that subtype-specific aberrations show concordant expression changes for genes such as TP53, PIK3CA, PTEN, CCND1 and CDH1. We find that CCND3 expression levels do not correlate with amplification, while increased GATA3 expression in mutant GATA3 cancers suggests GATA3 is an oncogene. In luminal cases the total number of substitutions, irrespective of type, associates with cell cycle gene expression and adverse outcome, whereas the number of mutations of signatures 3 and 13 associates with immune-response specific gene expression, increased numbers of tumour-infiltrating lymphocytes and better outcome. Thus, while earlier reports imply that the sheer number of somatic aberrations could trigger an immune response, our data suggest that substitutions of a particular type are more effective in doing so than others. PMID:27666519

  6. Transcriptome analysis reveals novel regulatory mechanisms in a genome-reduced bacterium

    PubMed Central

    Mazin, Pavel V.; Fisunov, Gleb Y.; Gorbachev, Alexey Y.; Kapitskaya, Kristina Y.; Altukhov, Ilya A.; Semashko, Tatiana A.; Alexeev, Dmitry G.; Govorun, Vadim M.

    2014-01-01

    The avian bacterial pathogen Mycoplasma gallisepticum is a good model for systems studies due to its small genome and the simplicity of its regulatory pathways. In this study, we used RNA-Seq and MS-based proteomics to accurately map coding sequences, transcription start sites (TSSs) and transcript 3′-ends (T3Es). We used the obtained data to investigate the roles of TSSs and T3Es in stress-induced transcriptional responses. We identified 1061 TSSs at a false discovery rate of 10% and showed that almost all transcription in M. gallisepticum is initiated from classic TATAAT promoters surrounded by A/T-rich sequences. Our analysis revealed pronounced operon structure complexity: on average, each coding operon has one internal TSS and T3Es in addition to the primary ones. Our transcriptomic approach based on the intervals between the two nearest transcript ends allowed us to identify two classes of T3Es: strong, unregulated, hairpin-containing T3Es and weak, heat shock-regulated, hairpinless T3Es. Comparing gene expression levels under different conditions revealed widespread and divergent transcription regulation in M. gallisepticum. Modeling suggested that the core promoter structure plays an important role in gene expression regulation. We have shown that the heat stress activation of cryptic promoters combined with suppression of the hairpinless T3Es leads to widespread, seemingly non-functional transcription. PMID:25361977

  7. Stepwise Evolution of Coral Biomineralization Revealed with Genome-Wide Proteomics and Transcriptomics

    PubMed Central

    Takeuchi, Takeshi; Yamada, Lixy; Shinzato, Chuya; Sawada, Hitoshi; Satoh, Noriyuki

    2016-01-01

    Despite the importance of stony corals in many research fields related to global issues, such as marine ecology, climate change, paleoclimatology, and metazoan evolution, very little is known about the evolutionary origin of coral skeleton formation. In order to investigate the evolution of coral biomineralization, we have identified skeletal organic matrix proteins (SOMPs) in the skeletal proteome of the scleractinian coral, Acropora digitifera, for which large genomic and transcriptomic datasets are available. Scrupulous gene annotation was conducted based on comparisons of functional domain structures among metazoans. We found that SOMPs include not only coral-specific proteins, but also protein families that are widely conserved among cnidarians and other metazoans. We also identified several conserved transmembrane proteins in the skeletal proteome. Gene expression analysis revealed that expression of these conserved genes continues throughout development. Therefore, these genes are involved not only in skeleton formation, but also in basic cellular functions, such as cell-cell interaction and signaling. On the other hand, genes encoding coral-specific proteins, including extracellular matrix domain-containing proteins, galaxins, and acidic proteins, were prominently expressed in post-settlement stages, indicating their role in skeleton formation. Taken together, the process of coral skeleton formation is hypothesized as: 1) formation of initial extracellular matrix between epithelial cells and substrate, employing pre-existing transmembrane proteins; 2) additional extracellular matrix formation using novel proteins that have emerged by domain shuffling and rapid molecular evolution; and 3) calcification controlled by coral-specific SOMPs. PMID:27253604

  8. Genomic and transcriptomic analyses of the tangerine pathotype of Alternaria alternata in response to oxidative stress

    PubMed Central

    Wang, Mingshuang; Sun, Xuepeng; Yu, Dongliang; Xu, Jianping; Chung, Kuangren; Li, Hongye

    2016-01-01

    The tangerine pathotype of Alternaria alternata produces the A. citri toxin (ACT) and is the causal agent of citrus brown spot that results in significant yield losses worldwide. Both the production of ACT and the ability to detoxify reactive oxygen species (ROS) are required for A. alternata pathogenicity in citrus. In this study, we report the 34.41 Mb genome sequence of strain Z7 of the tangerine pathotype of A. alternata. The host-selective ACT gene cluster in strain Z7 was identified; it included 25 genes, 19 of which had not been reported previously. Of these, 10 genes were present only in the tangerine pathotype, representing the most likely candidate genes for this pathotype specialization. A transcriptome analysis of the global effects of H2O2 on gene expression revealed 1108 up-regulated and 498 down-regulated genes. Expression of the genes encoding catalase, peroxiredoxin, thioredoxin and glutathione was highly induced. Genes encoding several protein families including kinases, transcription factors, transporters, cytochrome P450, ubiquitin and heat shock proteins were found to be associated with adaptation to oxidative stress. Our data not only revealed the molecular basis of ACT biosynthesis but also provided new insights into the potential pathways by which the phytopathogen A. alternata copes with oxidative stress. PMID:27582273

  9. Stepwise Evolution of Coral Biomineralization Revealed with Genome-Wide Proteomics and Transcriptomics.

    PubMed

    Takeuchi, Takeshi; Yamada, Lixy; Shinzato, Chuya; Sawada, Hitoshi; Satoh, Noriyuki

    2016-01-01

    Despite the importance of stony corals in many research fields related to global issues, such as marine ecology, climate change, paleoclimatology, and metazoan evolution, very little is known about the evolutionary origin of coral skeleton formation. In order to investigate the evolution of coral biomineralization, we have identified skeletal organic matrix proteins (SOMPs) in the skeletal proteome of the scleractinian coral, Acropora digitifera, for which large genomic and transcriptomic datasets are available. Scrupulous gene annotation was conducted based on comparisons of functional domain structures among metazoans. We found that SOMPs include not only coral-specific proteins, but also protein families that are widely conserved among cnidarians and other metazoans. We also identified several conserved transmembrane proteins in the skeletal proteome. Gene expression analysis revealed that expression of these conserved genes continues throughout development. Therefore, these genes are involved not only in skeleton formation, but also in basic cellular functions, such as cell-cell interaction and signaling. On the other hand, genes encoding coral-specific proteins, including extracellular matrix domain-containing proteins, galaxins, and acidic proteins, were prominently expressed in post-settlement stages, indicating their role in skeleton formation. Taken together, the process of coral skeleton formation is hypothesized as: 1) formation of initial extracellular matrix between epithelial cells and substrate, employing pre-existing transmembrane proteins; 2) additional extracellular matrix formation using novel proteins that have emerged by domain shuffling and rapid molecular evolution; and 3) calcification controlled by coral-specific SOMPs. PMID:27253604

  10. Genomic and transcriptomic analyses of the tangerine pathotype of Alternaria alternata in response to oxidative stress.

    PubMed

    Wang, Mingshuang; Sun, Xuepeng; Yu, Dongliang; Xu, Jianping; Chung, Kuangren; Li, Hongye

    2016-01-01

    The tangerine pathotype of Alternaria alternata produces the A. citri toxin (ACT) and is the causal agent of citrus brown spot that results in significant yield losses worldwide. Both the production of ACT and the ability to detoxify reactive oxygen species (ROS) are required for A. alternata pathogenicity in citrus. In this study, we report the 34.41 Mb genome sequence of strain Z7 of the tangerine pathotype of A. alternata. The host-selective ACT gene cluster in strain Z7 was identified; it included 25 genes, 19 of which had not been reported previously. Of these, 10 genes were present only in the tangerine pathotype, representing the most likely candidate genes for this pathotype specialization. A transcriptome analysis of the global effects of H2O2 on gene expression revealed 1108 up-regulated and 498 down-regulated genes. Expression of the genes encoding catalase, peroxiredoxin, thioredoxin and glutathione was highly induced. Genes encoding several protein families including kinases, transcription factors, transporters, cytochrome P450, ubiquitin and heat shock proteins were found to be associated with adaptation to oxidative stress. Our data not only revealed the molecular basis of ACT biosynthesis but also provided new insights into the potential pathways by which the phytopathogen A. alternata copes with oxidative stress. PMID:27582273

  11. Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.

    PubMed

    Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry

    2016-11-01

    Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools. PMID:27685099

  12. Transcriptome sequencing and microarray design for functional genomics in the extremophile Arabidopsis relative Thellungiella salsuginea (Eutrema salsugineum)

    PubMed Central

    2013-01-01

    Background: Most molecular studies of plant stress tolerance have been performed with Arabidopsis thaliana, although it is not particularly stress tolerant and may lack protective mechanisms required to survive extreme environmental conditions. Thellungiella salsuginea has attracted interest as an alternative plant model species with high tolerance of various abiotic stresses. While the T. salsuginea genome has recently been sequenced, its annotation is still incomplete and transcriptomic information is scarce. In addition, functional genomics investigations in this species are severely hampered by a lack of affordable tools for genome-wide gene expression studies. Results: Here, we report the results of Thellungiella de novo transcriptome assembly and annotation based on 454 pyrosequencing and development and validation of a T. salsuginea microarray. ESTs were generated from a non-normalized and a normalized library synthesized from RNA pooled from samples covering different tissues and abiotic stress conditions. Both libraries yielded partially unique sequences, indicating that both are necessary to obtain comprehensive transcriptome coverage. More than 1 million sequence reads were assembled into 42,810 unigenes, approximately 50% of which could be functionally annotated. These unigenes were compared to all available Thellungiella genome sequence information. In addition, the groups of Late Embryogenesis Abundant (LEA) proteins, Mitogen Activated Protein (MAP) kinases and protein phosphatases were annotated in detail. We also predicted the target genes for 384 putative miRNAs. From the sequence information, we constructed a 44 k Agilent oligonucleotide microarray. Comparison of same-species and cross-species hybridization results showed superior performance of the newly designed array for T. salsuginea samples. The developed microarrays were used to investigate transcriptional responses of T. salsuginea and Arabidopsis during cold acclimation using the MapMan software

  13. Transcriptome complexity in cardiac development and diseases--an expanding universe between genome and phenome.

    PubMed

    Gao, Chen; Wang, Yibin

    2014-01-01

    With the advancement of transcriptome profiling by micro-arrays and high-throughput RNA-sequencing, transcriptome complexity and its dynamics are revealed at different levels in cardiovascular development and diseases. In this review, we will highlight the recent progress in our knowledge of cardiovascular transcriptome complexity contributed by RNA splicing, RNA editing and noncoding RNAs. The emerging importance of many of these previously under-explored aspects of gene regulation in cardiovascular development and pathology will be discussed.

  14. Identification of genes for controlling swine adipose deposition by integrating transcriptome, whole-genome resequencing, and quantitative trait loci data.

    PubMed

    Xing, Kai; Zhu, Feng; Zhai, LiWei; Chen, ShaoKang; Tan, Zhen; Sun, YangYang; Hou, ZhuoCheng; Wang, ChuDuan

    2016-01-01

    Backfat thickness is strongly associated with meat quality, fattening efficiency, reproductive performance, and immunity in pigs. Fat storage and fatty acid synthesis mainly occur in adipose tissue. Therefore, we used a high-throughput massively parallel sequencing approach to identify transcriptomes in adipose tissue, and whole-genome differences from three full-sibling pairs of pigs with opposite (high and low) backfat thickness phenotypes. We obtained an average of 38.69 million reads for six samples, 78.68% of which were annotated in the reference genome. Eighty-nine overlapping differentially expressed genes were identified among the three pair comparisons. Whole-genome resequencing also detected multiple genetic variations between the pools of DNA from the two groups. Compared with the animal quantitative trait loci (QTL) database, 20 differentially expressed genes were matched to the QTLs associated with fatness in pigs. Our technique of integrating transcriptome, whole-genome resequencing, and QTL database information provided a rich source of important differentially expressed genes and variations. Association analysis between selected SNPs and backfat thickness revealed that two SNPs and one haplotype of ME1 significantly affected fat deposition in pigs. Moreover, genetic analysis confirmed that variations in the differentially expressed genes may affect fat deposition. PMID:26996612

  15. Identification of genes for controlling swine adipose deposition by integrating transcriptome, whole-genome resequencing, and quantitative trait loci data

    PubMed Central

    Xing, Kai; Zhu, Feng; Zhai, LiWei; Chen, ShaoKang; Tan, Zhen; Sun, YangYang; Hou, ZhuoCheng; Wang, ChuDuan

    2016-01-01

    Backfat thickness is strongly associated with meat quality, fattening efficiency, reproductive performance, and immunity in pigs. Fat storage and fatty acid synthesis mainly occur in adipose tissue. Therefore, we used a high-throughput massively parallel sequencing approach to identify transcriptomes in adipose tissue, and whole-genome differences from three full-sibling pairs of pigs with opposite (high and low) backfat thickness phenotypes. We obtained an average of 38.69 million reads for six samples, 78.68% of which were annotated in the reference genome. Eighty-nine overlapping differentially expressed genes were identified among the three pair comparisons. Whole-genome resequencing also detected multiple genetic variations between the pools of DNA from the two groups. Compared with the animal quantitative trait loci (QTL) database, 20 differentially expressed genes were matched to the QTLs associated with fatness in pigs. Our technique of integrating transcriptome, whole-genome resequencing, and QTL database information provided a rich source of important differentially expressed genes and variations. Association analysis between selected SNPs and backfat thickness revealed that two SNPs and one haplotype of ME1 significantly affected fat deposition in pigs. Moreover, genetic analysis confirmed that variations in the differentially expressed genes may affect fat deposition. PMID:26996612

  16. Global expression analysis of the brown alga Ectocarpus siliculosus (Phaeophyceae) reveals large-scale reprogramming of the transcriptome in response to abiotic stress

    PubMed Central

    Dittami, Simon M; Scornet, Delphine; Petit, Jean-Louis; Ségurens, Béatrice; Da Silva, Corinne; Corre, Erwan; Dondrup, Michael; Glatting, Karl-Heinz; König, Rainer; Sterck, Lieven; Rouzé, Pierre; Van de Peer, Yves; Cock, J Mark; Boyen, Catherine; Tonon, Thierry

    2009-01-01

    Background: Brown algae (Phaeophyceae) are phylogenetically distant from red and green algae and an important component of the coastal ecosystem. They have developed unique mechanisms that allow them to inhabit the intertidal zone, an environment with high levels of abiotic stress. Ectocarpus siliculosus is being established as a genetic and genomic model for the brown algal lineage, but little is known about its response to abiotic stress. Results: Here we examine the transcriptomic changes that occur during the short-term acclimation of E. siliculosus to three different abiotic stress conditions (hyposaline, hypersaline and oxidative stress). Our results show that almost 70% of the expressed genes are regulated in response to at least one of these stressors. Although there are several common elements with terrestrial plants, such as repression of growth-related genes, switching from primary production to protein and nutrient recycling processes, and induction of genes involved in vesicular trafficking, many of the stress-regulated genes are either not known to respond to stress in other organisms or have been found exclusively in E. siliculosus. Conclusions: This first large-scale transcriptomic study of a brown alga demonstrates that, unlike terrestrial plants, E. siliculosus undergoes extensive reprogramming of its transcriptome during the acclimation to mild abiotic stress. We identify several new genes and pathways with a putative function in the stress response and thus pave the way for more detailed investigations of the mechanisms underlying the stress tolerance of brown algae. PMID:19531237

  17. Modeling cancer metabolism on a genome scale

    PubMed Central

    Yizhak, Keren; Chaneton, Barbara; Gottlieb, Eyal; Ruppin, Eytan

    2015-01-01

    Cancer cells have fundamentally altered cellular metabolism that is associated with their tumorigenicity and malignancy. In addition to the widely studied Warburg effect, several new key metabolic alterations in cancer have been established over the last decade, leading to the recognition that altered tumor metabolism is one of the hallmarks of cancer. Deciphering the full scope and functional implications of the dysregulated metabolism in cancer requires both the advancement of a variety of omics measurements and the advancement of computational approaches for the analysis and contextualization of the accumulated data. Encouragingly, while the metabolic network is highly interconnected and complex, it is at the same time probably the best characterized cellular network. Accordingly, this review discusses the challenges that genome-scale modeling of cancer metabolism has been facing. We survey several recent studies demonstrating the first strides that have been made, testifying to the value of this approach in portraying a network-level view of cancer metabolism and in identifying novel drug targets and biomarkers. Finally, we outline a few new steps that may further advance this field. PMID:26130389

  18. Ensembl Genomes 2013: scaling up access to genome-wide data

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provi...

  19. Novel Tools for Conservation Genomics: Comparing Two High-Throughput Approaches for SNP Discovery in the Transcriptome of the European Hake

    PubMed Central

    Milano, Ilaria; Babbucci, Massimiliano; Panitz, Frank; Ogden, Rob; Nielsen, Rasmus O.; Taylor, Martin I.; Helyar, Sarah J.; Carvalho, Gary R.; Espiñeira, Montserrat; Atanassova, Miroslava; Tinti, Fausto; Maes, Gregory E.; Patarnello, Tomaso; Bargelloni, Luca

    2011-01-01

    The growing accessibility of genomic resources using next-generation sequencing (NGS) technologies has revolutionized the application of molecular genetic tools to ecology and evolutionary studies in non-model organisms. Here we present the case study of the European hake (Merluccius merluccius), one of the most important demersal resources of European fisheries. Two sequencing platforms, the Roche 454 FLX (454) and the Illumina Genome Analyzer (GAII), were used for single nucleotide polymorphism (SNP) discovery in the hake muscle transcriptome. De novo transcriptome assembly into unique contigs, annotation, and in silico SNP detection were carried out in parallel for 454 and GAII sequence data. High-throughput genotyping using the Illumina GoldenGate assay was performed for validating 1,536 putative SNPs. Validation results were analysed to compare the performances of 454 and GAII methods and to evaluate the role of several variables (e.g. sequencing depth, intron-exon structure, sequence quality and annotation). Despite well-known differences in sequence length and throughput, the two approaches showed similar assay conversion rates (approximately 43%) and percentages of polymorphic loci (67.5% and 63.3% for GAII and 454, respectively). Both NGS platforms therefore proved suitable for large scale identification of SNPs in transcribed regions of non-model species, although the lack of a reference genome profoundly affects the genotyping success rate. The overall efficiency, however, can be improved using strict quality and filtering criteria for SNP selection (sequence quality, intron-exon structure, target region score). PMID:22132191

  20. Guitar: An R/Bioconductor Package for Gene Annotation Guided Transcriptomic Analysis of RNA-Related Genomic Features.

    PubMed

    Cui, Xiaodong; Wei, Zhen; Zhang, Lin; Liu, Hui; Sun, Lei; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia

    2016-01-01

    Biological features, such as genes and transcription factor binding sites, are often denoted with genome-based coordinates as the genomic features. While genome-based representation is usually very effective in correlating various biological features, it can be tedious to examine the relationship between RNA-related genomic features and the landmarks of RNA transcripts with existing tools due to the difficulty in the conversion between genome-based coordinates and RNA-based coordinates. Here we developed Guitar, an open-source R/Bioconductor package for sketching the transcriptomic view of RNA-related biological features represented by genome-based coordinates. Internally, the Guitar package extracts the standardized RNA coordinates with respect to the landmarks of RNA transcripts, with which hundreds of millions of RNA-related genomic features can then be efficiently analyzed within minutes. We demonstrated the use of the Guitar package in analyzing posttranscriptional RNA modifications (5-methylcytosine and N6-methyladenosine) derived from high-throughput sequencing approaches (MeRIP-Seq and RNA BS-Seq) and show that RNA 5-methylcytosine (m5C) is enriched in 5′UTR. The newly developed Guitar R/Bioconductor package achieved stable performance on the data tested and revealed novel biological insights. It will effectively facilitate the analysis of RNA methylation data and other RNA-related biological features in the future.

  1. Guitar: An R/Bioconductor Package for Gene Annotation Guided Transcriptomic Analysis of RNA-Related Genomic Features

    PubMed Central

    Cui, Xiaodong; Wei, Zhen; Zhang, Lin; Liu, Hui; Sun, Lei; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia

    2016-01-01

    Biological features, such as genes and transcription factor binding sites, are often denoted with genome-based coordinates as the genomic features. While genome-based representation is usually very effective in correlating various biological features, it can be tedious to examine the relationship between RNA-related genomic features and the landmarks of RNA transcripts with existing tools due to the difficulty in the conversion between genome-based coordinates and RNA-based coordinates. We developed here an open source Guitar R/Bioconductor package for sketching the transcriptomic view of RNA-related biological features represented by genome based coordinates. Internally, Guitar package extracts the standardized RNA coordinates with respect to the landmarks of RNA transcripts, with which hundreds of millions of RNA-related genomic features can then be efficiently analyzed within minutes. We demonstrated the usage of Guitar package in analyzing posttranscriptional RNA modifications (5-methylcytosine and N6-methyladenosine) derived from high-throughput sequencing approaches (MeRIP-Seq and RNA BS-Seq) and show that RNA 5-methylcytosine (m5C) is enriched in 5′UTR. The newly developed Guitar R/Bioconductor package achieves stable performance on the data tested and revealed novel biological insights. It will effectively facilitate the analysis of RNA methylation data and other RNA-related biological features in the future. PMID:27239475
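The coordinate conversion described in this abstract can be illustrated with a minimal Python sketch. It is not the Guitar package's code or API (Guitar is an R/Bioconductor package); the function names `genome_to_transcript` and `landmark_coordinate`, the half-open 0-based interval convention, and the rescaling of the 5'UTR, CDS and 3'UTR onto [0,1), [1,2) and [2,3) are all assumptions made for illustration.

```python
def genome_to_transcript(pos, exons, strand="+"):
    """Map a 0-based genomic position to a 0-based offset along the spliced transcript.

    exons: list of (start, end) half-open genomic intervals, sorted ascending.
    Returns None when the position is intronic or outside the transcript.
    """
    # On the minus strand the transcript starts at the last genomic exon.
    ordered = exons if strand == "+" else list(reversed(exons))
    offset = 0
    for start, end in ordered:
        if start <= pos < end:
            within = pos - start if strand == "+" else end - 1 - pos
            return offset + within
        offset += end - start
    return None


def landmark_coordinate(tx_pos, cds_start, cds_end, tx_len):
    """Rescale a transcript offset relative to transcript landmarks.

    cds_start and cds_end are transcript offsets (half-open).
    5'UTR maps to [0, 1), CDS to [1, 2), 3'UTR to [2, 3) (hypothetical convention).
    """
    if tx_pos < cds_start:
        return tx_pos / cds_start
    if tx_pos < cds_end:
        return 1 + (tx_pos - cds_start) / (cds_end - cds_start)
    return 2 + (tx_pos - cds_end) / (tx_len - cds_end)
```

Once every feature is expressed on this common landmark scale, features from transcripts of very different lengths can be pooled into a single distribution, which is the kind of transcriptomic view the abstract refers to.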

  2. Guitar: An R/Bioconductor Package for Gene Annotation Guided Transcriptomic Analysis of RNA-Related Genomic Features.

    PubMed

    Cui, Xiaodong; Wei, Zhen; Zhang, Lin; Liu, Hui; Sun, Lei; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia

    2016-01-01

    Biological features, such as genes and transcription factor binding sites, are often denoted with genome-based coordinates as the genomic features. While genome-based representation is usually very effective in correlating various biological features, it can be tedious to examine the relationship between RNA-related genomic features and the landmarks of RNA transcripts with existing tools due to the difficulty in the conversion between genome-based coordinates and RNA-based coordinates. We developed here an open source Guitar R/Bioconductor package for sketching the transcriptomic view of RNA-related biological features represented by genome based coordinates. Internally, Guitar package extracts the standardized RNA coordinates with respect to the landmarks of RNA transcripts, with which hundreds of millions of RNA-related genomic features can then be efficiently analyzed within minutes. We demonstrated the usage of Guitar package in analyzing posttranscriptional RNA modifications (5-methylcytosine and N6-methyladenosine) derived from high-throughput sequencing approaches (MeRIP-Seq and RNA BS-Seq) and show that RNA 5-methylcytosine (m(5)C) is enriched in 5'UTR. The newly developed Guitar R/Bioconductor package achieves stable performance on the data tested and revealed novel biological insights. It will effectively facilitate the analysis of RNA methylation data and other RNA-related biological features in the future. PMID:27239475

  3. Oxidative Stress and Heat-Shock Responses in Desulfovibrio vulgaris by Genome-Wide Transcriptomic Analysis

    SciTech Connect

    Zhang, Weiwen; Culley, David E.; Hogan, Mike; Vitiritti, Luigi; Brockman, Fred J.

    2006-05-30

    Abstract Sulfate-reducing bacteria, like Desulfovibrio vulgaris, have developed a set of reactions allowing them to survive in environments. To obtain further knowledge of the protective mechanisms employed in D. vulgaris against oxidative stress and heat shock, we performed a genome-wide transcriptomic analysis to determine the cellular responses to both stimuli. The results showed that 130 genes were responsive to oxidative stress, while 427 genes were responsive to heat shock. Functional analyses suggested that the regulated genes were involved in a variety of cellular functions. Metabolic analysis showed that amino acid biosynthetic pathways were induced by both oxidative stress and heat shock treatments, while fatty acid metabolism and purine and cofactor biosynthesis were induced by heat shock only. The rubrerythrin gene (rbR) was upregulated by oxidative stress, suggesting its important role in oxidative resistance, whereas the expression of the rubredoxin oxidoreductase (rbO), superoxide dismutase (sodB) and catalase (katA) genes was not subject to regulation by oxidative stress in D. vulgaris. In addition, the results showed that thioredoxin reductase (trxB) was responsive to oxidative stress, suggesting that the thiol-specific redox system might be involved in oxidative protection in D. vulgaris. Comparison of cellular responses to oxidative stress and heat shock allowed the identification of 66 genes that showed a similar drastic response to both environmental stimuli, implying that they might be part of the general stress response (GSR) network in D. vulgaris, which was further supported by the finding of a conserved motif upstream of these common-responsive genes.
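The comparison step in this abstract (finding genes that respond similarly to both stimuli) could be sketched roughly as below. The abstract does not state how "similar drastic response" was defined, so the log2 fold-change inputs, the `threshold` of 1.0, the same-direction requirement, and the gene names in the usage example are all hypothetical.

```python
def common_responders(lfc_a, lfc_b, threshold=1.0):
    """Return genes that respond strongly, and in the same direction, under both conditions.

    lfc_a, lfc_b: dicts mapping gene name -> log2 fold change for each stimulus.
    threshold: minimum absolute log2 fold change (an assumed cut-off).
    """
    shared = lfc_a.keys() & lfc_b.keys()  # genes measured in both experiments
    return sorted(
        gene for gene in shared
        if abs(lfc_a[gene]) >= threshold
        and abs(lfc_b[gene]) >= threshold
        and (lfc_a[gene] > 0) == (lfc_b[gene] > 0)  # same direction of change
    )


# Toy values for illustration only; these are not data from the study.
oxidative = {"geneA": 2.1, "geneB": -1.5, "geneC": 0.2, "geneD": 1.8}
heat_shock = {"geneA": 3.0, "geneB": 1.9, "geneC": 2.5, "geneE": -2.0}
shared_genes = common_responders(oxidative, heat_shock)
```

In this toy input only `geneA` passes: `geneB` changes in opposite directions, `geneC` is below the threshold under one condition, and `geneD` and `geneE` were each measured under one condition only.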

  4. Identification of cancer-related lncRNAs through integrating genome, regulome and transcriptome features.

    PubMed

    Zhao, Tingting; Xu, Jinyuan; Liu, Ling; Bai, Jing; Xu, Chaohan; Xiao, Yun; Li, Xia; Zhang, Liming

    2015-01-01

    LncRNAs have become rising stars in biology and medicine, due to their versatile functions in a wide range of important biological processes and active roles in various human cancers. Here, we developed a computational method based on the naïve Bayesian classifier to identify cancer-related lncRNAs by integrating genome, regulome and transcriptome data, and identified 707 potential cancer-related lncRNAs. We demonstrated the performance of the method by ten-fold cross-validation, and found that integration of multi-omic data was necessary to identify cancer-related lncRNAs. Our results showed that these lncRNAs tend to exhibit significant differential expression and differential DNA methylation in multiple cancer types, and prognosis effects in prostate cancer. We also found that these lncRNAs were more likely to be direct targets of TP53 family members than others. Moreover, based on 147 lncRNA knockdown data in mice, we validated that four of six mouse orthologous lncRNAs were significantly involved in many cancer-related processes, such as cell differentiation and the Wnt signaling pathway. Notably, one lncRNA, lnc-SNURF-1, which was found to be associated with TNF-mediated signaling pathways, was up-regulated in prostate cancer and the protein-coding genes affected by knockdown of the lncRNA were also significantly aberrant in prostate cancer patients, suggesting its probable importance in tumorigenesis. Taken together, our method underlines the power of integrating multi-omic data to uncover cancer-related lncRNAs.
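As a rough illustration of the classifier family named in this abstract, the sketch below implements a small naive Bayes model over binary features with Laplace smoothing. It is not the authors' implementation: the abstract does not list the features or their encoding, so the three binary features in the usage example (one each standing in for genome, regulome and transcriptome evidence) and the smoothing parameter `alpha` are assumptions.

```python
import math


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    rows: list of tuples of binary features (0/1); labels: class label per row.
    Returns {class: (prior, [P(feature_j = 1 | class), ...])}.
    """
    n_features = len(rows[0])
    model = {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        prior = len(members) / len(rows)
        # Laplace smoothing keeps every conditional probability strictly inside (0, 1).
        p_one = [
            (sum(r[j] for r in members) + alpha) / (len(members) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (prior, p_one)
    return model


def posterior(model, row):
    """Return the posterior probability of each class for one feature vector."""
    log_scores = {}
    for c, (prior, p_one) in model.items():
        score = math.log(prior)
        for x, p in zip(row, p_one):
            score += math.log(p if x else 1 - p)
        log_scores[c] = score
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    total = sum(math.exp(s - top) for s in log_scores.values())
    return {c: math.exp(s - top) / total for c, s in log_scores.items()}


# Toy training set: 1 = cancer-related, 0 = not (hypothetical labels and features).
rows = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 0, 0), (0, 1, 0), (0, 0, 1)]
labels = [1, 1, 1, 0, 0, 0]
model = train_naive_bayes(rows, labels)
scores = posterior(model, (1, 1, 1))
```

Because the per-feature likelihoods are multiplied, each data type contributes independent evidence to the final score, which is one way to read the abstract's point that integrating several data types was necessary.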

  5. A genomic and transcriptomic approach to investigate the blue pigment phenotype in Pseudomonas fluorescens.

    PubMed

    Andreani, Nadia Andrea; Carraro, Lisa; Martino, Maria Elena; Fondi, Marco; Fasolato, Luca; Miotto, Giovanni; Magro, Massimiliano; Vianello, Fabio; Cardazzo, Barbara

    2015-11-20

    Pseudomonas fluorescens is a well-known food spoiler, able to cause serious economic losses in the food industry due to its ability to produce many extracellular, and often thermostable, compounds. The most outstanding spoilage events involving P. fluorescens were blue discoloration of several foodstuffs, mainly dairy products. The bacteria involved in such high-profile cases have been identified as belonging to a clearly distinct phylogenetic cluster of the P. fluorescens group. Although the blue pigment has recently been investigated in several studies, the biosynthetic pathway leading to pigment formation, as well as the pigment's chemical nature, remain challenging and unsolved points. In the present paper, genomic and transcriptomic data of 4 P. fluorescens strains (2 blue-pigmenting strains and 2 non-pigmenting strains) were analyzed to evaluate the presence and the expression of blue strain-specific genes. In particular, the pangenome analysis showed the presence in the blue-pigmenting strains of two copies of genes involved in the tryptophan biosynthesis pathway (including trpABCDF). The global expression profiling of blue-pigmenting strains versus non-pigmenting strains showed a general up-regulation of genes involved in iron uptake and a down-regulation of genes involved in primary metabolism. Chromogenic reaction of the blue-pigmenting bacterial cells with Kovac's reagent indicated an indole-derivative as the precursor of the blue pigment. Finally, solubility tests and MALDI-TOF mass spectrometry analysis of the isolated pigment suggested that its molecular structure is very probably a hydrophobic indigo analog.

  6. Capturing the response of Clostridium acetobutylicum to chemical stressors using a regulated genome-scale metabolic model

    DOE PAGES Beta

    Dash, Satyakam; Mueller, Thomas J.; Venkataramanan, Keerthi P.; Papoutsakis, Eleftherios T.; Maranas, Costas D.

    2014-10-14

    Clostridia are anaerobic Gram-positive Firmicutes containing broad and flexible systems for substrate utilization, which have been used successfully to produce a range of industrial compounds. Clostridium acetobutylicum has been used to produce butanol on an industrial scale through acetone-butanol-ethanol (ABE) fermentation. A genome-scale metabolic (GSM) model is a powerful tool for understanding the metabolic capacities of an organism and developing metabolic engineering strategies for strain development. The integration of stress related specific transcriptomics information with the GSM model provides opportunities for elucidating the focal points of regulation.

  7. Genomic and transcriptomic hallmarks of poorly differentiated and anaplastic thyroid cancers

    PubMed Central

    Ibrahimpasic, Tihana; Boucai, Laura; Shah, Ronak H.; Dogan, Snjezana; Ricarte-Filho, Julio C.; Krishnamoorthy, Gnana P.; Schultz, Nikolaus; Berger, Michael F.; Sander, Chris; Taylor, Barry S.; Ghossein, Ronald; Ganly, Ian; Fagin, James A.

    2016-01-01

    BACKGROUND. Poorly differentiated thyroid cancer (PDTC) and anaplastic thyroid cancer (ATC) are rare and frequently lethal tumors that so far have not been subjected to comprehensive genetic characterization. METHODS. We performed next-generation sequencing of 341 cancer genes from 117 patient-derived PDTCs and ATCs and analyzed the transcriptome of a representative subset of 37 tumors. Results were analyzed in the context of The Cancer Genome Atlas study (TCGA study) of papillary thyroid cancers (PTC). RESULTS. Compared to PDTCs, ATCs had a greater mutation burden, including a higher frequency of mutations in TP53, TERT promoter, PI3K/AKT/mTOR pathway effectors, SWI/SNF subunits, and histone methyltransferases. BRAF and RAS were the predominant drivers and dictated distinct tropism for nodal versus distant metastases in PDTC. RAS and BRAF sharply distinguished between PDTCs defined by the Turin (PDTC-Turin) versus MSKCC (PDTC-MSK) criteria, respectively. Mutations of EIF1AX, a component of the translational preinitiation complex, were markedly enriched in PDTCs and ATCs and had a striking pattern of co-occurrence with RAS mutations. While TERT promoter mutations were rare and subclonal in PTCs, they were clonal and highly prevalent in advanced cancers. Application of the TCGA-derived BRAF-RAS score (a measure of MAPK transcriptional output) revealed a preserved relationship with BRAF/RAS mutation in PDTCs, whereas ATCs were BRAF-like irrespective of driver mutation. CONCLUSIONS. These data support a model of tumorigenesis whereby PDTCs and ATCs arise from well-differentiated tumors through the accumulation of key additional genetic abnormalities, many of which have prognostic and possible therapeutic relevance. The widespread genomic disruptions in ATC compared with PDTC underscore their greater virulence and higher mortality. FUNDING. 
    This work was supported in part by NIH grants CA50706, CA72597, P50-CA72012, P30-CA008748, and 5T32-CA160001; the Lefkovsky Family

  8. Genomic analysis and temperature-dependent transcriptome profiles of the rhizosphere originating strain Pseudomonas aeruginosa M18

    PubMed Central

    2011-01-01

    Background Our previously published reports have described an effective biocontrol agent named Pseudomonas sp. M18, as its 16S rDNA sequence and several regulator genes share homologous sequences with those of P. aeruginosa, but there are several unusual phenotypic features. This study aims to explore its strain-specific genomic features and gene expression patterns at different temperatures. Results The complete M18 genome is composed of a single chromosome of 6,327,754 base pairs containing 5684 open reading frames. Seven genomic islands, including two novel prophages and five specific non-phage islands, were identified besides the conserved P. aeruginosa core genome. Each prophage contains a putative chitinase coding gene, and prophage II contains a capB gene encoding a putative cold stress protein. The non-phage genomic islands contain genes responsible for pyoluteorin biosynthesis, environmental substance degradation and type I and III restriction-modification systems. Compared with other P. aeruginosa strains, the smallest number (3) of insertion sequences and the largest number (3) of clustered regularly interspaced short palindromic repeats in the M18 genome may contribute to its relative genome stability. Although the M18 genome is most closely related to that of P. aeruginosa strain LESB58, strain M18 is more susceptible to several antimicrobial agents and more easily eliminated in a mouse acute lung infection model than strain LESB58. The whole M18 transcriptomic analysis indicated that 10.6% of the expressed genes are temperature-dependent, with 22 genes up-regulated at 28°C in three non-phage genomic islands and one prophage but none at 37°C. Conclusions The P. aeruginosa strain M18 has evolved its specific genomic structures and temperature-dependent expression patterns to meet the requirement of its fitness and competitiveness under selective pressures imposed on the strain in the rhizosphere niche. PMID:21884571

  9. Transcriptomic and proteomic analyses on the supercooling ability and mining of antifreeze proteins of the Chinese white wax scale insect.

    PubMed

    Yu, Shu-Hui; Yang, Pu; Sun, Tao; Qi, Qian; Wang, Xue-Qing; Chen, Xiao-Ming; Feng, Ying; Liu, Bo-Wen

    2016-06-01

    The Chinese white wax scale insect, Ericerus pela, can survive at extremely low temperatures, and some overwintering individuals exhibit supercooling at temperatures below -30°C. To investigate the deep supercooling ability of E. pela, transcriptomic and proteomic analyses were performed to delineate the major gene and protein families responsible for the deep supercooling ability of overwintering females. Gene Ontology (GO) classification and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis indicated that genes involved in the mitogen-activated protein kinase, calcium, and PI3K-Akt signaling pathways and pathways associated with the biosynthesis of soluble sugars, sugar alcohols and free amino acids were dominant. Proteins responsible for low-temperature stress, such as cold acclimation proteins, glycerol biosynthesis-related enzymes and heat shock proteins (HSPs) were identified. However, no antifreeze proteins (AFPs) were identified through sequence similarity search methods. A random forest approach identified 388 putative AFPs in the proteome. The AFP gene ep-afp was expressed in Escherichia coli, and the expressed protein exhibited a thermal hysteresis activity of 0.97°C, suggesting its potential role in the deep supercooling ability of E. pela.

  10. Large-Scale Transcriptome Analysis of Two Sugarcane Genotypes Contrasting for Lignin Content.

    PubMed

    Vicentini, Renato; Bottcher, Alexandra; Brito, Michael Dos Santos; Dos Santos, Adriana Brombini; Creste, Silvana; Landell, Marcos Guimarães de Andrade; Cesarino, Igor; Mazzafera, Paulo

    2015-01-01

    Sugarcane is an important crop worldwide for sugar and first-generation ethanol production. Recently, the residue of sugarcane mills, named bagasse, has been considered a promising lignocellulosic biomass to produce second-generation ethanol. Lignin is a major factor limiting the use of bagasse and other plant lignocellulosic materials to produce second-generation ethanol. The lignin biosynthesis pathway is a complex network, and changes in the expression of genes of this pathway have in general led to diverse and undesirable impacts on plant structure and physiology. Despite its economic importance, the sugarcane genome has not yet been sequenced. In this study a high-throughput transcriptome evaluation of two sugarcane genotypes contrasting for lignin content was carried out. We generated a set of 85,151 sugarcane transcripts using RNA-seq and de novo assembly. More than 2,000 transcripts showed differential expression between the genotypes, including several genes involved in the lignin biosynthetic pathway. This information can give valuable knowledge on lignin biosynthesis and its interactions with other metabolic pathways in the complex sugarcane genome.

  11. Large-Scale Transcriptome Analysis of Two Sugarcane Genotypes Contrasting for Lignin Content

    PubMed Central

    Vicentini, Renato; Bottcher, Alexandra; Brito, Michael dos Santos; dos Santos, Adriana Brombini; Creste, Silvana; Landell, Marcos Guimarães de Andrade; Cesarino, Igor; Mazzafera, Paulo

    2015-01-01

    Sugarcane is an important crop worldwide for sugar and first-generation ethanol production. Recently, the residue of sugarcane mills, named bagasse, has been considered a promising lignocellulosic biomass to produce second-generation ethanol. Lignin is a major factor limiting the use of bagasse and other plant lignocellulosic materials to produce second-generation ethanol. The lignin biosynthesis pathway is a complex network, and changes in the expression of genes of this pathway have in general led to diverse and undesirable impacts on plant structure and physiology. Despite its economic importance, the sugarcane genome has not yet been sequenced. In this study a high-throughput transcriptome evaluation of two sugarcane genotypes contrasting for lignin content was carried out. We generated a set of 85,151 sugarcane transcripts using RNA-seq and de novo assembly. More than 2,000 transcripts showed differential expression between the genotypes, including several genes involved in the lignin biosynthetic pathway. This information can give valuable knowledge on lignin biosynthesis and its interactions with other metabolic pathways in the complex sugarcane genome. PMID:26241317

  12. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae)

    DOE PAGES Beta

    Hovde, Blake T.; Deodato, Chloe R.; Hunsperger, Heather M.; Ryken, Scott A.; Yost, Will; Jha, Ramesh K.; Patterson, Johnathan; Monnat, Raymond J.; Barlow, Steven B.; Starkenburg, Shawn R.; et al

    2015-09-23

    Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two “red” RuBisCO activases that are shared across many algal lineages. In conclusion, the Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes.

  13. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae)

    PubMed Central

    Hovde, Blake T.; Deodato, Chloe R.; Hunsperger, Heather M.; Ryken, Scott A.; Yost, Will; Jha, Ramesh K.; Patterson, Johnathan; Monnat, Raymond J.; Barlow, Steven B.; Starkenburg, Shawn R.; Cattolico, Rose Ann

    2015-01-01

    Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two “red” RuBisCO activases that are shared across many algal lineages. The Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes. PMID:26397803

  14. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae)

    SciTech Connect

    Hovde, Blake T.; Deodato, Chloe R.; Hunsperger, Heather M.; Ryken, Scott A.; Yost, Will; Jha, Ramesh K.; Patterson, Johnathan; Monnat, Raymond J.; Barlow, Steven B.; Starkenburg, Shawn R.; Cattolico, Rose Ann; Richardson, Paul M.

    2015-09-23

    Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two “red” RuBisCO activases that are shared across many algal lineages. In conclusion, the Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes.

  16. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae).

    PubMed

    Hovde, Blake T; Deodato, Chloe R; Hunsperger, Heather M; Ryken, Scott A; Yost, Will; Jha, Ramesh K; Patterson, Johnathan; Monnat, Raymond J; Barlow, Steven B; Starkenburg, Shawn R; Cattolico, Rose Ann

    2015-01-01

    Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼ 40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two "red" RuBisCO activases that are shared across many algal lineages. The Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes. PMID:26397803

  17. Integrating transcriptome and genome re-sequencing data to identify key genes and mutations affecting chicken eggshell qualities.

    PubMed

    Zhang, Quan; Zhu, Feng; Liu, Long; Zheng, Chuan Wei; Wang, De He; Hou, Zhuo Cheng; Ning, Zhong Hua

    2015-01-01

    Eggshell damage leads to economic losses in the egg production industry and is a threat to human health. We examined 49-wk-old Rhode Island White hens (Gallus gallus) that laid eggs having shells with significantly different strengths and thicknesses. We used HiSeq 2000 (Illumina) sequencing to characterize the chicken transcriptome and whole genome to identify the key genes and genetic mutations associated with eggshell calcification. We identified a total of 14,234 genes expressed in the chicken uterus, representing 89% of all annotated chicken genes. A total of 889 differentially expressed genes (DEGs) were identified by comparing low eggshell strength (LES) and normal eggshell strength (NES) genomes. The DEGs are enriched in calcification-related processes, including calcium ion transport and calcium signaling pathways, as revealed by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Some important matrix proteins, such as OC-116, LTF and SPP1, were also expressed differentially between the two groups. A total of 3,671,919 single-nucleotide polymorphisms (SNPs) and 508,035 Indels were detected in protein coding genes by whole-genome re-sequencing, including 1775 non-synonymous variations and 19 frame-shift Indels in DEGs. SNPs and Indels found in this study could be further investigated for eggshell traits. This is the first report to integrate transcriptome and genome re-sequencing to target the genetic variations that reduce eggshell quality. These findings further advance our understanding of eggshell calcification in the chicken uterus.

  18. Genome and Transcriptome Analysis of the Fungal Pathogen Fusarium oxysporum f. sp. cubense Causing Banana Vascular Wilt Disease

    PubMed Central

    Zeng, Huicai; Fan, Dingding; Zhu, Yabin; Feng, Yue; Wang, Guofen; Peng, Chunfang; Jiang, Xuanting; Zhou, Dajie; Ni, Peixiang; Liang, Changcong; Liu, Lei; Wang, Jun; Mao, Chao

    2014-01-01

    Background The asexual fungus Fusarium oxysporum f. sp. cubense (Foc), which causes vascular wilt disease, is one of the most devastating pathogens of banana (Musa spp.). To understand the molecular underpinning of pathogenicity in Foc, the genomes and transcriptomes of two Foc isolates were sequenced. Methodology/Principal Findings Genome analysis revealed that the genome structures of race 1 and race 4 isolates were highly syntenic with those of F. oxysporum f. sp. lycopersici strain Fol4287. A large number of putative virulence-associated genes were identified in both Foc genomes, including genes putatively involved in root attachment, cell degradation, detoxification of toxins, transport, secondary metabolite biosynthesis and signal transduction. Importantly, relative to the Foc race 1 isolate (Foc1), the Foc race 4 isolate (Foc4) has evolved some expanded gene families of transporters and transcription factors for transport of toxins and nutrients that may facilitate its ability to adapt to host environments and contribute to pathogenicity to banana. Transcriptome analysis disclosed a significant difference in transcriptional responses between Foc1 and Foc4 at 48 h post inoculation to the banana ‘Brazil’ in comparison with the vegetative growth stage. Of particular note, more virulence-associated genes were upregulated in Foc4 than in Foc1. Several signaling pathways, such as the mitogen-activated protein kinase Fmk1-mediated invasion growth pathway, the FGA1-mediated G protein signaling pathway and a pathogenicity-associated two-component system, were activated in Foc4 rather than in Foc1. Together, these differences in gene content and transcription response between Foc1 and Foc4 might account for variation in their virulence during infection of the banana variety ‘Brazil’. Conclusions/Significance Foc genome sequences will help identify the pathogenicity mechanisms involved in banana vascular wilt disease development. These will thus advance

  19. Analysis of tigecycline resistance development in clinical Acinetobacter baumannii isolates through a combined genomic and transcriptomic approach

    PubMed Central

    Liu, Lin; Cui, Yujun; Zheng, Beiwen; Jiang, Saiping; Yu, Wei; Shen, Ping; Ji, Jinru; Li, Lanjuan; Qin, Nan; Xiao, Yonghong

    2016-01-01

    Tigecycline (Tgc) is considered a last-resort antibiotic for the treatment of multi-drug resistant bacteria. To study Tgc resistance development in the important nosocomial pathogen Acinetobacter baumannii, we adopted six clinical isolates from three patients undergoing antibiotic treatment, and bacterial genomic sequences and seven strand-specific transcriptomes were studied. Interestingly, the Tgc-intermediate 2015ZJAB1 only differed from Tgc-resistant 2015ZJAB2 in an SNP-clustered region including OprD, a sugar-type MFS permease, and a LuxR-type transcriptional regulator. Surprisingly, an almost identical region was found in 2015ZJAB3, which supports the possibility of a homologous recombination event that increased Tgc resistance. Furthermore, comparative transcriptomic analysis identified significantly regulated genes associated with Tgc resistance, which was verified using qRT-PCR. Three enriched COG categories included amino acid transport and metabolism, transcription, and inorganic ion transport and metabolism. KEGG analysis revealed common features under Tgc conditions, including up regulated benzoate degradation and a less active TCA cycle. This may be related to selective antimicrobial pressure in the environment and adaptation by lowering metabolism. This study provides the first report of an in vivo evolutionary process that included a putative homologous recombination event conferring Tgc resistance in clinical A. baumannii isolates in which transcriptome analysis revealed resistance-conferring genes and related metabolism characteristics. PMID:27240484

  20. Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome

    PubMed Central

    Honaas, Loren A.; Wafula, Eric K.; Wickett, Norman J.; Der, Joshua P.; Zhang, Yeting; Edger, Patrick P.; Altman, Naomi S.; Pires, J. Chris; Leebens-Mack, James H.; dePamphilis, Claude W.

    2016-01-01

    Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo assemblies for Arabidopsis thaliana and Oryza sativa transcriptomes provide new insights into the strengths and limitations of transcriptome assembly strategies. We find that the leading assemblers generate reassuringly accurate assemblies for the majority of transcripts. At the same time, we find a propensity for assemblers to fail to fully assemble highly expressed genes. Surprisingly, the instance of true chimeric assemblies is very low for all assemblers. Normalized libraries are reduced in highly abundant transcripts, but they also lack 1000s of low abundance transcripts. We conclude that the quality of de novo transcriptome assemblies is best assessed through consideration of a combination of metrics: 1) proportion of reads mapping to an assembly 2) recovery of conserved, widely expressed genes, 3) N50 length statistics, and 4) the total number of unigenes. We provide benchmark Illumina transcriptome data and introduce SCERNA, a broadly applicable modular protocol for de novo assembly improvement. Finally, our de novo assembly of the Arabidopsis leaf transcriptome revealed ~20 putative Arabidopsis genes lacking in the current annotation. PMID:26731733
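
    Of the four metrics listed in this abstract, the N50 length statistic is the one most easily misread, so a minimal sketch of how it is commonly computed follows. The function name and the contig lengths are hypothetical illustrations, not taken from the paper, and the definition used (the contig length at which the running total of lengths, sorted longest first, reaches half of the assembled bases) is the conventional one rather than anything the authors specify.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Conventional definition (assumed here): sort contigs from longest to
    shortest and return the length of the contig at which the running
    total first reaches at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only.
contig_lengths = [100, 200, 300, 400, 500]
print(n50(contig_lengths))  # 500 + 400 = 900 >= 750, so this prints 400
```

    Because N50 rewards long contigs regardless of whether they are correct, it is reasonable to read it alongside the other three metrics rather than on its own, which is consistent with the abstract's recommendation to combine them.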

  1. Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome.

    PubMed

    Honaas, Loren A; Wafula, Eric K; Wickett, Norman J; Der, Joshua P; Zhang, Yeting; Edger, Patrick P; Altman, Naomi S; Pires, J Chris; Leebens-Mack, James H; dePamphilis, Claude W

    2016-01-01

    Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo assemblies for Arabidopsis thaliana and Oryza sativa transcriptomes provide new insights into the strengths and limitations of transcriptome assembly strategies. We find that the leading assemblers generate reassuringly accurate assemblies for the majority of transcripts. At the same time, we find a propensity for assemblers to fail to fully assemble highly expressed genes. Surprisingly, the instance of true chimeric assemblies is very low for all assemblers. Normalized libraries are reduced in highly abundant transcripts, but they also lack 1000s of low abundance transcripts. We conclude that the quality of de novo transcriptome assemblies is best assessed through consideration of a combination of metrics: 1) proportion of reads mapping to an assembly 2) recovery of conserved, widely expressed genes, 3) N50 length statistics, and 4) the total number of unigenes. We provide benchmark Illumina transcriptome data and introduce SCERNA, a broadly applicable modular protocol for de novo assembly improvement. Finally, our de novo assembly of the Arabidopsis leaf transcriptome revealed ~20 putative Arabidopsis genes lacking in the current annotation.

  2. Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome.

    PubMed

    Honaas, Loren A; Wafula, Eric K; Wickett, Norman J; Der, Joshua P; Zhang, Yeting; Edger, Patrick P; Altman, Naomi S; Pires, J Chris; Leebens-Mack, James H; dePamphilis, Claude W

    2016-01-01

    Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo assemblies for Arabidopsis thaliana and Oryza sativa transcriptomes provide new insights into the strengths and limitations of transcriptome assembly strategies. We find that the leading assemblers generate reassuringly accurate assemblies for the majority of transcripts. At the same time, we find a propensity for assemblers to fail to fully assemble highly expressed genes. Surprisingly, the instance of true chimeric assemblies is very low for all assemblers. Normalized libraries are reduced in highly abundant transcripts, but they also lack 1000s of low abundance transcripts. We conclude that the quality of de novo transcriptome assemblies is best assessed through consideration of a combination of metrics: 1) proportion of reads mapping to an assembly 2) recovery of conserved, widely expressed genes, 3) N50 length statistics, and 4) the total number of unigenes. We provide benchmark Illumina transcriptome data and introduce SCERNA, a broadly applicable modular protocol for de novo assembly improvement. Finally, our de novo assembly of the Arabidopsis leaf transcriptome revealed ~20 putative Arabidopsis genes lacking in the current annotation. PMID:26731733

  3. Local Adaptation at the Transcriptome Level in Brown Trout: Evidence from Early Life History Temperature Genomic Reaction Norms

    PubMed Central

    Meier, Kristian; Hansen, Michael Møller; Normandeau, Eric; Mensberg, Karen-Lise D.; Frydenberg, Jane; Larsen, Peter Foged; Bekkevold, Dorte; Bernatchez, Louis

    2014-01-01

    Local adaptation and its underlying molecular basis has long been a key focus in evolutionary biology. There has recently been increased interest in the evolutionary role of plasticity and the molecular mechanisms underlying local adaptation. Using transcriptome analysis, we assessed differences in gene expression profiles for three brown trout (Salmo trutta) populations, one resident and two anadromous, experiencing different temperature regimes in the wild. The study was based on an F2 generation raised in a common garden setting. A previous study of the F1 generation revealed different reaction norms and significantly higher QST than FST among populations for two early life-history traits. In the present study we investigated if genomic reaction norm patterns were also present at the transcriptome level. Eggs from the three populations were incubated at two temperatures (5 and 8 degrees C) representing conditions encountered in the local environments. Global gene expression for fry at the stage of first feeding was analysed using a 32k cDNA microarray. The results revealed differences in gene expression between populations and temperatures and population × temperature interactions, the latter indicating locally adapted reaction norms. Moreover, the reaction norms paralleled those observed previously at early life-history traits. We identified 90 cDNA clones among the genes with an interaction effect that were differently expressed between the ecologically divergent populations. These included genes involved in immune- and stress response. We observed less plasticity in the resident as compared to the anadromous populations, possibly reflecting that the degree of environmental heterogeneity encountered by individuals throughout their life cycle will select for variable level of phenotypic plasticity at the transcriptome level. Our study demonstrates the usefulness of transcriptome approaches to identify genes with different temperature reaction norms. The

  4. Intron-genome size relationship on a large evolutionary scale.

    PubMed

    Vinogradov, A E

    1999-09-01

    The intron-genome size relationship was studied across a wide evolutionary range (from slime mold and yeast to human and maize), as well as the relationship between genome size and the ratio of intervening/coding sequence size. The average intron size is scaled to genome size with a slope of about one-fourth for the log-transformed values; i.e., on the global scale its increase in evolution is lower than the increase in genome size by four orders of magnitude. There are exceptions to the general trend. In baker's yeast introns are extraordinarily long for its genome size. Tetrapods also have longer introns than expected for their genome sizes. In teleost fish the mean intron size does not differ significantly, notwithstanding the differences in genome size. In contrast to previous reports, avian introns were not found to be significantly shorter than introns of mammals, although avian genomes are smaller than genomes of mammals on average by about a factor of 2.5. The extra-/intragenic ratio of noncoding DNA can be higher in fungi than in animals, notwithstanding the smaller fungal genomes. In vertebrates and invertebrates taken separately, this ratio increases with genome size. Two hypotheses are proposed to explain the variation in the extra-/intragenic ratio of noncoding DNA in organisms with similar numbers of genes: transition (dynamic) and equilibrium (static). According to the transition model, this variation arises with the rapid shift of genome size because the bulk of extragenic DNA can be changed more rapidly than the finely interspersed intron sequences. The equilibrium model assumes that this variation is a result of selective adjustment of genome size with constraints imposed on the intron size due to its putative link to chromatin structure (and constraints of the splicing machinery). PMID:10473779
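
    The scaling statement in this abstract can be made concrete with a short worked relation. The symbols are introduced here for illustration only (I for average intron size, G for genome size, a for an intercept the abstract does not report), and the slope is taken as exactly one-fourth, whereas the abstract says "about one-fourth".

```latex
\log_{10} I \approx a + b\,\log_{10} G, \qquad b \approx \tfrac{1}{4}
\quad\Longrightarrow\quad I \propto G^{1/4}

\frac{I_2}{I_1} = \left(\frac{G_2}{G_1}\right)^{1/4}
\quad\text{so}\quad
\frac{G_2}{G_1} = 10^{4} \;\Rightarrow\; \frac{I_2}{I_1} = \left(10^{4}\right)^{1/4} = 10
```

    Under these assumptions, a genome that is four orders of magnitude larger would be expected to carry introns roughly one order of magnitude longer on average, which is one way to read the abstract's "four orders of magnitude" remark.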

  5. LEMONS – A Tool for the Identification of Splice Junctions in Transcriptomes of Organisms Lacking Reference Genomes

    PubMed Central

    Bouskila, Amos; Chorev, Michal; Carmel, Liran; Mishmar, Dan

    2015-01-01

    RNA-seq is becoming a preferred tool for genomics studies of model and non-model organisms. However, DNA-based analysis of organisms lacking sequenced genomes cannot rely on RNA-seq data alone to isolate most genes of interest, as DNA codes both exons and introns. With this in mind, we designed a novel tool, LEMONS, that exploits the evolutionary conservation of both exon/intron boundary positions and splice junction recognition signals to produce high throughput splice-junction predictions in the absence of a reference genome. When tested on multiple annotated vertebrate mRNA data, LEMONS accurately identified 87% (average) of the splice-junctions. LEMONS was then applied to our updated Mediterranean chameleon transcriptome, which lacks a reference genome, and predicted a total of 90,820 exon-exon junctions. We experimentally verified these splice-junction predictions by amplifying and sequencing twenty randomly selected genes from chameleon DNA templates. Exons and introns were detected in 19 of 20 of the positions predicted by LEMONS. To the best of our knowledge, LEMONS is currently the only experimentally verified tool that can accurately predict splice-junctions in organisms that lack a reference genome. PMID:26606265

  6. Impact of a short-term exposure to spaceflight on the phenotype, genome, transcriptome and proteome of Escherichia coli

    NASA Astrophysics Data System (ADS)

    Li, Tianzhi; Chang, De; Xu, Huiwen; Chen, Jiapeng; Su, Longxiang; Guo, Yinghua; Chen, Zhenhong; Wang, Yajuan; Wang, Li; Wang, Junfeng; Fang, Xiangqun; Liu, Changting

    2015-07-01

    Escherichia coli (E. coli) is the most widely applied model organism in current biological science. As a widespread opportunistic pathogen, E. coli can survive not only by symbiosis with humans, but also outside the host, which necessitates the evaluation of its response to the space environment. Therefore, to keep humans safe in space, it is necessary to understand how the bacteria respond to this environment. Despite extensive investigations for a few decades, the response of E. coli to the real space environment is still controversial. To better understand the mechanisms by which E. coli overcomes harsh environments such as microgravity in space and to investigate whether these factors may induce pathogenic changes in E. coli that are potentially detrimental to astronauts, we conducted detailed genomic, transcriptomic and proteomic studies on E. coli that experienced 17 days of spaceflight. By comparing two flight strains LCT-EC52 and LCT-EC59 to a control strain LCT-EC106 that was cultured under the same temperature conditions on the ground, we identified metabolism changes, polymorphism changes, differentially expressed genes and proteins in the two flight strains. The flight strains differed from the control in the utilization of more than 30 carbon sources. Two single nucleotide polymorphisms (SNPs) and one deletion were identified in the flight strains. The expression levels of more than 1000 genes were altered in the flight strains. Genes involved in chemotaxis, lipid metabolism and cell motility were differentially expressed. Moreover, the two flight strains also differed extensively from each other in terms of metabolism, transcriptome and proteome, indicating that the impact of the space environment on individual cells is heterogeneous and probably genotype-dependent. This study presents the first systematic profile of the E. coli genome, transcriptome and proteome after spaceflight, which helps to elucidate the mechanism that controls the adaptation of microbes to the space

  7. Using Genome-Scale Models to Predict Biological Capabilities

    PubMed Central

    O’Brien, Edward J.; Monk, Jonathan M.; Palsson, Bernhard O.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods at the genome-scale have been under development since the first whole genome sequences appeared in the mid-1990s. A few years ago this approach began to demonstrate the ability to predict a range of cellular functions including cellular growth capabilities on various substrates and the effect of gene knockouts at the genome-scale. Thus, much interest has developed in understanding and applying these methods to areas such as metabolic engineering, antibiotic design, and organismal and enzyme evolution. This primer will get you started. PMID:26000478

  8. Identification of Cilia Genes That Affect Cell-Cycle Progression Using Whole-Genome Transcriptome Analysis in Chlamydomonas reinhardtii

    PubMed Central

    Albee, Alison J.; Kwan, Alan L.; Lin, Huawen; Granas, David; Stormo, Gary D.; Dutcher, Susan K.

    2013-01-01

    Cilia are microtubule based organelles that project from cells. Cilia are found on almost every cell type of the human body and numerous diseases, collectively termed ciliopathies, are associated with defects in cilia, including respiratory infections, male infertility, situs inversus, polycystic kidney disease, retinal degeneration, and Bardet-Biedl Syndrome. Here we show that Illumina-based whole-genome transcriptome analysis in the biflagellate green alga Chlamydomonas reinhardtii identifies 1850 genes up-regulated during ciliogenesis, 4392 genes down-regulated, and 4548 genes with no change in expression during ciliogenesis. We examined four genes up-regulated and not previously known to be involved with cilia (ZMYND10, NXN, GLOD4, SPATA4) by knockdown of the human orthologs in human retinal pigment epithelial cells (hTERT-RPE1) cells to ask whether they are involved in cilia-related processes that include cilia assembly, cilia length control, basal body/centriole numbers, and the distance between basal bodies/centrioles. All of the genes have cilia-related phenotypes and, surprisingly, our data show that knockdown of GLOD4 and SPATA4 also affects the cell cycle. These results demonstrate that whole-genome transcriptome analysis during ciliogenesis is a powerful tool to gain insight into the molecular mechanism by which centrosomes and cilia are assembled. PMID:23604077

  9. Comparative transcriptome assembly and genome-guided profiling for Brettanomyces bruxellensis LAMAP2480 during p-coumaric acid stress

    PubMed Central

    Godoy, Liliana; Vera-Wolf, Patricia; Martinez, Claudio; Ugalde, Juan A.; Ganga, María Angélica

    2016-01-01

    Brettanomyces bruxellensis has been described as the main contaminant yeast in wine production, due to its ability to convert the hydroxycinnamic acids naturally present in the grape phenolic derivatives, into volatile phenols. Currently, there are no studies in B. bruxellensis that explain the resistance mechanisms to hydroxycinnamic acids, and in particular to p-coumaric acid which is directly involved in alterations to wine. In this work, we performed a transcriptome analysis of B. bruxellensis LAMAP2480 grown in the presence and absence of p-coumaric acid during lag phase. Because of reported genetic variability among B. bruxellensis strains, to complement de novo assembly of the transcripts, we used the high-quality genome of B. bruxellensis AWRI1499, as well as the draft genomes of strains CBS2499 and LAMAP2480. The results from the transcriptome analysis allowed us to propose a model in which the entrance of p-coumaric acid to the cell generates a generalized stress condition, in which the expression of proton pump and efflux of toxic compounds are induced. In addition, these mechanisms could be involved in the outflux of nitrogen compounds, such as amino acids, decreasing the overall concentration and triggering the expression of nitrogen metabolism genes. PMID:27678167

  10. The use of transcriptomic next-generation sequencing data to assemble mitochondrial genomes of Ancistrus spp. (Loricariidae).

    PubMed

    Moreira, Daniel A; Furtado, Carolina; Parente, Thiago E

    2015-11-15

    Mitochondrial genes and genomes have long been applied in phylogenetics. Current protocols to sequence mitochondrial genomes rely almost exclusively on long range PCR or on direct sequencing. While long range PCR introduces unnecessary biases, the purification of mtDNA for direct sequencing is not straightforward. We used total RNA extracted from liver and Illumina HiSeq technology to sequence mitochondrial transcripts from three fish (Ancistrus spp.) and assemble their mitogenomes. Based on the mtDNA sequence of a closely related species, we estimate that we sequenced 92%, 95% and 99% of the mitogenomes. Taking the sequences together, we sequenced all 13 protein-coding genes, two ribosomal RNAs, 22 tRNAs and the D-loop known in vertebrate mitogenomes. The use of transcriptomic data allowed us to observe the punctuation pattern of mtRNA maturation, analyze the transcriptional profile, and detect heteroplasmic sites. The assembly of mtDNA from transcriptomic data is complementary to other approaches and overcomes some limitations of traditional strategies for sequencing mitogenomes. Moreover, this approach is faster than traditional methods and allows a clear identification of genes, in particular for tRNAs and rRNAs.

  11. Comparative sequence analyses of genome and transcriptome reveal novel transcripts and variants in the Asian elephant Elephas maximus.

    PubMed

    Reddy, Puli Chandramouli; Sinha, Ishani; Kelkar, Ashwin; Habib, Farhat; Pradhan, Saurabh J; Sukumar, Raman; Galande, Sanjeev

    2015-12-01

    The Asian elephant Elephas maximus and the African elephant Loxodonta africana, which diverged 5-7 million years ago, exhibit differences in their physiology, behaviour and morphology. A comparative genomics approach would be useful and necessary for evolutionary and functional genetic studies of elephants. We sequenced E. maximus and mapped the reads to L. africana at ~15X coverage. Through comparative sequence analyses, we have identified Asian elephant-specific homozygous, non-synonymous single nucleotide variants (SNVs) that map to 1514 protein coding genes, many of which are involved in olfaction. We also present the first report of a high-coverage transcriptome sequence in E. maximus from peripheral blood lymphocytes. We have identified 103 novel protein coding transcripts and 66 long non-coding (lnc)RNAs. We also report the presence of 181 protein domains unique to elephants when compared to other Afrotheria species. Each of these findings can be further investigated to gain a better understanding of functional differences unique to elephant species, as well as those unique to elephantids in comparison with other mammals. This work therefore provides a valuable resource to explore the immense research potential of comparative analyses of transcriptome and genome sequences in the Asian elephant.

  12. The use of transcriptomic next-generation sequencing data to assemble mitochondrial genomes of Ancistrus spp. (Loricariidae).

    PubMed

    Moreira, Daniel A; Furtado, Carolina; Parente, Thiago E

    2015-11-15

    Mitochondrial genes and genomes have long been applied in phylogenetics. Current protocols to sequence mitochondrial genomes rely almost exclusively on long range PCR or on direct sequencing. While long range PCR introduces unnecessary biases, the purification of mtDNA for direct sequencing is not straightforward. We used total RNA extracted from liver and Illumina HiSeq technology to sequence mitochondrial transcripts from three fish (Ancistrus spp.) and assemble their mitogenomes. Based on the mtDNA sequence of a closely related species, we estimate that we sequenced 92%, 95% and 99% of the mitogenomes. Taking the sequences together, we sequenced all 13 protein-coding genes, two ribosomal RNAs, 22 tRNAs and the D-loop known in vertebrate mitogenomes. The use of transcriptomic data allowed us to observe the punctuation pattern of mtRNA maturation, analyze the transcriptional profile, and detect heteroplasmic sites. The assembly of mtDNA from transcriptomic data is complementary to other approaches and overcomes some limitations of traditional strategies for sequencing mitogenomes. Moreover, this approach is faster than traditional methods and allows a clear identification of genes, in particular for tRNAs and rRNAs. PMID:26344710

  13. Comparative sequence analyses of genome and transcriptome reveal novel transcripts and variants in the Asian elephant Elephas maximus.

    PubMed

    Reddy, Puli Chandramouli; Sinha, Ishani; Kelkar, Ashwin; Habib, Farhat; Pradhan, Saurabh J; Sukumar, Raman; Galande, Sanjeev

    2015-12-01

    The Asian elephant Elephas maximus and the African elephant Loxodonta africana, which diverged 5-7 million years ago, exhibit differences in their physiology, behaviour and morphology. A comparative genomics approach would be useful and necessary for evolutionary and functional genetic studies of elephants. We sequenced E. maximus and mapped the reads to L. africana at ~15X coverage. Through comparative sequence analyses, we have identified Asian elephant-specific homozygous, non-synonymous single nucleotide variants (SNVs) that map to 1514 protein coding genes, many of which are involved in olfaction. We also present the first report of a high-coverage transcriptome sequence in E. maximus from peripheral blood lymphocytes. We have identified 103 novel protein coding transcripts and 66 long non-coding (lnc)RNAs. We also report the presence of 181 protein domains unique to elephants when compared to other Afrotheria species. Each of these findings can be further investigated to gain a better understanding of functional differences unique to elephant species, as well as those unique to elephantids in comparison with other mammals. This work therefore provides a valuable resource to explore the immense research potential of comparative analyses of transcriptome and genome sequences in the Asian elephant. PMID:26648035

  14. De novo assembly of a genome-wide transcriptome map of Vicia faba (L.) for transfer cell research

    PubMed Central

    Arun-Chinnappa, Kiruba S.; McCurdy, David W.

    2015-01-01

    Vicia faba (L.) is an important cool-season grain legume species used widely in agriculture but also in plant physiology research, particularly as an experimental model to study transfer cell (TC) development. TCs are specialized nutrient transport cells in plants, characterized by invaginated wall ingrowths with amplified plasma membrane surface area enriched with transporter proteins that facilitate nutrient transfer. Many TCs are formed by trans-differentiation from differentiated cells at apoplasmic/symplasmic boundaries in nutrient transport. Adaxial epidermal cells of isolated cotyledons can be induced to form functional TCs, thus providing a valuable experimental system to investigate genetic regulation of TC trans-differentiation. The genome of V. faba is exceedingly large (ca. 13 Gb), however, and limited genomic information is available for this species. To provide a resource for future transcript profiling of epidermal TC differentiation, we have undertaken de novo assembly of a genome-wide transcriptome map for V. faba. Illumina paired-end sequencing of total RNA pooled from different tissues and different stages, including isolated cotyledons induced to form epidermal TCs, generated 69.5 M reads, of which 65.8 M were used for assembly following trimming and quality control. Assembly using a De-Bruijn graph-based approach generated 21,297 contigs, of which 80.6% were successfully annotated against GO terms. The assembly was validated against known V. faba cDNAs held in GenBank, including transcripts previously identified as being specifically expressed in epidermal cells across TC trans-differentiation. This genome-wide transcriptome map therefore provides a valuable tool for future transcript profiling of epidermal TC trans-differentiation, and also enriches the genetic resources available for this important legume crop species. PMID:25914703

  15. Genome-scale resources for Thermoanaerobacterium saccharolyticum

    DOE PAGESBeta

    Currie, Devin H.; Raman, Babu; Gowen, Christopher M.; Tschaplinski, Timothy J.; Land, Miriam L.; Brown, Steven D.; Covalla, Sean; Klingeman, Dawn Marie; Yang, Zamin Koo; Engle, Nancy L.; et al

    2015-06-26

    Thermoanaerobacterium saccharolyticum is a hemicellulose-degrading thermophilic anaerobe that was previously engineered to produce ethanol at high yield. For this research, a major project was undertaken to develop this organism into an industrial biocatalyst, but the lack of genome information and resources were recognized early on as a key limitation.

  16. Transcriptome population genomics reveals severe bottleneck and domestication cost in the African rice (Oryza glaberrima).

    PubMed

    Nabholz, Benoit; Sarah, Gautier; Sabot, François; Ruiz, Manuel; Adam, Hélène; Nidelet, Sabine; Ghesquière, Alain; Santoni, Sylvain; David, Jacques; Glémin, Sylvain

    2014-05-01

    The African cultivated rice (Oryza glaberrima) was domesticated in West Africa 3000 years ago. Although less cultivated than the Asian rice (O. sativa), O. glaberrima landraces often display interesting adaptation to rustic environments (e.g. drought). Here, using RNA-seq technology, we were able to compare more than 12,000 transcripts between 9 O. glaberrima, 10 wild O. barthii and one O. meridionalis individuals. With a synonymous nucleotide diversity πs = 0.0006 per site, O. glaberrima appears as the least genetically diverse crop grass ever documented. Using approximate Bayesian computation, we estimated that O. glaberrima experienced a severe bottleneck during domestication. This demographic scenario almost fully accounts for the pattern of genetic diversity across the O. glaberrima genome, as we detected very few outlier regions where positive selection may have further impacted genetic diversity. Moreover, the large excess of derived nonsynonymous substitutions that we detected suggests that the O. glaberrima population suffered from the 'cost of domestication'. In addition, we used this genome-scale data set to demonstrate that (i) O. barthii genetic diversity is positively correlated with recombination rate and negatively with gene density, (ii) expression level is negatively correlated with evolutionary constraint, and (iii) one region on chromosome 5 (position 4-6 Mb) exhibits a clear signature of introgression with a yet unidentified Oryza species. This work represents the first genome-wide survey of the African rice genetic diversity and paves the way for further comparison between the African and the Asian rice, notably regarding the genetics underlying domestication traits.
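
    The diversity figure quoted in this abstract (πs, per site) is an average of pairwise differences between sequences, and a minimal sketch may help show what that means. This is not the authors' pipeline: the function name and the toy sequences are hypothetical, the alignment is assumed to be gap-free, and every column is assumed to be a synonymous site, whereas a real πs estimate would first have to classify sites as synonymous or not.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site.

    Assumes equal-length, gap-free aligned sequences. To approximate
    a synonymous diversity, the caller would pass only synonymous
    sites (an assumption of this sketch, not checked here).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    differences = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return differences / (len(pairs) * length)


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["ACGT", "ACGA", "ACGT"]
print(nucleotide_diversity(toy_alignment))  # 2 differences / (3 pairs * 4 sites)
```

    On this scale, the reported value of 0.0006 would correspond to roughly six differences per 10,000 synonymous sites between two randomly chosen sequences.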

  17. The OME Framework for genome-scale systems biology

    SciTech Connect

    Palsson, Bernhard O.; Ebrahim, Ali; Federowicz, Steve

    2014-12-19

    The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high-throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals. While many software tools exist for handling genome-scale

  18. Comparative Life Cycle Transcriptomics Revises Leishmania mexicana Genome Annotation and Links a Chromosome Duplication with Parasitism of Vertebrates

    PubMed Central

    Fiebig, Michael; Kelly, Steven; Gluenz, Eva

    2015-01-01

    Leishmania spp. are protozoan parasites that have two principal life cycle stages: the motile promastigote forms that live in the alimentary tract of the sandfly and the amastigote forms, which are adapted to survive and replicate in the harsh conditions of the phagolysosome of mammalian macrophages. Here, we used Illumina sequencing of poly-A selected RNA to characterise and compare the transcriptomes of L. mexicana promastigotes, axenic amastigotes and intracellular amastigotes. These data allowed the production of the first transcriptome evidence-based annotation of gene models for this species, including genome-wide mapping of trans-splice sites and poly-A addition sites. The revised genome annotation encompassed 9,169 protein-coding genes including 936 novel genes as well as modifications to previously existing gene models. Comparative analysis of gene expression across promastigote and amastigote forms revealed that 3,832 genes are differentially expressed between promastigotes and intracellular amastigotes. A large proportion of genes that were downregulated during differentiation to amastigotes were associated with the function of the motile flagellum. In contrast, those genes that were upregulated included cell surface proteins, transporters, peptidases and many uncharacterized genes, including 293 of the 936 novel genes. Genome-wide distribution analysis of the differentially expressed genes revealed that the tetraploid chromosome 30 is highly enriched for genes that were upregulated in amastigotes, providing the first evidence of a link between this whole chromosome duplication event and adaptation to the vertebrate host in this group. Peptide evidence for 42 proteins encoded by novel transcripts supports the idea of an as yet uncharacterised set of small proteins in Leishmania spp. with possible implications for host-pathogen interactions. PMID:26452044

  19. Genomic and Transcriptomic Associations Identify a New Insecticide Resistance Phenotype for the Selective Sweep at the Cyp6g1 Locus of Drosophila melanogaster.

    PubMed

    Battlay, Paul; Schmidt, Joshua M; Fournier-Level, Alexandre; Robin, Charles

    2016-01-01

    Scans of the Drosophila melanogaster genome have identified organophosphate resistance loci among those with the most pronounced signature of positive selection. In this study, the molecular basis of resistance to the organophosphate insecticide azinphos-methyl was investigated using the Drosophila Genetic Reference Panel, and genome-wide association. Recently released full transcriptome data were used to extend the utility of the Drosophila Genetic Reference Panel resource beyond traditional genome-wide association studies to allow systems genetics analyses of phenotypes. We found that both genomic and transcriptomic associations independently identified Cyp6g1, a gene involved in resistance to DDT and neonicotinoid insecticides, as the top candidate for azinphos-methyl resistance. This was verified by transgenically overexpressing Cyp6g1 using natural regulatory elements from a resistant allele, resulting in a 6.5-fold increase in resistance. We also identified four novel candidate genes associated with azinphos-methyl resistance, all of which are involved in either regulation of fat storage, or nervous system development. In Cyp6g1, we find a demonstrable resistance locus, a verification that transcriptome data can be used to identify variants associated with insecticide resistance, and an overlap between peaks of a genome-wide association study, and a genome-wide selective sweep analysis. PMID:27317781

  20. Genomic and Transcriptomic Associations Identify a New Insecticide Resistance Phenotype for the Selective Sweep at the Cyp6g1 Locus of Drosophila melanogaster

    PubMed Central

    Battlay, Paul; Schmidt, Joshua M.; Fournier-Level, Alexandre; Robin, Charles

    2016-01-01

    Scans of the Drosophila melanogaster genome have identified organophosphate resistance loci among those with the most pronounced signature of positive selection. In this study, the molecular basis of resistance to the organophosphate insecticide azinphos-methyl was investigated using the Drosophila Genetic Reference Panel, and genome-wide association. Recently released full transcriptome data were used to extend the utility of the Drosophila Genetic Reference Panel resource beyond traditional genome-wide association studies to allow systems genetics analyses of phenotypes. We found that both genomic and transcriptomic associations independently identified Cyp6g1, a gene involved in resistance to DDT and neonicotinoid insecticides, as the top candidate for azinphos-methyl resistance. This was verified by transgenically overexpressing Cyp6g1 using natural regulatory elements from a resistant allele, resulting in a 6.5-fold increase in resistance. We also identified four novel candidate genes associated with azinphos-methyl resistance, all of which are involved in either regulation of fat storage, or nervous system development. In Cyp6g1, we find a demonstrable resistance locus, a verification that transcriptome data can be used to identify variants associated with insecticide resistance, and an overlap between peaks of a genome-wide association study, and a genome-wide selective sweep analysis. PMID:27317781

  1. INTERPRETING PERSONAL TRANSCRIPTOMES: PERSONALIZED MECHANISM-SCALE PROFILING OF RNA-SEQ DATA

    PubMed Central

    Perez-Rathke, Alan; Li, Haiquan

    2013-01-01

    Despite thousands of reported studies unveiling gene-level signatures for complex diseases, few of these techniques work at the single-sample level with explicit underpinning of biological mechanisms. This presents both a critical dilemma in the field of personalized medicine as well as a plethora of opportunities for analysis of RNA-seq data. In this study, we hypothesize that the “Functional Analysis of Individual Microarray Expression” (FAIME) method we developed could be smoothly extended to RNA-seq data and unveil intrinsic underlying mechanism signatures across different scales of biological data for the same complex disease. Using publicly available RNA-seq data for gastric cancer, we confirmed the effectiveness of this method (i) to translate each sample transcriptome to pathway-scale scores, (ii) to predict deregulated pathways in gastric cancer against gold standards (FDR<5%, Precision=75%, Recall=92%), and (iii) to predict phenotypes in an independent dataset and expression platform (RNA-seq vs microarrays, Fisher Exact Test p<10^-6). Measuring at a single-sample level, FAIME could differentiate cancer samples from normal ones; furthermore, it achieved comparable performance in identifying differentially expressed pathways as compared to state-of-the-art cross-sample methods. These results motivate future work on mechanism-level biomarker discovery predictive of diagnoses, treatment, and therapy. PMID:23424121

  2. Pathway analysis of genome-wide association study and transcriptome data highlights new biological pathways in colorectal cancer.

    PubMed

    Quan, Baoku; Qi, Xingsi; Yu, Zhihui; Jiang, Yongshuai; Liao, Mingzhi; Wang, Guangyu; Feng, Rennan; Zhang, Liangcai; Chen, Zugen; Jiang, Qinghua; Liu, Guiyou

    2015-04-01

    Colorectal cancer (CRC) is a common malignancy that meets the definition of a complex disease. Genome-wide association studies (GWAS) have identified several loci of weak predictive value in CRC; however, these do not fully explain the occurrence risk. Recently, gene set analysis has allowed enhanced interpretation of GWAS data in CRC, identifying a number of metabolic pathways as important for disease pathogenesis. Whether there are other important pathways involved in CRC, however, remains unclear. We present a systems analysis of KEGG pathways in CRC using (1) a human CRC GWAS dataset and (2) a human whole transcriptome CRC case-control expression dataset. Analysis of the GWAS dataset revealed significantly enriched KEGG pathways related to metabolism, immune system and diseases, cellular processes, environmental information processing, genetic information processing, and neurodegenerative diseases. Altered gene expression was confirmed in these pathways using the transcriptome dataset. Taken together, these findings not only confirm previous work in this area, but also highlight new biological pathways whose deregulation is critical for CRC. These results contribute to our understanding of disease-causing mechanisms and will prove useful for future genetic and functional studies in CRC. PMID:25362561

  3. Brain transcriptome of the violet-eared waxbill Uraeginthus granatina and recent evolution in the songbird genome

    PubMed Central

    Balakrishnan, Christopher N.; Chapus, Charles; Brewer, Michael S.; Clayton, David F.

    2013-01-01

    Songbirds are important models for the study of social behaviour and communication. To complement the recent genome sequencing of the domesticated zebra finch, we sequenced the brain transcriptome of a closely related songbird species, the violet-eared waxbill (Uraeginthus granatina). Both the zebra finch and violet-eared waxbill are members of the family Estrildidae, but differ markedly in their social behaviour. Using Roche 454 RNA sequencing, we generated an assembly and annotation of 11 084 waxbill orthologues of 17 475 zebra finch genes (64%), with an average transcript length of 1555 bp. We also identified 5985 single nucleotide polymorphisms (SNPs) of potential utility for future population genomic studies. Comparing the two species, we found evidence for rapid protein evolution (ω) and low polymorphism of the avian Z sex chromosome, consistent with prior studies of more divergent avian species. An intriguing outlier was putative chromosome 4A, which showed a high density of SNPs and low evolutionary rate relative to other chromosomes. Genome-wide ω was identical in zebra finch and violet-eared waxbill lineages, suggesting a similar demographic history with efficient purifying natural selection. Further comparisons of these and other estrildid finches may provide insights into the evolutionary neurogenomics of social behaviour. PMID:24004662

  4. Characterization of the mechanism of prolonged adaptation to osmotic stress of Jeotgalibacillus malaysiensis via genome and transcriptome sequencing analyses

    PubMed Central

    Yaakop, Amira Suriaty; Chan, Kok-Gan; Ee, Robson; Lim, Yan Lue; Lee, Siew-Kim; Manan, Fazilah Abd; Goh, Kian Mau

    2016-01-01

    Jeotgalibacillus malaysiensis, a moderate halophilic bacterium isolated from a pelagic area, can endure higher concentrations of sodium chloride (NaCl) than other Jeotgalibacillus type strains. In this study, we therefore chose to sequence and assemble the entire J. malaysiensis genome. This is the first report to provide a detailed analysis of the genomic features of J. malaysiensis, and to perform genetic comparisons between this microorganism and other halophiles. J. malaysiensis encodes a native megaplasmid (pJeoMA), which is greater than 600 kilobases in size, that is absent from other sequenced species of Jeotgalibacillus. Subsequently, RNA-Seq-based transcriptome analysis was utilised to examine adaptations of J. malaysiensis to osmotic stress. Specifically, the eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) and KEGG (Kyoto Encyclopaedia of Genes and Genomes) databases were used to elucidate the overall effects of osmotic stress on the organism. Generally, saline stress significantly affected carbohydrate, energy, and amino acid metabolism, as well as fatty acid biosynthesis. Our findings also indicate that J. malaysiensis adopted a combination of approaches, including the uptake or synthesis of osmoprotectants, for surviving salt stress. Among these, proline synthesis appeared to be the preferred method for withstanding prolonged osmotic stress in J. malaysiensis. PMID:27641516

  5. Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen

    PubMed Central

    DiGuistini, Scott; Wang, Ye; Liao, Nancy Y.; Taylor, Greg; Tanguay, Philippe; Feau, Nicolas; Henrissat, Bernard; Chan, Simon K.; Hesse-Orce, Uljana; Alamouti, Sepideh Massoumi; Tsui, Clement K. M.; Docking, Roderick T.; Levasseur, Anthony; Haridas, Sajeet; Robertson, Gordon; Birol, Inanc; Holt, Robert A.; Marra, Marco A.; Hamelin, Richard C.; Hirst, Martin; Jones, Steven J. M.; Bohlmann, Jörg; Breuil, Colette

    2011-01-01

    In western North America, the current outbreak of the mountain pine beetle (MPB) and its microbial associates has destroyed wide areas of lodgepole pine forest, including more than 16 million hectares in British Columbia. Grosmannia clavigera (Gc), a critical component of the outbreak, is a symbiont of the MPB and a pathogen of pine trees. To better understand the interactions between Gc, MPB, and lodgepole pine hosts, we sequenced the ∼30-Mb Gc genome and assembled it into 18 supercontigs. We predict 8,314 protein-coding genes, and support the gene models with proteome, expressed sequence tag, and RNA-seq data. We establish that Gc is heterothallic, and report evidence for repeat-induced point mutation. We report insights, from genome and transcriptome analyses, into how Gc tolerates conifer-defense chemicals, including oleoresin terpenoids, as they colonize a host tree. RNA-seq data indicate that terpenoids induce a substantial antimicrobial stress in Gc, and suggest that the fungus may detoxify these chemicals by using them as a carbon source. Terpenoid treatment strongly activated a ∼100-kb region of the Gc genome that contains a set of genes that may be important for detoxification of these host-defense chemicals. This work is a major step toward understanding the biological interactions between the tripartite MPB/fungus/forest system. PMID:21262841

  6. A blow to the fly - Lucilia cuprina draft genome and transcriptome to support advances in biology and biotechnology.

    PubMed

    Anstead, Clare A; Batterham, Philip; Korhonen, Pasi K; Young, Neil D; Hall, Ross S; Bowles, Vernon M; Richards, Stephen; Scott, Maxwell J; Gasser, Robin B

    2016-01-01

    The blow fly, Lucilia cuprina (Wiedemann, 1830) is a parasitic insect of major global economic importance. Maggots of this fly parasitize the skin of animal hosts, feed on excretions and tissues, and cause severe disease (flystrike or myiasis). Although there has been considerable research on L. cuprina over the years, little is understood about the molecular biology, biochemistry and genetics of this parasitic fly, as well as its relationship with its hosts and the disease that it causes. This situation might change with the recent report of the draft genome and transcriptome of this blow fly, which has given new and global insights into its biology, interactions with the host animal and aspects of insecticide resistance at the molecular level. This genomic resource will likely enable many fundamental and applied research areas in the future. The present article gives a background on L. cuprina and myiasis, a brief account of past and current treatment, prevention and control approaches, and provides a perspective on the impact that the L. cuprina genome should have on future research of this and related parasitic flies, and the design of new and improved interventions for myiasis. PMID:26944522

  7. Genome-Wide Host-Pathogen Interaction Unveiled by Transcriptomic Response of Diamondback Moth to Fungal Infection

    PubMed Central

    Chu, Zhen-Jian; Wang, Yu-Jun; Ying, Sheng-Hua; Wang, Xiao-Wei; Feng, Ming-Guang

    2016-01-01

    Genome-wide insight into insect pest response to the infection of Beauveria bassiana (fungal insect pathogen) is critical for genetic improvement of fungal insecticides but has been poorly explored. We constructed three pairs of transcriptomes of Plutella xylostella larvae at 24, 36 and 48 hours post treatment of infection (hptI) and of control (hptC) for insight into the host-pathogen interaction at genomic level. There were 2143, 3200 and 2967 host genes differentially expressed at 24, 36 and 48 hptI/hptC respectively. These infection-responsive genes (~15% of the host genome) were enriched in various immune processes, such as complement and coagulation cascades, protein digestion and absorption, and drug metabolism-cytochrome P450. Fungal penetration into cuticle and host defense reaction began at 24 hptI, followed by most intensive host immune response at 36 hptI and attenuated immunity at 48 hptI. Contrastingly, 44% of fungal genes were differentially expressed in the infection course and enriched in several biological processes, such as antioxidant activity, peroxidase activity and proteolysis. There were 1636 fungal genes co-expressed during 24–48 hptI, including 116 encoding putative secretion proteins. Our results provide novel insights into the insect-pathogen interaction and help to probe molecular mechanisms involved in the fungal infection to the global pest. PMID:27043942

  8. Characterization of the mechanism of prolonged adaptation to osmotic stress of Jeotgalibacillus malaysiensis via genome and transcriptome sequencing analyses.

    PubMed

    Yaakop, Amira Suriaty; Chan, Kok-Gan; Ee, Robson; Lim, Yan Lue; Lee, Siew-Kim; Manan, Fazilah Abd; Goh, Kian Mau

    2016-01-01

    Jeotgalibacillus malaysiensis, a moderate halophilic bacterium isolated from a pelagic area, can endure higher concentrations of sodium chloride (NaCl) than other Jeotgalibacillus type strains. In this study, we therefore chose to sequence and assemble the entire J. malaysiensis genome. This is the first report to provide a detailed analysis of the genomic features of J. malaysiensis, and to perform genetic comparisons between this microorganism and other halophiles. J. malaysiensis encodes a native megaplasmid (pJeoMA), which is greater than 600 kilobases in size, that is absent from other sequenced species of Jeotgalibacillus. Subsequently, RNA-Seq-based transcriptome analysis was utilised to examine adaptations of J. malaysiensis to osmotic stress. Specifically, the eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) and KEGG (Kyoto Encyclopaedia of Genes and Genomes) databases were used to elucidate the overall effects of osmotic stress on the organism. Generally, saline stress significantly affected carbohydrate, energy, and amino acid metabolism, as well as fatty acid biosynthesis. Our findings also indicate that J. malaysiensis adopted a combination of approaches, including the uptake or synthesis of osmoprotectants, for surviving salt stress. Among these, proline synthesis appeared to be the preferred method for withstanding prolonged osmotic stress in J. malaysiensis. PMID:27641516

  9. Complete Genome Sequence and Transcriptomic Analysis of the Novel Pathogen Elizabethkingia anophelis in Response to Oxidative Stress

    PubMed Central

    Li, Yingying; Liu, Yang; Chew, Su Chuen; Tay, Martin; Salido, May Margarette Santillan; Teo, Jeanette; Lauro, Federico M.; Givskov, Michael; Yang, Liang

    2015-01-01

    Elizabethkingia anophelis is an emerging pathogen that can cause life-threatening infections in neonates, severely immunocompromised and postoperative patients. The lack of genomic information on E. anophelis hinders our understanding of its mechanisms of pathogenesis. Here, we report the first complete genome sequence of E. anophelis NUHP1 and assess its response to oxidative stress. Elizabethkingia anophelis NUHP1 has a circular genome of 4,369,828 base pairs and 4,141 predicted coding sequences. Sequence analysis indicates that E. anophelis has well-developed systems for scavenging iron and stress response. Many putative virulence factors and antibiotic resistance genes were identified, underscoring potential host–pathogen interactions and antibiotic resistance. RNA-sequencing-based transcriptome profiling indicates that expression of genes involved in synthesis of a yersiniabactin-like iron siderophore and heme utilization is highly induced as a protective mechanism toward oxidative stress caused by hydrogen peroxide treatment. Chrome azurol sulfonate assay verified that siderophore production of E. anophelis is increased in the presence of oxidative stress. We further showed that hemoglobin facilitates the growth, hydrogen peroxide tolerance, cell attachment, and biofilm formation of E. anophelis NUHP1. Our study suggests that siderophore production and heme uptake pathways might play essential roles in stress response and virulence of the emerging pathogen E. anophelis. PMID:26019164

  10. Genome-Wide Host-Pathogen Interaction Unveiled by Transcriptomic Response of Diamondback Moth to Fungal Infection.

    PubMed

    Chu, Zhen-Jian; Wang, Yu-Jun; Ying, Sheng-Hua; Wang, Xiao-Wei; Feng, Ming-Guang

    2016-01-01

    Genome-wide insight into insect pest response to the infection of Beauveria bassiana (fungal insect pathogen) is critical for genetic improvement of fungal insecticides but has been poorly explored. We constructed three pairs of transcriptomes of Plutella xylostella larvae at 24, 36 and 48 hours post treatment of infection (hptI) and of control (hptC) for insight into the host-pathogen interaction at genomic level. There were 2143, 3200 and 2967 host genes differentially expressed at 24, 36 and 48 hptI/hptC respectively. These infection-responsive genes (~15% of the host genome) were enriched in various immune processes, such as complement and coagulation cascades, protein digestion and absorption, and drug metabolism-cytochrome P450. Fungal penetration into cuticle and host defense reaction began at 24 hptI, followed by most intensive host immune response at 36 hptI and attenuated immunity at 48 hptI. Contrastingly, 44% of fungal genes were differentially expressed in the infection course and enriched in several biological processes, such as antioxidant activity, peroxidase activity and proteolysis. There were 1636 fungal genes co-expressed during 24-48 hptI, including 116 encoding putative secretion proteins. Our results provide novel insights into the insect-pathogen interaction and help to probe molecular mechanisms involved in the fungal infection to the global pest.

  11. Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen.

    PubMed

    DiGuistini, Scott; Wang, Ye; Liao, Nancy Y; Taylor, Greg; Tanguay, Philippe; Feau, Nicolas; Henrissat, Bernard; Chan, Simon K; Hesse-Orce, Uljana; Alamouti, Sepideh Massoumi; Tsui, Clement K M; Docking, Roderick T; Levasseur, Anthony; Haridas, Sajeet; Robertson, Gordon; Birol, Inanc; Holt, Robert A; Marra, Marco A; Hamelin, Richard C; Hirst, Martin; Jones, Steven J M; Bohlmann, Jörg; Breuil, Colette

    2011-02-01

    In western North America, the current outbreak of the mountain pine beetle (MPB) and its microbial associates has destroyed wide areas of lodgepole pine forest, including more than 16 million hectares in British Columbia. Grosmannia clavigera (Gc), a critical component of the outbreak, is a symbiont of the MPB and a pathogen of pine trees. To better understand the interactions between Gc, MPB, and lodgepole pine hosts, we sequenced the ∼30-Mb Gc genome and assembled it into 18 supercontigs. We predict 8,314 protein-coding genes, and support the gene models with proteome, expressed sequence tag, and RNA-seq data. We establish that Gc is heterothallic, and report evidence for repeat-induced point mutation. We report insights, from genome and transcriptome analyses, into how Gc tolerates conifer-defense chemicals, including oleoresin terpenoids, as they colonize a host tree. RNA-seq data indicate that terpenoids induce a substantial antimicrobial stress in Gc, and suggest that the fungus may detoxify these chemicals by using them as a carbon source. Terpenoid treatment strongly activated a ∼100-kb region of the Gc genome that contains a set of genes that may be important for detoxification of these host-defense chemicals. This work is a major step toward understanding the biological interactions between the tripartite MPB/fungus/forest system.

  12. Genome-wide expression profiling of the transcriptomes of four Paulownia tomentosa accessions in response to drought.

    PubMed

    Dong, Yanpeng; Fan, Guoqiang; Deng, Minjie; Xu, Enkai; Zhao, Zhenli

    2014-10-01

    Paulownia tomentosa is an important foundation forest tree species in semiarid areas. The lack of genetic information hinders research into the mechanisms involved in its response to abiotic stresses. Here, short-read sequencing technology (Illumina) was used to de novo assemble the transcriptome of P. tomentosa. A total of 99,218 unigenes with a mean length of 949 nucleotides were assembled. Of these, 68,295 unigenes were selected and the functions of their products were predicted using Clusters of Orthologous Groups, Gene Ontology, and Kyoto Encyclopedia of Genes and Genomes annotations. Afterwards, hundreds of genes involved in drought response were identified. Twelve putative drought response genes were analyzed by quantitative real-time polymerase chain reaction. This study provides a dataset of genes and inherent biochemical pathways, which will help in understanding the mechanisms of the water-deficit response in P. tomentosa. To our knowledge, this is the first study to highlight the genetic makeup of P. tomentosa. PMID:25192670

  13. The Discovery of Novel Genomic, Transcriptomic, and Proteomic Biomarkers in Cardiovascular and Peripheral Vascular Disease: The State of the Art

    PubMed Central

    de Franciscis, Stefano; Metzinger, Laurent; Serra, Raffaele

    2016-01-01

    Cardiovascular disease (CD) and peripheral vascular disease (PVD) are leading causes of mortality and morbidity in western countries and are also responsible for a huge burden in terms of disability, functional decline, and healthcare costs. Biomarkers are measurable biological elements that reflect particular physiological or pathological states or predisposition towards diseases, and they are currently widely studied in medicine, especially in CD. In this context, biomarkers can also be used to assess the severity or the evolution of several diseases, as well as the effectiveness of particular therapies. Genomics, transcriptomics, and proteomics have opened new windows on disease phenomena and may in the near future permit the effective development of novel diagnostic and prognostic medicine in order to better prevent or treat CD. This review will consider the current evidence of novel biomarkers with clear implications in the improvement of risk assessment, prevention strategies, and medical decision making in the field of CD. PMID:27298828

  14. Genome-wide profiling and analysis of Festuca arundinacea miRNAs and transcriptomes in response to foliar glyphosate application.

    PubMed

    Unver, Turgay; Bakar, Mine; Shearman, Robert C; Budak, Hikmet

    2010-04-01

    Glyphosate is a broad-spectrum herbicide which has been widely used for non-selective weed control in turfgrass management. Festuca arundinacea cv. Falcon was shown to be one of the tolerant turfgrass species in response to varying levels of glyphosate [5% (1.58 mM), 20% (6.32 mM)] recommended for weed control. However, there is a lack of knowledge on the mRNA expression patterns and miRNA, critical regulators of gene expression, in response to varying levels of glyphosate treatments. Here, we investigate the transcriptome and miRNA-guided post-transcriptional networks using plant miRNA microarray and Affymetrix GeneChip Wheat Genome Array platforms. Transcriptome analysis revealed 93 up-regulated and 78 down-regulated genes, whereas a smaller number showed inverse differential expressions. miRNA chip analysis indicated that 34 of the 853 plant miRNAs were differentially regulated in response to glyphosate treatments. Target transcripts of differentially regulated miRNAs were predicted and nine of them were quantified by quantitative real-time PCR (qRT-PCR). Target transcripts of miRNAs validate the expression level change of miRNAs detected by miRNA microarray analysis. Down-regulation of miRNAs upon 5 and 20% glyphosate applications led to the up-regulation of their targets, as observed by qRT-PCR, or vice versa. Quantification of an F. arundinacea miRNA homologous to osa-miR1436 revealed agreement between the Affymetrix and miRNA microarray analyses. In addition to the miRNA microarray experiment, 25 conserved F. arundinacea miRNAs were identified through a homology-based approach and their secondary structures were predicted. The results presented serve as analyses of genome-wide expression profiling of miRNAs and target mRNAs in response to foliar glyphosate treatment in grass species.

  15. Analysis of CATMA transcriptome data identifies hundreds of novel functional genes and improves gene models in the Arabidopsis genome

    PubMed Central

    Aubourg, Sébastien; Martin-Magniette, Marie-Laure; Brunaud, Véronique; Taconnat, Ludivine; Bitton, Frédérique; Balzergue, Sandrine; Jullien, Pauline E; Ingouff, Mathieu; Thareau, Vincent; Schiex, Thomas; Lecharny, Alain; Renou, Jean-Pierre

    2007-01-01

    Background: Since the finishing of the sequencing of the Arabidopsis thaliana genome, the Arabidopsis community and the annotator centers have been working on the improvement of gene annotation at the structural and functional levels. In this context, we have used the large CATMA resource on the Arabidopsis transcriptome to search for genes missed by different annotation processes. Probes on the CATMA microarrays are specific gene sequence tags (GSTs) based on the CDS models predicted by the Eugene software. Among the 24 576 CATMA v2 GSTs, 677 are in regions considered as intergenic by the TAIR annotation. We analyzed the cognate transcriptome data in the CATMA resource and carried out data-mining to characterize novel genes and improve gene models. Results: The statistical analysis of the results of more than 500 hybridized samples distributed among 12 organs provides an experimental validation for 465 novel genes. The hybridization evidence was confirmed by RT-PCR approaches for 88% of the 465 novel genes. Comparisons with the current annotation show that these novel genes often encode small proteins, with an average size of 137 aa. Our approach has also led to the improvement of pre-existing gene models through both the extension of 16 CDS and the identification of 13 gene models erroneously constituted of two merged CDS. Conclusion: This work is a noticeable step forward in the improvement of the Arabidopsis genome annotation. We increased the number of Arabidopsis validated genes by 465 novel transcribed genes to which we associated several functional annotations such as expression profiles, sequence conservation in plants, cognate transcripts and protein motifs. PMID:17980019

  16. The use of high-dimensional biology (genomics, transcriptomics, proteomics, and metabolomics) to understand the preterm parturition syndrome.

    PubMed

    Romero, R; Espinoza, J; Gotsch, F; Kusanovic, J P; Friel, L A; Erez, O; Mazaki-Tovi, S; Than, N G; Hassan, S; Tromp, G

    2006-12-01

    High-dimensional biology (HDB) refers to the simultaneous study of the genetic variants (DNA variation), transcription (messenger RNA [mRNA]), peptides and proteins, and metabolites of an organ, tissue, or an organism in health and disease. The fundamental premise is that the evolutionary complexity of biological systems renders them difficult to comprehensively understand using only a reductionist approach. Such complexity can become tractable with the use of "omics" research. This term refers to the study of entities in aggregate. The current nomenclature of "omics" sciences includes genomics for DNA variants, transcriptomics for mRNA, proteomics for proteins, and metabolomics for intermediate products of metabolism. Another discipline relevant to medicine is pharmacogenomics. The two major advances that have made HDB possible are technological breakthroughs that allow simultaneous examination of thousands of genes, transcripts, and proteins, etc., with high-throughput techniques and analytical tools to extract information. What is conventionally considered hypothesis-driven research and discovery-driven research (through "omic" methodologies) are complementary and synergistic. Here we review data which have been derived from: 1) genomics to examine predisposing factors for preterm birth; 2) transcriptomics to determine changes in mRNA in reproductive tissues associated with preterm labour and preterm prelabour rupture of membranes; 3) proteomics to identify differentially expressed proteins in amniotic fluid of women with preterm labour; and 4) metabolomics to identify the metabolic footprints of women with preterm labour likely to deliver preterm and those who will deliver at term. The complementary nature of discovery science and HDB is emphasised.

  17. Territorial Polymers and Large Scale Genome Organization

    NASA Astrophysics Data System (ADS)

    Grosberg, Alexander

    2012-02-01

    Chromatin fiber in the interphase nucleus is effectively a very long polymer packed in a restricted volume. Although polymer models of chromatin organization have been considered, most of them disregard the fact that DNA must not become too entangled in order to function properly. One polymer model with no entanglements is the melt of unknotted unconcatenated rings. Extensive simulations indicate that rings in the melt at large length (monomer number N) approach the compact state, with gyration radius scaling as N^(1/3), suggesting that every ring is compact and segregated from the surrounding rings. The segregation is consistent with the known phenomenon of chromosome territories. The surface exponent β (describing the number of contacts between neighboring rings, which scales as N^β) appears only slightly below unity, β ≈ 0.95. This suggests that the loop factor (the probability that two monomers a linear distance s apart meet) should decay as s^(-γ), where γ = 2 - β is slightly above one. The latter result is consistent with Hi-C data on real human interphase chromosomes and does not contradict the older FISH data. The dynamics of rings in the melt indicates that the motion of one ring remains subdiffusive on time scales well above the stress relaxation time.
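
    The scaling relations quoted in this abstract can be restated as a short worked example. This is only a sketch using the exponents given in the abstract; the symbols R_g (gyration radius), n_contacts (contacts between neighboring rings), and P(s) (loop factor) are labels introduced here for illustration.

```latex
R_g \sim N^{1/3}, \qquad
n_{\text{contacts}} \sim N^{\beta}, \quad \beta \approx 0.95, \qquad
P(s) \sim s^{-\gamma}, \quad \gamma = 2 - \beta \approx 2 - 0.95 = 1.05 .
```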

  18. Genome-scale metabolic network reconstruction.

    PubMed

    Fondi, Marco; Liò, Pietro

    2015-01-01

    Bacterial metabolism is an important source of novel products/processes for everyday life and strong efforts are being undertaken to discover and exploit new usable substances of microbial origin. Computational modeling and in silico simulations are powerful tools in this context since they allow the exploration and a deeper understanding of bacterial metabolic circuits. Many approaches exist to quantitatively simulate chemical reaction fluxes within the whole microbial metabolism and, regardless of the technique of choice, metabolic model reconstruction is the first step in every modeling pipeline. Reconstructing a metabolic network consists in drafting the list of the biochemical reactions that an organism can carry out together with information on cellular boundaries, a biomass assembly reaction, and exchange fluxes with the external environment. Building up models able to represent the different functional cellular states is universally recognized as a tricky task that requires intensive manual effort and much additional information besides genome sequence. In this chapter we present a general protocol for metabolic reconstruction in bacteria and the main challenges encountered during this process. PMID:25343869
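
    The ingredients of a reconstruction listed in this abstract (a reaction list, a biomass assembly reaction, and exchange fluxes) can be illustrated with a toy sketch. Every metabolite name, reaction identifier, and coefficient below is hypothetical, and the steady-state check is a common assumption of constraint-based modeling rather than something stated in the abstract; it is not the authors' protocol.

```python
# Toy metabolic "reconstruction": all names and coefficients are hypothetical.
# Each reaction maps metabolite -> stoichiometric coefficient
# (negative = consumed, positive = produced).
reactions = {
    "EX_glc": {"glc": 1},          # exchange flux: uptake from the environment
    "R1": {"glc": -1, "pyr": 2},   # lumped internal conversion
    "BIOMASS": {"pyr": -2},        # biomass assembly reaction draining a precursor
}


def stoichiometric_matrix(reactions):
    """Return (metabolites, reaction ids, matrix) with rows = metabolites."""
    metabolites = sorted({m for stoich in reactions.values() for m in stoich})
    rxn_ids = list(reactions)
    matrix = [[reactions[r].get(m, 0) for r in rxn_ids] for m in metabolites]
    return metabolites, rxn_ids, matrix


def is_steady_state(matrix, fluxes, tol=1e-9):
    """Check S . v = 0, i.e. no metabolite accumulates or is depleted."""
    return all(
        abs(sum(coef * v for coef, v in zip(row, fluxes))) <= tol
        for row in matrix
    )


metabolites, rxn_ids, S = stoichiometric_matrix(reactions)
print(metabolites, rxn_ids)
print(S)
print(is_steady_state(S, [1.0, 1.0, 1.0]))
```

    In practice a dedicated constraint-based modeling package would replace these hand-written helpers; the sketch only shows how the reaction list turns into a matrix that flux simulations operate on.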

  19. Transcriptome Analysis of Kiwifruit (Actinidia chinensis) Bark in Response to Armoured Scale Insect (Hemiberlesia lataniae) Feeding

    PubMed Central

    Hill, M. Garry; Wurms, Kirstin V.; Davy, Marcus W.; Gould, Elaine; Allan, Andrew; Mauchline, Nicola A.; Luo, Zhiwei; Ah Chee, Annette; Stannard, Kate; Storey, Roy D.; Rikkerink, Erik H.

    2015-01-01

    The kiwifruit cultivar Actinidia chinensis ‘Hort16A’ is resistant to the polyphagous armoured scale insect pest Hemiberlesia lataniae (Hemiptera: Diaspididae). A cDNA microarray consisting of 17,512 unigenes selected from over 132,000 expressed sequence tags (ESTs) was used to measure the transcriptomic profile of the A. chinensis ‘Hort16A’ canes in response to a controlled infestation of H. lataniae. After 2 days, 272 transcripts were differentially expressed. After 7 days, 5,284 (30%) transcripts were differentially expressed. The transcripts were grouped into 22 major functional categories using MapMan software. After 7 days, transcripts associated with photosynthesis (photosystem II) were significantly down-regulated, while those associated with secondary metabolism were significantly up-regulated. A total of 643 transcripts associated with response to stress were differentially expressed. This included biotic stress-related transcripts orthologous with pathogenesis related proteins, the phenylpropanoid pathway, NBS-LRR (R) genes, and receptor-like kinase–leucine rich repeat signalling proteins. While transcriptional studies are not conclusive in their own right, results were suggestive of a defence response involving both ETI and PTI, with predominance of the SA signalling pathway. Exogenous application of an SA-mimic decreased H. lataniae growth on A. chinensis ‘Hort16A’ plants in two laboratory experiments. PMID:26571404

  20. Transcriptome Analysis of Kiwifruit (Actinidia chinensis) Bark in Response to Armoured Scale Insect (Hemiberlesia lataniae) Feeding.

    PubMed

    Hill, M Garry; Wurms, Kirstin V; Davy, Marcus W; Gould, Elaine; Allan, Andrew; Mauchline, Nicola A; Luo, Zhiwei; Ah Chee, Annette; Stannard, Kate; Storey, Roy D; Rikkerink, Erik H

    2015-01-01

    The kiwifruit cultivar Actinidia chinensis 'Hort16A' is resistant to the polyphagous armoured scale insect pest Hemiberlesia lataniae (Hemiptera: Diaspididae). A cDNA microarray consisting of 17,512 unigenes selected from over 132,000 expressed sequence tags (ESTs) was used to measure the transcriptomic profile of the A. chinensis 'Hort16A' canes in response to a controlled infestation of H. lataniae. After 2 days, 272 transcripts were differentially expressed. After 7 days, 5,284 (30%) transcripts were differentially expressed. The transcripts were grouped into 22 major functional categories using MapMan software. After 7 days, transcripts associated with photosynthesis (photosystem II) were significantly down-regulated, while those associated with secondary metabolism were significantly up-regulated. A total of 643 transcripts associated with response to stress were differentially expressed. This included biotic stress-related transcripts orthologous with pathogenesis related proteins, the phenylpropanoid pathway, NBS-LRR (R) genes, and receptor-like kinase-leucine rich repeat signalling proteins. While transcriptional studies are not conclusive in their own right, results were suggestive of a defence response involving both ETI and PTI, with predominance of the SA signalling pathway. Exogenous application of an SA-mimic decreased H. lataniae growth on A. chinensis 'Hort16A' plants in two laboratory experiments. PMID:26571404

  2. Genome-scale engineering for systems and synthetic biology

    PubMed Central

    Esvelt, Kevin M; Wang, Harris H

    2013-01-01

    Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review current technologies and methodologies for genome-scale engineering, discuss the prospects for extending efficient genome modification to new hosts, and explore the implications of continued advances toward the development of flexibly programmable chasses, novel biochemistries, and safer organismal and ecological engineering. PMID:23340847

  3. Genome resequencing in Populus: Revealing large-scale genome variation and implications on specialized-trait genomics

    SciTech Connect

    Muchero, Wellington; Labbe, Jessy L; Priya, Ranjan; DiFazio, Steven P; Tuskan, Gerald A

    2014-01-01

    To date, Populus ranks among a few plant species with a complete genome sequence and other highly developed genomic resources. With the first genome sequence among all tree species, Populus has been adopted as a suitable model organism for genomic studies in trees. However, far from being just a model species, Populus is a key renewable economic resource that plays a significant role in providing raw materials for the biofuel and pulp and paper industries. Therefore, aside from leading frontiers of basic tree molecular biology and ecological research, Populus leads frontiers in addressing global economic challenges related to fuel and fiber production. The latter fact suggests that research aimed at improving quality and quantity of Populus as a raw material will likely drive the pursuit of more targeted and deeper research in order to unlock the economic potential tied up in the molecular biology processes that drive this tree species. Advances in genome sequence-driven technologies, such as resequencing individual genotypes, which in turn facilitates large-scale SNP discovery and identification of large-scale polymorphisms, are key determinants of future success in these initiatives. In this treatise we discuss implications of genome sequence-enabled technologies on Populus genomic and genetic studies of complex and specialized traits.

  4. Unraveling the 3D genome: genomics tools for multi-scale exploration

    PubMed Central

    Risca, Viviana I.; Greenleaf, William J.

    2015-01-01

    A decade of rapid method development has begun to yield exciting insights into the three-dimensional architecture of the metazoan genome and the roles it may play in regulating transcription. We review here core methods and new tools in the modern genomicist’s toolbox at three length scales, ranging from single base pair to megabase scale chromosomal domains, and discuss the emerging picture of the 3D genome that these tools have revealed. Blind spots remain, especially at intermediate length scales spanning a few nucleosomes, but thanks in part to new technologies that permit targeted alteration of chromatin states and time-resolved studies, the next decade holds great promise for hypothesis-driven research into the mechanisms that drive genome architecture and transcriptional regulation. PMID:25887733

  5. Genome-wide Annotation, Identification, and Global Transcriptomic Analysis of Regulatory or Small RNA Gene Expression in Staphylococcus aureus

    PubMed Central

    Weiss, Andy; Broach, William H.; Wiemels, Richard E.; Mogen, Austin B.; Rice, Kelly C.

    2016-01-01

    In Staphylococcus aureus, hundreds of small regulatory or small RNAs (sRNAs) have been identified, yet this class of molecule remains poorly understood and severely understudied. sRNA genes are typically absent from genome annotation files, and as a consequence, their existence is often overlooked, particularly in global transcriptomic studies. To facilitate improved detection and analysis of sRNAs in S. aureus, we generated updated GenBank files for three commonly used S. aureus strains (MRSA252, NCTC 8325, and USA300), in which we added annotations for >260 previously identified sRNAs. These files, the first to include genome-wide annotation of sRNAs in S. aureus, were then used as a foundation to identify novel sRNAs in the community-associated methicillin-resistant strain USA300. This analysis led to the discovery of 39 previously unidentified sRNAs. Investigating the genomic loci of the newly identified sRNAs revealed a surprising degree of inconsistency in genome annotation in S. aureus, which may be hindering the analysis and functional exploration of these elements. Finally, using our newly created annotation files as a reference, we perform a global analysis of sRNA gene expression in S. aureus and demonstrate that the newly identified tsr25 is the most highly upregulated sRNA in human serum. This study provides an invaluable resource to the S. aureus research community in the form of our newly generated annotation files, while at the same time presenting the first examination of differential sRNA expression in pathophysiologically relevant conditions. PMID:26861020

  6. Genome-scale cold stress response regulatory networks in ten Arabidopsis thaliana ecotypes

    PubMed Central

    2013-01-01

    Background: Low temperature leads to major crop losses every year. Although several studies have focused on the diversity of cold tolerance levels in multiple phenotypically divergent Arabidopsis thaliana (A. thaliana) ecotypes, genome-scale molecular understanding is still lacking. Results: In this study, we report genome-scale transcript response diversity of 10 A. thaliana ecotypes originating from different geographical locations to non-freezing cold stress (10°C). To analyze the transcriptional response diversity, we initially compared transcriptome changes in all 10 ecotypes using Arabidopsis NimbleGen ATH6 microarrays. In total, 6,061 transcripts were significantly cold regulated (p < 0.01) in the 10 ecotypes, including 498 transcription factors and 315 transposable elements. The majority of the transcripts (75%) showed ecotype-specific expression patterns. By using sequence data available from the Arabidopsis thaliana 1001 genome project, we further investigated sequence polymorphisms in the core cold stress regulon genes. Significant numbers of non-synonymous amino acid changes were observed in the coding region of the CBF regulon genes. Considering the limited knowledge about regulatory interactions between transcription factors and their target genes in the model plant A. thaliana, we have adopted a powerful systems genetics approach, Network Component Analysis (NCA), to construct an in silico transcriptional regulatory network model during response to cold stress. The resulting regulatory network contained 1,275 nodes and 7,720 connections, with 178 transcription factors and 1,331 target genes. Conclusions: A. thaliana ecotypes exhibit considerable variation in transcriptome-level responses to non-freezing cold stress treatment. Ecotype-specific transcripts and related gene ontology (GO) categories were identified to delineate natural variation of cold stress regulated differential gene expression in the model plant A. thaliana. The predicted

  7. Symbiodinium Transcriptomes: Genome Insights into the Dinoflagellate Symbionts of Reef-Building Corals

    PubMed Central

    Sunagawa, Shinichi; Yum, Lauren K.; DeSalvo, Michael K.; Lindquist, Erika; Coffroth, Mary Alice; Voolstra, Christian R.; Medina, Mónica

    2012-01-01

    Dinoflagellates are unicellular algae that are ubiquitously abundant in aquatic environments. Species of the genus Symbiodinium form symbiotic relationships with reef-building corals and other marine invertebrates. Despite their ecologic importance, little is known about the genetics of dinoflagellates in general and Symbiodinium in particular. Here, we used 454 sequencing to generate transcriptome data from two Symbiodinium species from different clades (clade A and clade B). With more than 56,000 assembled sequences per species, these data represent the largest transcriptomic resource for dinoflagellates to date. Our results corroborate previous observations that dinoflagellates possess the complete nucleosome machinery. We found a complete set of core histones as well as several H3 variants and H2A.Z in one species. Furthermore, transcriptome analysis points toward a low number of transcription factors in Symbiodinium spp. that also differ in the distribution of DNA-binding domains relative to other eukaryotes. In particular, the cold shock domain was predominant among transcription factors. Additionally, we found a high number of antioxidative genes in comparison to non-symbiotic but evolutionarily related organisms. These findings might be of relevance in the context of the role that Symbiodinium spp. play as coral symbionts. Our data represent the most comprehensive dinoflagellate EST data set to date. This study provides a comprehensive resource to further analyze the genetic makeup, metabolic capacities, and gene repertoire of Symbiodinium and dinoflagellates. Overall, our findings indicate that Symbiodinium possesses some unique characteristics; in particular, the transcriptional regulation in Symbiodinium may differ from the currently known mechanisms of eukaryotic gene regulation. PMID:22529998

  8. From genes to milk: Genomic organization and epigenetic regulation of the mammary transcriptome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin sta...

  9. Quantitative RNA-Seq analysis in non-model species: assessing transcriptome assemblies as a scaffold and the utility of evolutionary divergent genomic reference species

    PubMed Central

    2012-01-01

    Background: How well does RNA-Seq data perform for quantitative whole gene expression analysis in the absence of a genome? This is one unanswered question facing the rapidly growing number of researchers studying non-model species. Using Homo sapiens data and resources, we compared the direct mapping of sequencing reads to predicted genes from the genome with mapping to de novo transcriptomes assembled from RNA-Seq data. Gene coverage and expression analysis was further investigated in the non-model context by using increasingly divergent genomic reference species to group assembled contigs by unique genes. Results: Eight transcriptome sets, composed of varying amounts of Illumina and 454 data, were assembled and assessed. Hybrid 454/Illumina assemblies had the highest transcriptome and individual gene coverage. Quantitative whole gene expression levels were highly similar between using a de novo hybrid assembly and the predicted genes as a scaffold, although mapping to the de novo transcriptome assembly provided data on fewer genes. Using non-target species as reference scaffolds does result in some loss of sequence and expression data, and bias and error increase with evolutionary distance. However, within a 100 million year window these effect sizes are relatively small. Conclusions: Predicted gene sets from sequenced genomes of related species can provide a powerful method for grouping RNA-Seq reads and annotating contigs. Gene expression results can be produced that are similar to results obtained using gene models derived from a high quality genome, though biased towards conserved genes. Our results demonstrate the power and limitations of conducting RNA-Seq in non-model species. PMID:22853326

  10. Analysis of the Phlebiopsis gigantea genome, transcriptome and secretome provides insight into its pioneer colonization strategies of wood.

    PubMed

    Hori, Chiaki; Ishida, Takuya; Igarashi, Kiyohiko; Samejima, Masahiro; Suzuki, Hitoshi; Master, Emma; Ferreira, Patricia; Ruiz-Dueñas, Francisco J; Held, Benjamin; Canessa, Paulo; Larrondo, Luis F; Schmoll, Monika; Druzhinina, Irina S; Kubicek, Christian P; Gaskell, Jill A; Kersten, Phil; St John, Franz; Glasner, Jeremy; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit; Mgbeahuruike, Anthony C; Kovalchuk, Andriy; Asiegbu, Fred O; Lackner, Gerald; Hoffmeister, Dirk; Rencoret, Jorge; Gutiérrez, Ana; Sun, Hui; Lindquist, Erika; Barry, Kerrie; Riley, Robert; Grigoriev, Igor V; Henrissat, Bernard; Kües, Ursula; Berka, Randy M; Martínez, Angel T; Covert, Sarah F; Blanchette, Robert A; Cullen, Daniel

    2014-12-01

    Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in its presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea's extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes.

  11. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer’s disease

    PubMed Central

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-01-01

    Among the genetic factors known to increase the risk of late onset Alzheimer’s disease (AD), the presence of the apolipoprotein E4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer’s disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer’s disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer’s disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer’s disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer’s disease. PMID:27585646

  12. Genomic organization, transcriptomic analysis, and functional characterization of avian α- and β-keratins in diverse feather forms.

    PubMed

    Ng, Chen Siang; Wu, Ping; Fan, Wen-Lang; Yan, Jie; Chen, Chih-Kuan; Lai, Yu-Ting; Wu, Siao-Man; Mao, Chi-Tang; Chen, Jun-Jie; Lu, Mei-Yeh Jade; Ho, Meng-Ru; Widelitz, Randall B; Chen, Chih-Feng; Chuong, Cheng-Ming; Li, Wen-Hsiung

    2014-08-24

    Feathers are hallmark avian integument appendages, although they were also present on theropods. They are composed of flexible corneous materials made of α- and β-keratins, but their genomic organization and their functional roles in feathers have not been well studied. First, we made an exhaustive search of α- and β-keratin genes in the new chicken genome assembly (Galgal4). Then, using transcriptomic analysis, we studied α- and β-keratin gene expression patterns in five types of feather epidermis. The expression patterns of β-keratin genes were different in different feather types, whereas those of α-keratin genes were less variable. In addition, we obtained extensive α- and β-keratin mRNA in situ hybridization data, showing that α-keratins and β-keratins are preferentially expressed in different parts of the feather components. Together, our data suggest that feather morphological and structural diversity can largely be attributed to differential combinations of α- and β-keratin genes in different intrafeather regions and/or feather types from different body parts. The expression profiles provide new insights into the evolutionary origin and diversification of feathers. Finally, functional analysis using mutant chicken keratin forms based on those found in the human α-keratin mutation database led to abnormal phenotypes. This demonstrates that the chicken can be a convenient model for studying the molecular biology of human keratin-based diseases.

  13. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer’s disease

    NASA Astrophysics Data System (ADS)

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-09-01

    Among the genetic factors known to increase the risk of late onset Alzheimer’s disease (AD), the presence of the apolipoprotein E4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer’s disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer’s disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer’s disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer’s disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer’s disease.

  14. Analysis of the Phlebiopsis gigantea Genome, Transcriptome and Secretome Provides Insight into Its Pioneer Colonization Strategies of Wood

    DOE PAGESBeta

    Hori, Chiaki; Ishida, Takuya; Igarashi, Kiyohiko; Samejima, Masahiro; Suzuki, Hitoshi; Master, Emma; Ferreira, Patricia; Ruiz-Dueñas, Francisco J.; Held, Benjamin; Canessa, Paulo; et al

    2014-12-04

    Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in its presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea’s extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes.

  15. Analysis of the Phlebiopsis gigantea Genome, Transcriptome and Secretome Provides Insight into Its Pioneer Colonization Strategies of Wood

    SciTech Connect

    Hori, Chiaki; Ishida, Takuya; Igarashi, Kiyohiko; Samejima, Masahiro; Suzuki, Hitoshi; Master, Emma; Ferreira, Patricia; Ruiz-Dueñas, Francisco J.; Held, Benjamin; Canessa, Paulo; Larrondo, Luis F.; Schmoll, Monika; Druzhinina, Irina S.; Kubicek, Christian P.; Gaskell, Jill A.; Kersten, Phil; St. John, Franz; Glasner, Jeremy; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit; Mgbeahuruike, Anthony C.; Kovalchuk, Andriy; Asiegbu, Fred O.; Lackner, Gerald; Hoffmeister, Dirk; Rencoret, Jorge; Gutiérrez, Ana; Sun, Hui; Lindquist, Erika; Barry, Kerrie; Riley, Robert; Grigoriev, Igor V.; Henrissat, Bernard; Berka, Randy M.; Martínez, Angel T.; Covert, Sarah F.; Blanchette, Robert A.; Cullen, Daniel; Copenhaver, Gregory P.

    2014-12-04

    Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in its presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea’s extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes.

  16. Analysis of the Phlebiopsis gigantea Genome, Transcriptome and Secretome Provides Insight into Its Pioneer Colonization Strategies of Wood

    PubMed Central

    Hori, Chiaki; Ishida, Takuya; Igarashi, Kiyohiko; Samejima, Masahiro; Suzuki, Hitoshi; Master, Emma; Ferreira, Patricia; Ruiz-Dueñas, Francisco J.; Held, Benjamin; Canessa, Paulo; Larrondo, Luis F.; Schmoll, Monika; Druzhinina, Irina S.; Kubicek, Christian P.; Gaskell, Jill A.; Kersten, Phil; St. John, Franz; Glasner, Jeremy; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit; Mgbeahuruike, Anthony C.; Kovalchuk, Andriy; Asiegbu, Fred O.; Lackner, Gerald; Hoffmeister, Dirk; Rencoret, Jorge; Gutiérrez, Ana; Sun, Hui; Lindquist, Erika; Barry, Kerrie; Riley, Robert; Grigoriev, Igor V.; Henrissat, Bernard; Kües, Ursula; Berka, Randy M.; Martínez, Angel T.; Covert, Sarah F.; Blanchette, Robert A.; Cullen, Daniel

    2014-01-01

    Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in their presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea's extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes. PMID:25474575

  17. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer's disease.

    PubMed

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-01-01

    Among the genetic factors known to increase the risk of late onset Alzheimer's disease (AD), the presence of the apolipoprotein E ε4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer's disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer's disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer's disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer's disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer's disease. PMID:27585646

  18. Whipworm genome and dual-species transcriptome analyses provide molecular insights into an intimate host-parasite interaction

    PubMed Central

    Nichol, Sarah; Tracey, Alan; Holroyd, Nancy; Cotton, James A.; Stanley, Eleanor J.; Zarowiecki, Magdalena; Liu, Jimmy Z.; Huckvale, Thomas; Cooper, Philip J.; Grencis, Richard K.; Berriman, Matthew

    2014-01-01

    Whipworms are common soil-transmitted helminths that cause debilitating chronic infections in man. These nematodes are only distantly related to Caenorhabditis elegans and have evolved to occupy an unusual niche, tunneling through epithelial cells of the large intestine. Here we present the genome sequences of the human-infective Trichuris trichiura and the murine laboratory model T. muris. Based on whole transcriptome analyses we identify many genes that are expressed in a gender- or life stage-specific manner and characterise the transcriptional landscape of a morphological region with unique biological adaptations, namely bacillary band and stichosome, found only in whipworms and related parasites. Using RNAseq data from whipworm-infected mice we describe the regulated Th1-like immune response of the chronically infected cecum in unprecedented detail. In silico screening identifies numerous potential new drug targets against trichuriasis. Together, these genomes and associated functional data elucidate key aspects of the molecular host-parasite interactions that define chronic whipworm infection. PMID:24929830

  19. The transcriptome of the reference potato genome Solanum tuberosum Group Phureja clone DM1-3 516R44.

    PubMed

    Massa, Alicia N; Childs, Kevin L; Lin, Haining; Bryan, Glenn J; Giuliano, Giovanni; Buell, C Robin

    2011-01-01

    Advances in molecular breeding in potato have been limited by its complex biological system, which includes vegetative propagation, autotetraploidy, and extreme heterozygosity. The availability of the potato genome and accompanying gene complement with corresponding gene structure, location, and functional annotation are powerful resources for understanding this complex plant and advancing molecular breeding efforts. Here, we report a reference for the potato transcriptome using 32 tissues and growth conditions from the doubled monoploid Solanum tuberosum Group Phureja clone DM1-3 516R44 for which a genome sequence is available. Analysis of greater than 550 million RNA-Seq reads permitted the detection and quantification of expression levels of over 22,000 genes. Hierarchical clustering and principal component analyses captured the biological variability that accounts for gene expression differences among tissues, suggesting tissue-specific gene expression and genes with tissue- or condition-restricted expression. Using gene co-expression network analysis, we identified 18 gene modules that represent tissue-specific transcriptional networks of major potato organs and developmental stages. This information provides a powerful resource for potato research as well as studies on other members of the Solanaceae family.

  20. Integration of transcriptomic and genomic data suggests candidate mechanisms for APOE4-mediated pathogenic action in Alzheimer's disease.

    PubMed

    Caberlotto, Laura; Marchetti, Luca; Lauria, Mario; Scotti, Marco; Parolo, Silvia

    2016-09-02

    Among the genetic factors known to increase the risk of late onset Alzheimer's disease (AD), the presence of the apolipoprotein E ε4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer's disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer's disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer's disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer's disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer's disease.

  1. Transcriptome kinetics is governed by a genome-wide coupling of mRNA production and degradation: a role for RNA Pol II.

    PubMed

    Shalem, Ophir; Groisman, Bella; Choder, Mordechai; Dahan, Orna; Pilpel, Yitzhak

    2011-09-01

    Transcriptome dynamics is governed by two opposing processes, mRNA production and degradation. Recent studies found that changes in these processes are frequently coordinated and that the relationship between them shapes transcriptome kinetics. Specifically, when transcription changes are counteracted by changes in mRNA stability, transient fast-relaxing transcriptome kinetics is observed. A possible molecular mechanism underlying such coordinated regulation might lie in two RNA polymerase II (Pol II) subunits, Rpb4 and Rpb7, which are recruited to mRNAs during transcription and later affect their degradation in the cytoplasm. Here we used a yeast strain carrying a mutant Pol II which poorly recruits these subunits. We show that this mutant strain is impaired in its ability to modulate mRNA stability in response to stress. The normal negative coordinated regulation is lost in the mutant, resulting in abnormal transcriptome profiles both with respect to magnitude and kinetics of responses. These results reveal an important role for Pol II in regulation of both mRNA synthesis and degradation, and also in coordinating between them. We propose a simple model for production-degradation coupling that accounts for our observations. The model shows how a simple manipulation of the rates of co-transcriptional mRNA imprinting by Pol II may govern genome-wide transcriptome kinetics in response to environmental changes.

  2. Large-scale data mining pilot project in human genome

    SciTech Connect

    Musick, R.; Fidelis, R.; Slezak, T.

    1997-05-01

    This whitepaper briefly describes a new, aggressive effort in large-scale data mining at Livermore National Labs. The implications of 'large-scale' will be clarified Section. In the short term, this effort will focus on several mission-critical questions of the Genome project. We will adapt current data mining techniques to the Genome domain, to quantify the accuracy of inference results, and lay the groundwork for a more extensive effort in large-scale data mining. A major aspect of the approach is that we will be fully-staffed data warehousing effort in the human Genome area. The long-term goal is a strong applications-oriented research program in large-scale data mining. The tools, skill set gained will be directly applicable to a wide spectrum of tasks involving a for large spatial and multidimensional data. This includes applications in ensuring non-proliferation, stockpile stewardship, enabling Global Ecology (Materials Database Industrial Ecology), advancing the Biosciences (Human Genome Project), and supporting data for others (Battlefield Management, Health Care).

  3. Genome-wide transcriptomic analysis of the sporophyte of the moss Physcomitrella patens.

    PubMed

    O'Donoghue, Martin-Timothy; Chater, Caspar; Wallace, Simon; Gray, Julie E; Beerling, David J; Fleming, Andrew J

    2013-09-01

    Bryophytes, the most basal of the extant land plants, diverged at least 450 million years ago. A major feature of these plants is the biphasic alternation of generations between a dominant haploid gametophyte and a minor diploid sporophyte phase. These dramatic differences in form and function occur in a constant genetic background, raising the question of whether the switch from gametophyte-to-sporophyte development reflects major changes in the spectrum of genes being expressed or alternatively whether only limited changes in gene expression occur and the differences in plant form are due to differences in how the gene products are put together. This study performed replicated microarray analyses of RNA from several thousand dissected and developmentally staged sporophytes of the moss Physcomitrella patens, allowing analysis of the transcriptomes of the sporophyte and early gametophyte, as well as the early stages of moss sporophyte development. The data indicate that more significant changes in transcript profile occur during the switch from gametophyte to sporophyte than recently reported, with over 12% of the entire transcriptome of P. patens being altered during this major developmental transition. Analysis of the types of genes contributing to these differences supports the view of the early sporophyte being energetically and nutritionally dependent on the gametophyte, provides a profile of homologues to genes involved in angiosperm stomatal development and physiology which suggests a deeply conserved mechanism of stomatal control, and identifies a novel series of transcription factors associated with moss sporophyte development.

  4. Whole transcriptome sequencing of the aging rat brain reveals dynamic RNA changes in the dark matter of the genome.

    PubMed

    Wood, Shona H; Craig, Thomas; Li, Yang; Merry, Brian; de Magalhães, João Pedro

    2013-06-01

    Brain aging frequently underlies cognitive decline and is a major risk factor for neurodegenerative conditions. The exact molecular mechanisms underlying brain aging, however, remain unknown. Whole transcriptome sequencing provides unparalleled depth and sensitivity in gene expression profiling. It also allows non-coding RNA and splice variant detection/comparison across phenotypes. Using RNA-seq to sequence the cerebral cortex transcriptome in 6-, 12- and 28-month-old rats, age-related changes were studied. Protein-coding genes related to MHC II presentation and serotonin biosynthesis were differentially expressed (DE) in aging. Relative to protein-coding genes, more non-coding genes were DE over the three age-groups. RNA-seq quantifies not only levels of whole genes but also of their individual transcripts. Over the three age-groups, 136 transcripts were DE, 37 of which were so-called dark matter transcripts that do not map to known exons. Fourteen of these transcripts were identified as novel putative long non-coding RNAs. Evidence of isoform switching and changes in usage were found. Promoter and coding sequence usage were also altered, hinting at possible changes to mitochondrial transport within neurons. Therefore, in addition to changes in the expression of protein-coding genes, changes in transcript expression, isoform usage, and non-coding RNAs occur with age. This study demonstrates dynamic changes in RNA with age at various genomic levels, which may reflect changes in regulation of transcriptional networks and provides non-coding RNA gene candidates for further studies.

  5. Next-generation transcriptome assembly

    SciTech Connect

    Martin, Jeffrey A.; Wang, Zhong

    2011-09-01

    Transcriptomics studies often rely on partial reference transcriptomes that fail to capture the full catalog of transcripts and their variations. Recent advances in sequencing technologies and assembly algorithms have facilitated the reconstruction of the entire transcriptome by deep RNA sequencing (RNA-seq), even without a reference genome. However, transcriptome assembly from billions of RNA-seq reads, which are often very short, poses a significant informatics challenge. This Review summarizes the recent developments in transcriptome assembly approaches (reference-based, de novo and combined strategies), along with some perspectives on transcriptome assembly in the near future.

  6. Life-style transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Colletotrichum species are devastating fungal pathogens of major crop plants worldwide. Infection involves differentiation of specialized cell-types associated with host surface penetration, growth inside living host cells (biotrophy) and tissue destruction (necrotrophy). Here we report genome and t...

  7. Genome Sequence and Transcriptome Analysis of Meat-Spoilage-Associated Lactic Acid Bacterium Lactococcus piscium MKFS47.

    PubMed

    Andreevskaya, Margarita; Johansson, Per; Laine, Pia; Smolander, Olli-Pekka; Sonck, Matti; Rahkila, Riitta; Jääskeläinen, Elina; Paulin, Lars; Auvinen, Petri; Björkroth, Johanna

    2015-06-01

    Lactococcus piscium is a psychrotrophic lactic acid bacterium and is known to be one of the predominant species within spoilage microbial communities in cold-stored packaged foods, particularly in meat products. Its presence in such products has been associated with the formation of buttery and sour off-odors. Nevertheless, the spoilage potential of L. piscium varies dramatically depending on the strain and growth conditions. Additional knowledge about the genome is required to explain such variation, understand its phylogeny, and study gene functions. Here, we present the complete and annotated genomic sequence of L. piscium MKFS47, combined with a time course analysis of the glucose catabolism-based transcriptome. In addition, a comparative analysis of gene contents was done for L. piscium MKFS47 and 29 other lactococci, revealing three distinct clades within the genus. The genome of L. piscium MKFS47 consists of one chromosome, carrying 2,289 genes, and two plasmids. A wide range of carbohydrates was predicted to be fermented, and growth on glycerol was observed. Both carbohydrate and glycerol catabolic pathways were significantly upregulated in the course of time as a result of glucose exhaustion. At the same time, differential expression of the pyruvate utilization pathways, implicated in the formation of spoilage substances, switched the metabolism toward a heterofermentative mode. In agreement with data from previous inoculation studies, L. piscium MKFS47 was identified as an efficient producer of buttery-odor compounds under aerobic conditions. Finally, genes and pathways that may contribute to increased survival in meat environments were considered. PMID:25819958

  8. Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion

    SciTech Connect

    Martinez, Diego; Challacombe, Jean; Morgenstern, Ingo; Hibbett, David; Schmoll, Monika; Kubicek, Christian P.; Ferreira, Patricia; Ruiz-Duenas, Francisco; Martinez, Angel T.; Kersten, Phil; Hammel, Ken; Vanden Wymelenberg, Amber; Gaskell, Jill; Lindquist, Erika; Sabat, Grzegorz; Splinter Bondurant, Sandra; Larrondo, Luis F.; Canessa, Paulo; Vicuna, Rafael; Yadav, Jagjit; Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Pisabarro, Antonio; Lavin, Jose L.; Oguiza, Jose A.; Master, Emma; Henrissat, Bernard; Coutinho, Pedro M.; Harris, Paul; Magnuson, Jon K.; Baker, Scott E.; Bruno, Kenneth S.; Kenealy, William; Hoegger, Patrik; Kues, Ursula; Ramaiya, Preethi; Lucas, Susan; Salamov, Asaf; Shapiro, Harris; Tu, Hank; Chee, Christine L.; Misra, Monica; Xie, Gary; Teter, Sarah; Yaver, Debbie; James, Tim; Mokrejs, Martin; Pospisek, Martin; Grigoriev, Igor V.; Brettin, T.; Rokhsar, Daniel S.; Berka, Randy; Cullen, Dan

    2009-02-10

    Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome, transcriptome, and secretome revealed unique extracellular enzyme systems, including an unusual repertoire of extracellular glycoside hydrolases. Genes encoding exocellobiohydrolases and cellulose-binding domains, typical of cellulolytic microbes, are absent in this efficient cellulose-degrading fungus. When P. placenta was grown in media containing cellulose as sole carbon source, transcripts corresponding to many hemicellulases and to a single putative β-1-4 endoglucanase were expressed at high levels relative to glucose grown cultures. These transcript profiles were confirmed by direct identification of peptides by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Also upregulated under cellulolytic culture conditions were putative iron reductases, quinone reductase, and structurally divergent oxidases potentially involved in extracellular generation of Fe(II) and H2O2. These observations are consistent with a biodegradative role for Fenton chemistry in which Fe(II) and H2O2 react to form hydroxyl radicals, highly reactive oxidants capable of depolymerizing cellulose. The P. placenta genome resources provide unparalleled opportunities for investigating such unusual mechanisms of cellulose conversion. More broadly, the genome offers insight into the diversification of lignocellulose degrading mechanisms in fungi. In particular, comparisons between P. placenta and the closely related white-rot fungus, Phanerochaete chrysosporium support an evolutionary shift from white-rot to brown-rot during which efficient depolymerization of lignin was lost.

  9. Comparative genomic and transcriptomic analyses of NaCl-tolerant Staphylococcus sp. OJ82 isolated from fermented seafood.

    PubMed

    Choi, Sungjong; Jung, Jaejoon; Jeon, Che Ok; Park, Woojun

    2014-01-01

    Bacteria belonging to the Staphylococcus genus reside in various natural environments; however, only disease-associated Staphylococcus strains have received attention while ecological function and physiologies of non-pathogenic strains were often neglected. Because a high level of tolerance to NaCl is a common trait of Staphylococcus, we investigated the characteristics of halotolerance in Staphylococcus sp. OJ82 isolated from fermented seafood containing a high concentration of NaCl. Among the 292 isolates screened, OJ82 showed the highest β-galactosidase and extracellular protease activities under high-salt conditions. Comparative genomic analysis with other Staphylococcus strains showed that (a) replication origins are highly conserved, (b) the OJ82 strain has a high number of amino acid transport- and metabolism-related genes, and (c) OJ82 has many unique proteins (15%) and 12 prophage-related genomic islands. RNA-seq analysis under high-salt conditions showed that genes involved in cell membranes, transport, osmotic stress, ATP synthesis, and translation are highly expressed. OJ82 may use the ribulose monophosphate pathway to detoxify some toxic intermediates under high-salt conditions. Six new and three known non-coding small RNAs of the OJ82 strain were also found in the RNA-seq analysis. Genomic and transcriptomic analyses identified target β-galactosidase and extracellular protease. Interestingly, the OJ82 strain became resistant to bacteriocin produced by the Bacillus strain only under high-salt conditions. Our data showed that the OJ82 strain, which adapted to high-salt conditions by expressing genes for core cellular processes (translation, ATP production) and defense (membrane synthesis, compatible solute transports, ribulose monophosphate pathway), could survive bacteriocin exposure under high-salt conditions.

  10. Genome Sequence and Transcriptome Analysis of Meat-Spoilage-Associated Lactic Acid Bacterium Lactococcus piscium MKFS47.

    PubMed

    Andreevskaya, Margarita; Johansson, Per; Laine, Pia; Smolander, Olli-Pekka; Sonck, Matti; Rahkila, Riitta; Jääskeläinen, Elina; Paulin, Lars; Auvinen, Petri; Björkroth, Johanna

    2015-06-01

    Lactococcus piscium is a psychrotrophic lactic acid bacterium and is known to be one of the predominant species within spoilage microbial communities in cold-stored packaged foods, particularly in meat products. Its presence in such products has been associated with the formation of buttery and sour off-odors. Nevertheless, the spoilage potential of L. piscium varies dramatically depending on the strain and growth conditions. Additional knowledge about the genome is required to explain such variation, understand its phylogeny, and study gene functions. Here, we present the complete and annotated genomic sequence of L. piscium MKFS47, combined with a time course analysis of the glucose catabolism-based transcriptome. In addition, a comparative analysis of gene contents was done for L. piscium MKFS47 and 29 other lactococci, revealing three distinct clades within the genus. The genome of L. piscium MKFS47 consists of one chromosome, carrying 2,289 genes, and two plasmids. A wide range of carbohydrates was predicted to be fermented, and growth on glycerol was observed. Both carbohydrate and glycerol catabolic pathways were significantly upregulated in the course of time as a result of glucose exhaustion. At the same time, differential expression of the pyruvate utilization pathways, implicated in the formation of spoilage substances, switched the metabolism toward a heterofermentative mode. In agreement with data from previous inoculation studies, L. piscium MKFS47 was identified as an efficient producer of buttery-odor compounds under aerobic conditions. Finally, genes and pathways that may contribute to increased survival in meat environments were considered.

  11. Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion

    SciTech Connect

    Martinez, Diego; Challacombe, Jean F; Misra, Monica; Xie, Gary; Brettin, Thomas; Morgenstern, Ingo; Hibbett, David; Schmoll, Monika; Kubicek, Christian P; Ferreira, Patricia; Ruiz-Duenas, Francisco J; Martinez, Angel T; Kersten, Phil; Hammel, Kenneth E; Vanden Wymelenberg, Amber; Gaskell, Jill; Lindquist, Erika; Sabat, Grzegorz; Bondurant, Sandra S; Larrondo, Luis F; Canessa, Paulo; Vicuna, Rafael; Yadav, Jagjit; Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Pisabarro, Antonio G; Lavin, Jose L; Oguiza, Jose A; Master, Emma; Henrissat, Bernard; Coutinho, Pedro M; Harris, Paul; Magnuson, Jon K; Baker, Scott; Bruno, Kenneth; Kenealy, William; Hoegger, Patrik J; Kues, Ursula; Ramaiya, Preethi; Lucas, Susan; Salamov, Asaf; Shapiro, Harris; Tu, Hank; Chee, Christine L; Teter, Sarah; Yaver, Debbie; James, Tim; Mokrejs, Martin; Pospisek, Martin; Grigoriev, Igor; Rokhsar, Dan; Berka, Randy; Cullen, Dan

    2008-01-01

    Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome, transcriptome and secretome revealed unique extracellular enzyme systems, including an unusual repertoire of extracellular glycoside hydrolases. Genes encoding exocellobiohydrolases and cellulose-binding domains, typical of cellulolytic microbes, are absent in this efficient cellulose-degrading fungus. When P. placenta was grown in medium containing cellulose as sole carbon source, transcripts corresponding to many hemicellulases and to a single putative β-1-4 endoglucanase were expressed at high levels relative to glucose grown cultures. These transcript profiles were confirmed by direct identification of peptides by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Also upregulated during growth on cellulose medium were putative iron reductases, quinone reductase, and structurally divergent oxidases potentially involved in extracellular generation of Fe(II) and H2O2. These observations are consistent with a biodegradative role for Fenton chemistry in which Fe(II) and H2O2 react to form hydroxyl radicals, highly reactive oxidants capable of depolymerizing cellulose. The P. placenta genome resources provide unparalleled opportunities for investigating such unusual mechanisms of cellulose conversion. More broadly, the genome offers insight into the diversification of lignocellulose degrading mechanisms in fungi. Comparisons to the closely related white-rot fungus Phanerochaete chrysosporium support an evolutionary shift from white-rot to brown-rot during which the capacity for efficient depolymerization of lignin was lost.

  12. Deep transcriptome sequencing of Pecten maximus hemocytes: a genomic resource for bivalve immunology.

    PubMed

    Pauletto, Marianna; Milan, Massimo; Moreira, Rebeca; Novoa, Beatriz; Figueras, Antonio; Babbucci, Massimiliano; Patarnello, Tomaso; Bargelloni, Luca

    2014-03-01

    Pecten maximus, the king scallop, is a bivalve species with important commercial value for both fisheries and aquaculture, traditionally consumed in several European countries. Major problems in larval rearing, however, still limit hatchery-based seed production. High mortalities during early larval stages, likely related to bacterial pathogens, represent the most relevant bottleneck. To address this issue, understanding host defense mechanisms against microbes is extremely important. In this study next-generation RNA-sequencing was carried out on scallop hemocytes. To enrich for immune-related transcripts, cDNA libraries from hemocytes challenged in vivo with inactivated Vibrio anguillarum and in vitro with pathogen-associated molecular patterns, as well as unchallenged controls, were sequenced yielding 216,444,674 sequence reads. De novo assembly of the scallop hemocyte transcriptome consisted of 73,732 contigs (31% annotated). A total of 934 contigs encoded proteins with a known immune function, grouped into several functional categories. Particular attention was paid to Toll-like receptors (TLRs), a family of pattern recognition receptors (PRRs) involved in non-self recognition. Through mining the scallop hemocyte transcriptome, at least four TLRs could be identified. The organization of canonical TLR domains demonstrated that single cysteine cluster and multiple cysteine cluster TLRs co-exist in this species. In addition, preliminary data concerning their mRNA level following bacterial challenge suggested that different members of this family could exhibit opposite responses to pathogenic stimuli. Finally, a global analysis of differential expression comparing gene-expression levels in in vitro and in vivo stimulated hemocytes against controls provided evidence on a large set of transcripts involved in the great scallop immune response.

  13. Transcriptome sequencing reveals both neutral and adaptive genome dynamics in a marine invader.

    PubMed

    Tepolt, C K; Palumbi, S R

    2015-08-01

    Species invasions cause significant ecological and economic damage, and genetic information is important to understanding and managing invasive species. In the ocean, many invasive species have high dispersal and gene flow, lowering the discriminatory power of traditional genetic approaches. High-throughput sequencing holds tremendous promise for increasing resolution and illuminating the relative contributions of selection and drift in marine invasion, but has not yet been used to compare the diversity and dynamics of a high-dispersal invader in its native and invaded ranges. We test a transcriptome-based approach in the European green crab (Carcinus maenas), a widespread invasive species with high gene flow and a well-known invasion history, in two native and five invasive populations. A panel of 10 809 transcriptome-derived nuclear SNPs identified significant population structure among highly bottlenecked invasive populations that were previously undifferentiated with traditional markers. Comparing the full data set and a subset of 9246 putatively neutral SNPs strongly suggested that non-neutral processes are the primary driver of population structure within the species' native range, while neutral processes appear to dominate in the invaded range. Non-neutral native range structure coincides with significant differences in intraspecific thermal tolerance, suggesting temperature as a potential selective agent. These results underline the importance of adaptation in shaping intraspecific differences even in high-gene-flow marine invasive species. They also demonstrate that high-throughput approaches have broad utility in determining neutral structure in recent invasions of such species. Together, neutral and non-neutral data derived from high-throughput approaches may increase the understanding of invasion dynamics in high-dispersal species. PMID:26118396

  14. Transcriptome sequencing reveals both neutral and adaptive genome dynamics in a marine invader.

    PubMed

    Tepolt, C K; Palumbi, S R

    2015-08-01

    Species invasions cause significant ecological and economic damage, and genetic information is important to understanding and managing invasive species. In the ocean, many invasive species have high dispersal and gene flow, lowering the discriminatory power of traditional genetic approaches. High-throughput sequencing holds tremendous promise for increasing resolution and illuminating the relative contributions of selection and drift in marine invasion, but has not yet been used to compare the diversity and dynamics of a high-dispersal invader in its native and invaded ranges. We test a transcriptome-based approach in the European green crab (Carcinus maenas), a widespread invasive species with high gene flow and a well-known invasion history, in two native and five invasive populations. A panel of 10 809 transcriptome-derived nuclear SNPs identified significant population structure among highly bottlenecked invasive populations that were previously undifferentiated with traditional markers. Comparing the full data set and a subset of 9246 putatively neutral SNPs strongly suggested that non-neutral processes are the primary driver of population structure within the species' native range, while neutral processes appear to dominate in the invaded range. Non-neutral native range structure coincides with significant differences in intraspecific thermal tolerance, suggesting temperature as a potential selective agent. These results underline the importance of adaptation in shaping intraspecific differences even in high gene flow marine invasive species. They also demonstrate that high-throughput approaches have broad utility in determining neutral structure in recent invasions of such species. Together, neutral and non-neutral data derived from high-throughput approaches may increase the understanding of invasion dynamics in high-dispersal species.

  15. Adaptation of an abundant Roseobacter RCA organism to pelagic systems revealed by genomic and transcriptomic analyses.

    PubMed

    Voget, Sonja; Wemheuer, Bernd; Brinkhoff, Thorsten; Vollmers, John; Dietrich, Sascha; Giebel, Helge-Ansgar; Beardsley, Christine; Sardemann, Carla; Bakenhus, Insa; Billerbeck, Sara; Daniel, Rolf; Simon, Meinhard

    2015-02-01

    The RCA (Roseobacter clade affiliated) cluster, with an internal 16S rRNA gene sequence similarity of >98%, is the largest cluster of the marine Roseobacter clade and most abundant in temperate to (sub)polar oceans, constituting up to 35% of total bacterioplankton. The genome analysis of the first described species of the RCA cluster, Planktomarina temperata RCA23, revealed that this phylogenetic lineage is deeply branching within the Roseobacter clade. It shares no more than 65.7% of homologous genes with any other organism of this clade. The genome is the smallest of all closed genomes of the Roseobacter clade, exhibits various features of genome streamlining and encompasses genes for aerobic anoxygenic photosynthesis (AAP) and CO oxidation. In order to assess the biogeochemical significance of the RCA cluster we investigated a phytoplankton spring bloom in the North Sea. This cluster constituted 5.1% of the total, but 10-31% (mean 18.5%) of the active bacterioplankton. A metatranscriptomic analysis showed that the genome of P. temperata RCA23 was transcribed to 94% in the bloom with some variations during day and night. The genome of P. temperata RCA23 was also retrieved to 84% from metagenomic data sets from a Norwegian fjord and to 82% from stations of the Global Ocean Sampling expedition in the northwestern Atlantic. In this region, up to 6.5% of the total reads mapped on the genome of P. temperata RCA23. This abundant taxon appears to be a major player in ocean biogeochemistry. PMID:25083934

  16. Adaptation of an abundant Roseobacter RCA organism to pelagic systems revealed by genomic and transcriptomic analyses

    PubMed Central

    Voget, Sonja; Wemheuer, Bernd; Brinkhoff, Thorsten; Vollmers, John; Dietrich, Sascha; Giebel, Helge-Ansgar; Beardsley, Christine; Sardemann, Carla; Bakenhus, Insa; Billerbeck, Sara; Daniel, Rolf; Simon, Meinhard

    2015-01-01

    The RCA (Roseobacter clade affiliated) cluster, with an internal 16S rRNA gene sequence similarity of >98%, is the largest cluster of the marine Roseobacter clade and most abundant in temperate to (sub)polar oceans, constituting up to 35% of total bacterioplankton. The genome analysis of the first described species of the RCA cluster, Planktomarina temperata RCA23, revealed that this phylogenetic lineage is deeply branching within the Roseobacter clade. It shares no more than 65.7% of homologous genes with any other organism of this clade. The genome is the smallest of all closed genomes of the Roseobacter clade, exhibits various features of genome streamlining and encompasses genes for aerobic anoxygenic photosynthesis (AAP) and CO oxidation. In order to assess the biogeochemical significance of the RCA cluster we investigated a phytoplankton spring bloom in the North Sea. This cluster constituted 5.1% of the total, but 10–31% (mean 18.5%) of the active bacterioplankton. A metatranscriptomic analysis showed that the genome of P. temperata RCA23 was transcribed to 94% in the bloom with some variations during day and night. The genome of P. temperata RCA23 was also retrieved to 84% from metagenomic data sets from a Norwegian fjord and to 82% from stations of the Global Ocean Sampling expedition in the northwestern Atlantic. In this region, up to 6.5% of the total reads mapped on the genome of P. temperata RCA23. This abundant taxon appears to be a major player in ocean biogeochemistry. PMID:25083934

  17. Capturing the response of Clostridium acetobutylicum to chemical stressors using a regulated genome-scale metabolic model

    SciTech Connect

    Dash, Satyakam; Mueller, Thomas J.; Venkataramanan, Keerthi P.; Papoutsakis, Eleftherios T.; Maranas, Costas D.

    2014-10-14

    Clostridia are anaerobic Gram-positive Firmicutes containing broad and flexible systems for substrate utilization, which have been used successfully to produce a range of industrial compounds. Clostridium acetobutylicum has been used to produce butanol on an industrial scale through acetone-butanol-ethanol (ABE) fermentation. A genome-scale metabolic (GSM) model is a powerful tool for understanding the metabolic capacities of an organism and developing metabolic engineering strategies for strain development. The integration of stress related specific transcriptomics information with the GSM model provides opportunities for elucidating the focal points of regulation.

  18. Evaluating genome-scale approaches to eukaryotic DNA replication

    PubMed Central

    Gilbert, David M.

    2010-01-01

    Mechanisms regulating where and when eukaryotic DNA replication initiates remain a mystery. Recently, genome-scale methods have been brought to bear on this problem. The identification of replication origins and their associated proteins in yeasts is a well-integrated investigative tool, but corresponding data sets from multicellular organisms are scarce. By contrast, standardized protocols for evaluating replication timing have generated informative data sets for most eukaryotic systems. Here, I summarize the genome-scale methods that are most frequently used to analyse replication in eukaryotes, the kinds of questions each method can address and the technical hurdles that must be overcome to gain a complete understanding of the nature of eukaryotic replication origins. PMID:20811343

  19. Selection and evaluation of novel reference genes for quantitative reverse transcription PCR (qRT-PCR) based on genome and transcriptome data in Brassica napus L.

    PubMed

    Yang, Hongli; Liu, Jing; Huang, Shunmou; Guo, Tingting; Deng, Linbin; Hua, Wei

    2014-03-15

    Selection of reference genes in Brassica napus, a tetraploid (4×) species, is a very difficult task without genome and transcriptome information. To date, only a few traditional reference genes, which show significant expression differences under different conditions, have been used in B. napus. In the present study, based on genome and transcriptome data of the rapeseed Zhongshuang-11 cultivar, 14 candidate reference genes were screened for investigation in different tissues, cultivars, and treated conditions of B. napus. These genes were as follows: ELF5, ENTH, F-BOX7, F-BOX2, FYPP1, GDI1, GYF, MCP2d, OTP80, PPR, SPOC, Unknown1, Unknown2 and UBA. Of these, all except GYF and FYPP1 (12 genes) were found to perform better than the traditional reference genes ACTIN7 and GAPDH. To further validate the accuracy of the newly developed reference genes in normalization, expression levels of BnCAT1 (B. napus catalase 1) in different rapeseed tissues and in seedlings under stress conditions were normalized by the three most stable reference genes, PPR, GDI1, and ENTH, and the normalization results differed little. To the best of our knowledge, this is the first time B. napus reference genes have been provided with the help of complete genome and transcriptome information. The new reference genes provided in this study are more accurate than previously reported reference genes in quantifying expression levels of B. napus genes.
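
The normalization step described in this abstract can be illustrated with a minimal sketch. It assumes the common delta-Ct approach with 100% amplification efficiency, which the abstract does not confirm as the authors' method; the function name and all Ct values are hypothetical. Averaging the Ct values of several reference genes is equivalent to taking the geometric mean of their relative quantities.

```python
from statistics import mean


def relative_expression(target_ct, reference_cts):
    """Relative expression of a target gene against several reference genes.

    Assumes the delta-Ct method with perfect doubling per cycle
    (an illustrative assumption, not the method reported in the abstract).
    """
    # Mean Ct of the reference genes (e.g. PPR, GDI1, ENTH).
    delta_ct = target_ct - mean(reference_cts)
    # Each cycle of difference corresponds to a two-fold change.
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for BnCAT1 and three reference genes.
print(relative_expression(25.0, [20.0, 22.0, 24.0]))
```

If the three reference genes are equally stable, swapping any one of them for another should change the result only slightly, which is the kind of agreement the abstract reports.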

  20. Integration of genetic, genomic and transcriptomic information identifies putative regulators of adventitious root formation in Populus

    DOE PAGESBeta

    Ribeiro, Cintia L.; Silva, Cynthia M.; Drost, Derek R.; Novaes, Evandro; Novaes, Carolina R. D. B.; Dervinis, Christopher; Kirst, Matias

    2016-03-16

    Adventitious roots (AR) develop from tissues other than the primary root, in a process physiologically regulated by phytohormones. Adventitious roots provide structural support and contribute to water and nutrient absorption, and are critical for commercial vegetative propagation of several crops. Here we quantified the number of AR, root architectural traits and root biomass in cuttings from a pseudo-backcross population of Populus deltoides and Populus trichocarpa. Quantitative trait loci (QTL) mapping and whole-transcriptome analysis of individuals with alternative QTL alleles for AR number were used to identify putative regulators of AR development. As a result, parental individuals and progeny showed extensive segregation for AR developmental traits. Quantitative trait loci for number of AR mapped consistently in the same interval of linkage group (LG) II and LG XIV, explaining 7–10% of the phenotypic variation. A time series transcriptome analysis identified 26,121 genes differentially expressed during AR development, particularly during the first 24 h after cuttings were harvested. Of those, 1929 genes were differentially regulated between individuals carrying alternative alleles for the two QTL for number of AR, at one or more time points. Eighty-one of these genes were physically located within the QTL intervals for number of AR, including putative homologs of the Arabidopsis genes SUPERROOT2 (SUR2) and TRYPTOPHAN SYNTHASE ALPHA CHAIN (TSA1), both of which are involved in the auxin indole-3-acetic acid (IAA) biosynthesis pathway. In conclusion, this study suggests the involvement of two genes of the tryptophan-dependent auxin biosynthesis pathway, SUR2 and TSA1, in the regulation of a critical trait for the clonal propagation of woody species. A possible model for this regulation is that poplar individuals that have poor AR formation synthesize auxin indole-3-acetic acid (IAA) primarily through the tryptophan (Trp) pathway. Much of

  1. Genome, transcriptome, and functional analyses of Penicillium expansum provide new insights into secondary metabolism and pathogenicity

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The relationship between secondary metabolism and infection in pathogenic fungi has remained largely elusive. Penicillium comprises a group of plant pathogens with varying host specificities and with the ability to produce a wide array of secondary metabolites. The genomes of three Penicillium exp...

  2. Genome structures and transcriptomes signify niche adaptation for the multiple-ion-tolerant extremophyte Schrenkiella parvula.

    PubMed

    Oh, Dong-Ha; Hong, Hyewon; Lee, Sang Yeol; Yun, Dae-Jin; Bohnert, Hans J; Dassanayake, Maheshi

    2014-04-01

    Schrenkiella parvula (formerly Thellungiella parvula), a close relative of Arabidopsis (Arabidopsis thaliana) and Brassica crop species, thrives on the shores of Lake Tuz, Turkey, where soils accumulate high concentrations of multiple-ion salts. Despite the stark differences in adaptations to extreme salt stresses, the genomes of S. parvula and Arabidopsis show extensive synteny. S. parvula completes its life cycle in the presence of Na⁺, K⁺, Mg²⁺, Li⁺, and borate at soil concentrations lethal to Arabidopsis. Genome structural variations, including tandem duplications and translocations of genes, interrupt the colinearity observed throughout the S. parvula and Arabidopsis genomes. Structural variations distinguish homologous gene pairs characterized by divergent promoter sequences and basal-level expression strengths. Comparative RNA sequencing reveals the enrichment of ion-transport functions among genes with higher expression in S. parvula, while pathogen defense-related genes show higher expression in Arabidopsis. Key stress-related ion transporter genes in S. parvula showed increased copy number, higher transcript dosage, and evidence for subfunctionalization. This extremophyte offers a framework to identify the requisite adjustments of genomic architecture and expression control for a set of genes found in most plants in a way to support distinct niche adaptation and lifestyles. PMID:24563282

  3. Characterizing acetogenic metabolism using a genome-scale metabolic reconstruction of Clostridium ljungdahlii

    PubMed Central

    2013-01-01

    Background: The metabolic capabilities of acetogens to ferment a wide range of sugars, to grow autotrophically on H2/CO2, and more importantly on synthesis gas (H2/CO/CO2) make them very attractive candidates as production hosts for biofuels and biocommodities. Acetogenic metabolism is considered one of the earliest modes of bacterial metabolism. A thorough understanding of various factors governing the metabolism, in particular energy conservation mechanisms, is critical for metabolic engineering of acetogens for targeted production of desired chemicals. Results: Here, we present the genome-scale metabolic network of Clostridium ljungdahlii, the first such model for an acetogen. This genome-scale model (iHN637) consisting of 637 genes, 785 reactions, and 698 metabolites captures all the major central metabolic and biosynthetic pathways, in particular pathways involved in carbon fixation and energy conservation. A combination of metabolic modeling, with physiological and transcriptomic data provided insights into autotrophic metabolism as well as aided the characterization of a nitrate reduction pathway in C. ljungdahlii. Analysis of the iHN637 metabolic model revealed that flavin based electron bifurcation played a key role in energy conservation during autotrophic growth and helped identify genes for some of the critical steps in this mechanism. Conclusions: iHN637 represents a predictive model that recapitulates experimental data, and provides valuable insights into the metabolic response of C. ljungdahlii to genetic perturbations under various growth conditions. Thus, the model will be instrumental in guiding metabolic engineering of C. ljungdahlii for the industrial production of biocommodities and biofuels. PMID:24274140

  4. Characterizing acetogenic metabolism using a genome-scale metabolic reconstruction of Clostridium ljungdahlii

    SciTech Connect

    Nagarajan, H; Sahin, M; Nogales, J; Latif, H; Lovley, DR; Ebrahim, A; Zengler, K

    2013-11-25

    Background: The metabolic capabilities of acetogens to ferment a wide range of sugars, to grow autotrophically on H2/CO2, and more importantly on synthesis gas (H2/CO/CO2) make them very attractive candidates as production hosts for biofuels and biocommodities. Acetogenic metabolism is considered one of the earliest modes of bacterial metabolism. A thorough understanding of various factors governing the metabolism, in particular energy conservation mechanisms, is critical for metabolic engineering of acetogens for targeted production of desired chemicals. Results: Here, we present the genome-scale metabolic network of Clostridium ljungdahlii, the first such model for an acetogen. This genome-scale model (iHN637) consisting of 637 genes, 785 reactions, and 698 metabolites captures all the major central metabolic and biosynthetic pathways, in particular pathways involved in carbon fixation and energy conservation. A combination of metabolic modeling, with physiological and transcriptomic data provided insights into autotrophic metabolism as well as aided the characterization of a nitrate reduction pathway in C. ljungdahlii. Analysis of the iHN637 metabolic model revealed that flavin based electron bifurcation played a key role in energy conservation during autotrophic growth and helped identify genes for some of the critical steps in this mechanism. Conclusions: iHN637 represents a predictive model that recapitulates experimental data, and provides valuable insights into the metabolic response of C. ljungdahlii to genetic perturbations under various growth conditions. Thus, the model will be instrumental in guiding metabolic engineering of C. ljungdahlii for the industrial production of biocommodities and biofuels.

  5. Recurrent mutation of the ID3 gene in Burkitt lymphoma identified by integrated genome, exome and transcriptome sequencing.

    PubMed

    Richter, Julia; Schlesner, Matthias; Hoffmann, Steve; Kreuz, Markus; Leich, Ellen; Burkhardt, Birgit; Rosolowski, Maciej; Ammerpohl, Ole; Wagener, Rabea; Bernhart, Stephan H; Lenze, Dido; Szczepanowski, Monika; Paulsen, Maren; Lipinski, Simone; Russell, Robert B; Adam-Klages, Sabine; Apic, Gordana; Claviez, Alexander; Hasenclever, Dirk; Hovestadt, Volker; Hornig, Nadine; Korbel, Jan O; Kube, Dieter; Langenberger, David; Lawerenz, Chris; Lisfeld, Jasmin; Meyer, Katharina; Picelli, Simone; Pischimarov, Jordan; Radlwimmer, Bernhard; Rausch, Tobias; Rohde, Marius; Schilhabel, Markus; Scholtysik, René; Spang, Rainer; Trautmann, Heiko; Zenz, Thorsten; Borkhardt, Arndt; Drexler, Hans G; Möller, Peter; MacLeod, Roderick A F; Pott, Christiane; Schreiber, Stefan; Trümper, Lorenz; Loeffler, Markus; Stadler, Peter F; Lichter, Peter; Eils, Roland; Küppers, Ralf; Hummel, Michael; Klapper, Wolfram; Rosenstiel, Philip; Rosenwald, Andreas; Brors, Benedikt; Siebert, Reiner

    2012-12-01

    Burkitt lymphoma is a mature aggressive B-cell lymphoma derived from germinal center B cells. Its cytogenetic hallmark is the Burkitt translocation t(8;14)(q24;q32) and its variants, which juxtapose the MYC oncogene with one of the three immunoglobulin loci. Consequently, MYC is deregulated, resulting in massive perturbation of gene expression. Nevertheless, MYC deregulation alone seems not to be sufficient to drive Burkitt lymphomagenesis. By whole-genome, whole-exome and transcriptome sequencing of four prototypical Burkitt lymphomas with immunoglobulin gene (IG)-MYC translocation, we identified seven recurrently mutated genes. One of these genes, ID3, mapped to a region of focal homozygous loss in Burkitt lymphoma. In an extended cohort, 36 of 53 molecularly defined Burkitt lymphomas (68%) carried potentially damaging mutations of ID3. These were strongly enriched at somatic hypermutation motifs. Only 6 of 47 other B-cell lymphomas with the IG-MYC translocation (13%) carried ID3 mutations. These findings suggest that cooperation between ID3 inactivation and IG-MYC translocation is a hallmark of Burkitt lymphomagenesis.
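
The mutation frequencies quoted in this abstract can be checked with a short calculation. The sketch below recomputes the two percentages from the reported counts; the odds ratio is an illustrative addition that does not appear in the abstract, and the function names are hypothetical.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


def odds_ratio(a_hits, a_total, b_hits, b_total):
    """Odds of a hit in group A divided by the odds of a hit in group B."""
    a_miss = a_total - a_hits
    b_miss = b_total - b_hits
    return (a_hits * b_miss) / (a_miss * b_hits)


# Counts reported in the abstract: ID3-mutated cases / cases examined.
print(percent(36, 53))  # molecularly defined Burkitt lymphomas
print(percent(6, 47))   # other B-cell lymphomas with IG-MYC translocation
print(odds_ratio(36, 53, 6, 47))
```

The counts reproduce the 68% and 13% given in the text; a formal significance test would be needed before reading anything further into the odds ratio.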

  6. Hyperlipidemia-associated gene variations and expression patterns revealed by whole-genome and transcriptome sequencing of rabbit models.

    PubMed

    Wang, Zhen; Zhang, Jifeng; Li, Hong; Li, Junyi; Niimi, Manabu; Ding, Guohui; Chen, Haifeng; Xu, Jie; Zhang, Hongjiu; Xu, Ze; Dai, Yulin; Gui, Tuantuan; Li, Shengdi; Liu, Zhi; Wu, Sujuan; Cao, Mushui; Zhou, Lu; Lu, Xingyu; Wang, Junxia; Yang, Jing; Fu, Yunhe; Yang, Dongshan; Song, Jun; Zhu, Tianqing; Li, Shen; Ning, Bo; Wang, Ziyun; Koike, Tomonari; Shiomi, Masashi; Liu, Enqi; Chen, Luonan; Fan, Jianglin; Chen, Y Eugene; Li, Yixue

    2016-01-01

    The rabbit (Oryctolagus cuniculus) is an important experimental animal for studying human diseases, such as hypercholesterolemia and atherosclerosis. Despite this, genetic information and RNA expression profiling of laboratory rabbits are lacking. Here, we characterized the whole-genome variants of three breeds of the most popular experimental rabbits, New Zealand White (NZW), Japanese White (JW) and Watanabe heritable hyperlipidemic (WHHL) rabbits. Although the genetic diversity of WHHL rabbits was relatively low, they accumulated a large proportion of high-frequency deleterious mutations due to the small population size. Some of the deleterious mutations were associated with the pathophysiology of WHHL rabbits in addition to the LDLR deficiency. Furthermore, we conducted transcriptome sequencing of different organs of both WHHL and cholesterol-rich diet (Chol)-fed NZW rabbits. We found that gene expression profiles of the two rabbit models were essentially similar in the aorta, even though they exhibited different types of hypercholesterolemia. In contrast, Chol-fed rabbits, but not WHHL rabbits, exhibited pronounced inflammatory responses and abnormal lipid metabolism in the liver. These results provide valuable insights into identifying therapeutic targets of hypercholesterolemia and atherosclerosis with rabbit models. PMID:27245873

  7. Functional genomics of adaptation to hypoxic cold-stress in high-altitude deer mice: transcriptomic plasticity and thermogenic performance

    PubMed Central

    Connaty, Alex D.; McClelland, Grant B.; Storz, Jay F.

    2015-01-01

    In species that are distributed across steep environmental gradients, adaptive variation in physiological performance may be attributable to transcriptional plasticity in underlying regulatory networks. Here we report the results of common-garden experiments that were designed to elucidate the role of regulatory plasticity in evolutionary adaptation to hypoxic cold-stress in deer mice (Peromyscus maniculatus). We integrated genomic transcriptional profiles with measures of metabolic enzyme activities and whole-animal thermogenic performance under hypoxia in highland (4350 m) and lowland (430 m) mice from three experimental groups: (1) wild-caught mice that were sampled at their native elevations; (2) wild-caught/lab-reared mice that were deacclimated to low-elevation conditions in a common-garden lab environment; and (3) the F1 progeny of deacclimated mice that were maintained under the same low-elevation common-garden conditions. In each experimental group, highland mice exhibited greater thermogenic capacities than lowland mice, and this enhanced performance was associated with upregulation of transcriptional modules that influence several hierarchical steps in the O2 cascade, including tissue O2 diffusion (angiogenesis) and tissue O2 utilization (metabolic fuel use and cellular oxidative capacity). Most of these performance-related transcriptomic changes occurred over physiological and developmental timescales, suggesting that regulatory plasticity makes important contributions to fitness-related physiological performance in highland deer mice. PMID:24102503

  8. Genome-wide comparison of the transcriptomes of highly enriched normal and chronic myeloid leukemia stem and progenitor cell populations.

    PubMed

    Gerber, Jonathan M; Gucwa, Jessica L; Esopi, David; Gurel, Meltem; Haffner, Michael C; Vala, Milada; Nelson, William G; Jones, Richard J; Yegnasubramanian, Srinivasan

    2013-05-01

    The persistence of leukemia stem cells (LSCs) in chronic myeloid leukemia (CML) despite tyrosine kinase inhibition (TKI) may explain relapse after TKI withdrawal. Here we performed genome-wide transcriptome analysis of highly refined CML and normal stem and progenitor cell populations, using exon microarrays, to identify novel targets for the eradication of CML LSCs. We identified 97 genes that were differentially expressed in CML versus normal stem and progenitor cells. These included cell surface genes significantly upregulated in CML LSCs: DPP4 (CD26), IL2RA (CD25), PTPRD, CACNA1D, IL1RAP, SLC4A4, and KCNK5. Further analyses of the LSCs revealed dysregulation of normal cellular processes, evidenced by alternative splicing of genes in key cancer signaling pathways such as p53 signaling (e.g. PERP, CDKN1A), kinase binding (e.g. DUSP12, MARCKS), and cell proliferation (MYCN, TIMELESS); downregulation of pro-differentiation and TGF-β/BMP signaling pathways; upregulation of oxidative metabolism and DNA repair pathways; and activation of inflammatory cytokines, including CCL2, and multiple oncogenes (e.g., CCND1). These data represent an important resource for understanding the molecular changes in CML LSCs, which may be exploited to develop novel therapies to eradicate these cells and achieve a cure.

  9. Unraveling adaptation of Pontibacter korlensis to radiation and infertility in desert through complete genome and comparative transcriptomic analysis

    PubMed Central

    Dai, Jun; Dai, Wenkui; Qiu, Chuangzhao; Yang, Zhenyu; Zhang, Yi; Zhou, Mengzhou; Zhang, Lei; Fang, Chengxiang; Gao, Qiang; Yang, Qiao; Li, Xin; Wang, Zhi; Wang, Zhiyong; Jia, Zhenhua; Chen, Xiong

    2015-01-01

    The desert is a harsh habitat for flora and microbial life due to its aridness and strong radiation. In this study, we constructed the first complete and deeply annotated genome of the genus Pontibacter (Pontibacter korlensis X14-1T = CCTCC AB 206081T, X14-1). Reconstruction of the sugar metabolism process indicated that strain X14-1 can utilize diverse sugars, including cellulose, starch and sucrose; this result is consistent with previous experiments. Strain X14-1 is also able to resist desiccation and radiation in the desert through well-armed systems related to DNA repair, reactive oxygen species (ROS) detoxification and the OstAB and TreYZ pathways for trehalose synthesis. A comparative transcriptomic analysis under gamma radiation revealed that strain X14-1 presents high-efficacy operating responses to radiation, including the robust expression of catalase and the manganese transport protein. Evaluation of 73 novel genes that are differentially expressed showed that some of these genes may contribute to the strain’s adaptation to radiation and desiccation through ferric transport and preservation. PMID:26057562

  10. Hyperlipidemia-associated gene variations and expression patterns revealed by whole-genome and transcriptome sequencing of rabbit models

    PubMed Central

    Wang, Zhen; Zhang, Jifeng; Li, Hong; Li, Junyi; Niimi, Manabu; Ding, Guohui; Chen, Haifeng; Xu, Jie; Zhang, Hongjiu; Xu, Ze; Dai, Yulin; Gui, Tuantuan; Li, Shengdi; Liu, Zhi; Wu, Sujuan; Cao, Mushui; Zhou, Lu; Lu, Xingyu; Wang, Junxia; Yang, Jing; Fu, Yunhe; Yang, Dongshan; Song, Jun; Zhu, Tianqing; Li, Shen; Ning, Bo; Wang, Ziyun; Koike, Tomonari; Shiomi, Masashi; Liu, Enqi; Chen, Luonan; Fan, Jianglin; Chen, Y. Eugene; Li, Yixue

    2016-01-01

    The rabbit (Oryctolagus cuniculus) is an important experimental animal for studying human diseases, such as hypercholesterolemia and atherosclerosis. Despite this, genetic information and RNA expression profiling of laboratory rabbits are lacking. Here, we characterized the whole-genome variants of three breeds of the most popular experimental rabbits, New Zealand White (NZW), Japanese White (JW) and Watanabe heritable hyperlipidemic (WHHL) rabbits. Although the genetic diversity of WHHL rabbits was relatively low, they accumulated a large proportion of high-frequency deleterious mutations due to the small population size. Some of the deleterious mutations were associated with the pathophysiology of WHHL rabbits in addition to the LDLR deficiency. Furthermore, we conducted transcriptome sequencing of different organs of both WHHL and cholesterol-rich diet (Chol)-fed NZW rabbits. We found that gene expression profiles of the two rabbit models were essentially similar in the aorta, even though they exhibited different types of hypercholesterolemia. In contrast, Chol-fed rabbits, but not WHHL rabbits, exhibited pronounced inflammatory responses and abnormal lipid metabolism in the liver. These results provide valuable insights into identifying therapeutic targets of hypercholesterolemia and atherosclerosis with rabbit models. PMID:27245873

  11. Carotenoid metabolic profiling and transcriptome-genome mining reveal functional equivalence among blue-pigmented copepods and appendicularia.

    PubMed

    Mojib, Nazia; Amad, Maan; Thimma, Manjula; Aldanondo, Naroa; Kumaran, Mande; Irigoien, Xabier

    2014-06-01

    The tropical oligotrophic oceanic areas are characterized by high water transparency and annual solar radiation. Under these conditions, a large number of phylogenetically diverse mesozooplankton species living in the surface waters (neuston) are found to be blue pigmented. In the present study, we focused on understanding the metabolic and genetic basis of the observed blue phenotype functional equivalence between the blue-pigmented organisms from the phylum Arthropoda, subclass Copepoda (Acartia fossae) and the phylum Chordata, class Appendicularia (Oikopleura dioica) in the Red Sea. Previous studies have shown that carotenoid-protein complexes are responsible for blue coloration in crustaceans. Therefore, we performed carotenoid metabolic profiling using both targeted and nontargeted (high-resolution mass spectrometry) approaches in four different blue-pigmented genera of copepods and one blue-pigmented species of appendicularia. Astaxanthin was found to be the principal carotenoid in all the species. The pathway analysis showed that all the species can synthesize astaxanthin from β-carotene, ingested from dietary sources, via 3-hydroxyechinenone, canthaxanthin, zeaxanthin, adonirubin or adonixanthin. Further, using a de novo assembled transcriptome of blue A. fossae (subclass Copepoda), we identified highly expressed homologous β-carotene hydroxylase enzymes and putative carotenoid-binding proteins responsible for astaxanthin formation and the blue phenotype. In blue O. dioica (class Appendicularia), corresponding putative genes were identified from the reference genome. Collectively, our data provide molecular evidence for the bioconversion and accumulation of blue astaxanthin-protein complexes underpinning the observed ecological functional equivalence and adaptive convergence among neustonic mesozooplankton. PMID:24803335

  12. Genome-wide transcriptome profiling provides insights into floral bud development of summer-flowering Camellia azalea

    PubMed Central

    Fan, Zhengqi; Li, Jiyuan; Li, Xinlei; Wu, Bin; Wang, Jiangying; Liu, Zhongchi; Yin, Hengfu

    2015-01-01

    The transition from vegetative to reproductive growth in woody perennials involves pathways controlling flowering timing, bud dormancy and outgrowth in response to seasonal cues. However, little is known about the mechanism governing the adaptation of signaling pathways to environmental conditions in trees. Camellia azalea is a rare species in this genus that flowers during summer, which provides a unique resource for floral timing breeding. Here we report a comprehensive transcriptomics study to capture the global gene profiles during floral bud development in C. azalea. We examined the genome-wide gene expression between three developmental stages including floral bud initiation, floral organ differentiation and bud outgrowth, and identified nine co-expression clusters with distinctive patterns. Further, we identified the differentially expressed genes (DEGs) during development and characterized the functional properties of DEGs by Gene Ontology analysis. We showed that transition from floral bud initiation to floral organ differentiation required changes of genes in flowering timing regulation, while transition to floral bud outgrowth was regulated by various pathways such as cold and light signaling, phytohormone pathways and plant metabolism. Further analyses of dormancy associated MADS-box genes revealed that SVP- and AGL24-like genes displayed distinct expression patterns, suggesting divergent roles during floral bud development. PMID:25978548

  13. Carotenoid metabolic profiling and transcriptome-genome mining reveal functional equivalence among blue-pigmented copepods and appendicularia.

    PubMed

    Mojib, Nazia; Amad, Maan; Thimma, Manjula; Aldanondo, Naroa; Kumaran, Mande; Irigoien, Xabier

    2014-06-01

    The tropical oligotrophic oceanic areas are characterized by high water transparency and annual solar radiation. Under these conditions, a large number of phylogenetically diverse mesozooplankton species living in the surface waters (neuston) are found to be blue pigmented. In the present study, we focused on understanding the metabolic and genetic basis of the observed blue phenotype functional equivalence between the blue-pigmented organisms from the phylum Arthropoda, subclass Copepoda (Acartia fossae) and the phylum Chordata, class Appendicularia (Oikopleura dioica) in the Red Sea. Previous studies have shown that carotenoid-protein complexes are responsible for blue coloration in crustaceans. Therefore, we performed carotenoid metabolic profiling using both targeted and nontargeted (high-resolution mass spectrometry) approaches in four different blue-pigmented genera of copepods and one blue-pigmented species of appendicularia. Astaxanthin was found to be the principal carotenoid in all the species. The pathway analysis showed that all the species can synthesize astaxanthin from β-carotene, ingested from dietary sources, via 3-hydroxyechinenone, canthaxanthin, zeaxanthin, adonirubin or adonixanthin. Further, using the de novo assembled transcriptome of blue A. fossae (subclass Copepoda), we identified highly expressed homologous β-carotene hydroxylase enzymes and putative carotenoid-binding proteins responsible for astaxanthin formation and the blue phenotype. In blue O. dioica (class Appendicularia), corresponding putative genes were identified from the reference genome. Collectively, our data provide molecular evidence for the bioconversion and accumulation of blue astaxanthin-protein complexes underpinning the observed ecological functional equivalence and adaptive convergence among neustonic mesozooplankton.

  14. Integration of genomic, transcriptomic and proteomic data identifies two biologically distinct subtypes of invasive lobular breast cancer

    PubMed Central

    Michaut, Magali; Chin, Suet-Feung; Majewski, Ian; Severson, Tesa M.; Bismeijer, Tycho; de Koning, Leanne; Peeters, Justine K.; Schouten, Philip C.; Rueda, Oscar M.; Bosma, Astrid J.; Tarrant, Finbarr; Fan, Yue; He, Beilei; Xue, Zheng; Mittempergher, Lorenza; Kluin, Roelof J.C.; Heijmans, Jeroen; Snel, Mireille; Pereira, Bernard; Schlicker, Andreas; Provenzano, Elena; Ali, Hamid Raza; Gaber, Alexander; O’Hurley, Gillian; Lehn, Sophie; Muris, Jettie J.F.; Wesseling, Jelle; Kay, Elaine; Sammut, Stephen John; Bardwell, Helen A.; Barbet, Aurélie S.; Bard, Floriane; Lecerf, Caroline; O’Connor, Darran P.; Vis, Daniël J.; Benes, Cyril H.; McDermott, Ultan; Garnett, Mathew J.; Simon, Iris M.; Jirström, Karin; Dubois, Thierry; Linn, Sabine C.; Gallagher, William M.; Wessels, Lodewyk F.A.; Caldas, Carlos; Bernards, Rene

    2016-01-01

    Invasive lobular carcinoma (ILC) is the second most frequently occurring histological breast cancer subtype after invasive ductal carcinoma (IDC), accounting for around 10% of all breast cancers. The molecular processes that drive the development of ILC are still largely unknown. We have performed a comprehensive genomic, transcriptomic and proteomic analysis of a large ILC patient cohort and present here an integrated molecular portrait of ILC. Mutations in CDH1 and in the PI3K pathway are the most frequent molecular alterations in ILC. We identified two main subtypes of ILCs: (i) an immune related subtype with mRNA up-regulation of PD-L1, PD-1 and CTLA-4 and greater sensitivity to DNA-damaging agents in representative cell line models; (ii) a hormone related subtype, associated with Epithelial to Mesenchymal Transition (EMT), and gain of chromosomes 1q and 8q and loss of chromosome 11q. Using the somatic mutation rate and eIF4B protein level, we identified three groups with different clinical outcomes, including a group with extremely good prognosis. We provide a comprehensive overview of the molecular alterations driving ILC and have explored links with therapy response. This molecular characterization may help to tailor treatment of ILC through the application of specific targeted, chemo- and/or immune-therapies. PMID:26729235

  15. Genome-wide transcriptomic analysis uncovers the molecular basis underlying early flowering and apetalous characteristic in Brassica napus L.

    PubMed

    Yu, Kunjiang; Wang, Xiaodong; Chen, Feng; Chen, Song; Peng, Qi; Li, Hongge; Zhang, Wei; Hu, Maolong; Chu, Pu; Zhang, Jiefu; Guan, Rongzhan

    2016-01-01

    Floral transition and petal onset, as two main aspects of flower development, are crucial to rapeseed evolutionary success and yield formation. Currently, very little is known regarding the genetic architecture that regulates flowering time and petal morphogenesis in Brassica napus. In the present study, a genome-wide transcriptomic analysis was performed with an absolutely apetalous and early flowering line, APL01, and a normally petalled line, PL01, using high-throughput RNA sequencing. In total, 13,205 differentially expressed genes were detected, of which 6111 genes were significantly down-regulated, while 7094 genes were significantly up-regulated in the young inflorescences of APL01 compared with PL01. The expression levels of a vast number of genes involved in protein biosynthesis were altered in response to the early flowering and apetalous character. Based on the putative rapeseed flowering genes, an early flowering network, composed mainly of vernalization and photoperiod pathways, was built. Additionally, 36 putative upstream genes possibly governing the apetalous character of line APL01 were identified, and six genes potentially regulating petal origination were obtained by combining these results with three petal-related quantitative trait loci. These findings will facilitate understanding of the molecular mechanisms underlying floral transition and petal initiation in B. napus. PMID:27460760

  16. Genome-wide transcriptomic analysis uncovers the molecular basis underlying early flowering and apetalous characteristic in Brassica napus L

    PubMed Central

    Yu, Kunjiang; Wang, Xiaodong; Chen, Feng; Chen, Song; Peng, Qi; Li, Hongge; Zhang, Wei; Hu, Maolong; Chu, Pu; Zhang, Jiefu; Guan, Rongzhan

    2016-01-01

    Floral transition and petal onset, as two main aspects of flower development, are crucial to rapeseed evolutionary success and yield formation. Currently, very little is known regarding the genetic architecture that regulates flowering time and petal morphogenesis in Brassica napus. In the present study, a genome-wide transcriptomic analysis was performed with an absolutely apetalous and early flowering line, APL01, and a normally petalled line, PL01, using high-throughput RNA sequencing. In total, 13,205 differentially expressed genes were detected, of which 6111 genes were significantly down-regulated, while 7094 genes were significantly up-regulated in the young inflorescences of APL01 compared with PL01. The expression levels of a vast number of genes involved in protein biosynthesis were altered in response to the early flowering and apetalous character. Based on the putative rapeseed flowering genes, an early flowering network, composed mainly of vernalization and photoperiod pathways, was built. Additionally, 36 putative upstream genes possibly governing the apetalous character of line APL01 were identified, and six genes potentially regulating petal origination were obtained by combining these results with three petal-related quantitative trait loci. These findings will facilitate understanding of the molecular mechanisms underlying floral transition and petal initiation in B. napus. PMID:27460760

  17. Genome-Wide Transcriptome Profiling Revealed Cotton Fuzz Fiber Development Having a Similar Molecular Model as Arabidopsis Trichome

    PubMed Central

    Wan, Qun; Zhang, Hua; Ye, Wenxue; Wu, Huaitong; Zhang, Tianzhen

    2014-01-01

    The cotton fiber, as a single-celled trichome, is a biological model system for studying cell differentiation and elongation. However, the complexity of gene expression and regulation in the fiber complicates genetic research. In this study, we performed genome-wide transcriptome profiling of Texas Marker-1 (TM-1) and five naked seed or fuzzless mutants (three dominant and two recessive) during the fuzz initial development stage. More than three million clean tags were generated from each sample, representing the expression data for 27,325 genes, which account for 72.8% of the annotated Gossypium raimondii primary transcript genes. Thousands of differentially expressed genes (DEGs) were identified between TM-1 and the mutants. Based on functional enrichment analysis, the DEGs downregulated in the mutants were enriched in protein synthesis-related genes and transcription factors, while DEGs upregulated in the mutants were enriched in DNA/chromatin structure-related genes and transcription factors. Pathway analysis showed that ATP synthesis, and sugar and lipid metabolism-related pathways play important roles in fuzz initial development. Also, we identified a large number of transcription factors, such as the MYB, bHLH, HB, WRKY, AP2/EREBP, bZIP and C2H2 zinc finger families, that were differentially expressed between TM-1 and the mutants and were also related to trichome development in Arabidopsis. PMID:24823367

  18. The transcriptomes of novel marmoset monkey embryonic stem cell lines reflect distinct genomic features

    PubMed Central

    Debowski, Katharina; Drummer, Charis; Lentes, Jana; Cors, Maren; Dressel, Ralf; Lingner, Thomas; Salinas-Riester, Gabriela; Fuchs, Sigrid; Sasaki, Erika; Behr, Rüdiger

    2016-01-01

    Embryonic stem cells (ESCs) are useful for the study of embryonic development. However, since research on naturally conceived human embryos is limited, non-human primate (NHP) embryos and NHP ESCs represent an excellent alternative to the corresponding human entities. ESC lines derived from naturally conceived NHP embryos, however, are still very rare. Here, we report the generation and characterization of four novel ESC lines derived from natural preimplantation embryos of the common marmoset monkey (Callithrix jacchus). For the first time, we document the derivation of NHP ESCs from morula stages. We show that quantitative chromosome-wise transcriptome analyses precisely reflect trisomies present in both morula-derived ESC lines. We also demonstrate that the female ESC lines exhibit different states of X-inactivation, which is impressively reflected by the abundance of the lncRNA X inactive-specific transcript (XIST). The novel marmoset ESC lines will promote basic primate embryo and ESC studies as well as preclinical testing of ESC-based regenerative approaches in NHP. PMID:27385131
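
    The chromosome-wise comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the input layout, the function names and the 0.4 threshold are assumptions made for illustration, as is the idea that expression scales roughly with copy number, so that a trisomic chromosome would show a median log2 ratio near log2(3/2), about 0.58.

```python
import math
from collections import defaultdict
from statistics import median


def chromosome_medians(records):
    """Median log2(sample / reference) expression ratio per chromosome.

    records: iterable of (chromosome, sample_expression, reference_expression).
    Genes with a non-positive value in either sample are skipped.
    """
    per_chromosome = defaultdict(list)
    for chromosome, sample_expr, reference_expr in records:
        if sample_expr > 0 and reference_expr > 0:
            per_chromosome[chromosome].append(math.log2(sample_expr / reference_expr))
    return {chrom: median(values) for chrom, values in per_chromosome.items()}


def flag_gains(medians, threshold=0.4):
    """Chromosomes whose median log2 ratio reaches the (assumed) threshold."""
    return sorted(chrom for chrom, value in medians.items() if value >= threshold)


# Hypothetical toy data: chr2 genes are expressed 1.5-fold higher than reference.
toy_records = [
    ("chr1", 10.0, 10.0), ("chr1", 20.0, 20.0), ("chr1", 5.0, 5.0),
    ("chr2", 15.0, 10.0), ("chr2", 30.0, 20.0), ("chr2", 7.5, 5.0),
]
toy_medians = chromosome_medians(toy_records)
print(toy_medians)
print(flag_gains(toy_medians))
```

    Using the median rather than the mean keeps a few strongly regulated genes from dominating the chromosome-level signal.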

  19. Identification of candidate genes associated with porcine meat color traits by genome-wide transcriptome analysis

    PubMed Central

    Li, Bojiang; Dong, Chao; Li, Pinghua; Ren, Zhuqing; Wang, Han; Yu, Fengxiang; Ning, Caibo; Liu, Kaiqing; Wei, Wei; Huang, Ruihua; Chen, Jie; Wu, Wangjun; Liu, Honglin

    2016-01-01

    Meat color is considered to be the most important indicator of meat quality; however, the molecular mechanisms underlying traits related to meat color remain mostly unknown. In this study, to elucidate the molecular basis of meat color, we constructed six cDNA libraries from biceps femoris (Bf) and soleus (Sol), which exhibit obvious differences in meat color, and analyzed the whole-transcriptome differences between Bf (white muscle) and Sol (red muscle) using high-throughput sequencing technology. Using the DESeq2 method, we identified 138 differentially expressed genes (DEGs) between Bf and Sol. Using the DEGseq method, we identified 770, 810, and 476 DEGs in comparisons between Bf and Sol in three separate animals. Of these DEGs, 52 overlapped. Using these data, we determined the enriched GO terms, metabolic pathways and candidate genes associated with meat color traits. Additionally, we mapped 114 non-redundant DEGs to the meat color QTLs via a comparative analysis with the porcine quantitative trait loci (QTL) database. Overall, our data serve as a valuable resource for identifying genes whose functions are critical for meat color traits and can accelerate studies of the molecular mechanisms of meat color formation. PMID:27748458
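
    One way to read the overlapping DEGs reported here is as the intersection of the per-animal DEG lists. The abstract does not say how the overlap was computed, so the following is only a sketch under that assumption, with hypothetical gene identifiers and a hypothetical function name.

```python
def shared_degs(deg_lists):
    """Return the genes that appear in every DEG list."""
    gene_sets = [set(genes) for genes in deg_lists]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical gene identifiers, not data from the study.
animal_1 = ["geneA", "geneB", "geneC", "geneD"]
animal_2 = ["geneB", "geneC", "geneE"]
animal_3 = ["geneC", "geneB", "geneF", "geneG"]

common = shared_degs([animal_1, animal_2, animal_3])
print(sorted(common))
```

    Requiring a gene to appear in all lists is a conservative choice: it favors reproducible differences over animal-specific ones.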

  20. The transcriptomes of novel marmoset monkey embryonic stem cell lines reflect distinct genomic features.

    PubMed

    Debowski, Katharina; Drummer, Charis; Lentes, Jana; Cors, Maren; Dressel, Ralf; Lingner, Thomas; Salinas-Riester, Gabriela; Fuchs, Sigrid; Sasaki, Erika; Behr, Rüdiger

    2016-01-01

    Embryonic stem cells (ESCs) are useful for the study of embryonic development. However, since research on naturally conceived human embryos is limited, non-human primate (NHP) embryos and NHP ESCs represent an excellent alternative to the corresponding human entities. ESC lines derived from naturally conceived NHP embryos, however, are still very rare. Here, we report the generation and characterization of four novel ESC lines derived from natural preimplantation embryos of the common marmoset monkey (Callithrix jacchus). For the first time, we document the derivation of NHP ESCs from morula stages. We show that quantitative chromosome-wise transcriptome analyses precisely reflect trisomies present in both morula-derived ESC lines. We also demonstrate that the female ESC lines exhibit different states of X-inactivation, which is impressively reflected by the abundance of the lncRNA X inactive-specific transcript (XIST). The novel marmoset ESC lines will promote basic primate embryo and ESC studies as well as preclinical testing of ESC-based regenerative approaches in NHP. PMID:27385131

  1. Transcriptome resources for the white-footed mouse (Peromyscus leucopus): new genomic tools for investigating ecologically divergent urban and rural populations.

    PubMed

    Harris, Stephen E; O'Neill, Rachel J; Munshi-South, Jason

    2015-03-01

    Genomic resources are important and attainable for examining evolutionary change in divergent natural populations of nonmodel species. We utilized two next-generation sequencing (NGS) platforms, 454 and SOLiD 5500XL, to assemble low-coverage transcriptomes of the white-footed mouse (Peromyscus leucopus), a widespread and abundant native rodent in eastern North America. We sequenced liver mRNA transcripts from multiple individuals collected from urban populations in New York City and rural populations in undisturbed protected areas nearby and assembled a reference transcriptome using 1 080 065 954 SOLiD 5500XL (75 bp) reads and 3 052 640 reads from the 454 GS FLX+ platform. The reference contained 40 908 contigs with an N50 of 1044 bp and a total content of 30.06 megabases (Mb). Contigs were annotated against Mus musculus UniProt databases (39.96% annotated). We identified 104 655 high-quality single nucleotide polymorphisms (SNPs) and 65 simple sequence repeats (SSRs) with flanking primers. We also used normalized read counts to identify putative gene expression differences in 10 genes between populations. There were 19 contigs significantly differentially expressed in urban populations compared to rural populations, with gene function annotations generally related to the translation and modification of proteins and those involved in immune responses. The individual transcriptomes generated in this study will be used to investigate evolutionary responses to urbanization. The reference transcriptome provides a valuable resource for the scientific community using North American Peromyscus species as emerging model systems for ecological genetics and adaptation. PMID:24980186
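
    The N50 reported for this assembly is a standard contiguity statistic: the contig length L such that contigs of length L or longer hold at least half of the assembled bases. A minimal sketch of the calculation follows; the function name and the toy contig lengths are illustrative and are not taken from the study.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running total to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths in bp (total 1500 bp).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

    Because the statistic depends only on the length distribution, it describes contiguity and says nothing about whether the contigs are assembled correctly.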

  2. Transcriptome resources for the white-footed mouse (Peromyscus leucopus): new genomic tools for investigating ecologically divergent urban and rural populations.

    PubMed

    Harris, Stephen E; O'Neill, Rachel J; Munshi-South, Jason

    2015-03-01

    Genomic resources are important and attainable for examining evolutionary change in divergent natural populations of nonmodel species. We utilized two next-generation sequencing (NGS) platforms, 454 and SOLiD 5500XL, to assemble low-coverage transcriptomes of the white-footed mouse (Peromyscus leucopus), a widespread and abundant native rodent in eastern North America. We sequenced liver mRNA transcripts from multiple individuals collected from urban populations in New York City and rural populations in undisturbed protected areas nearby and assembled a reference transcriptome using 1 080 065 954 SOLiD 5500XL (75 bp) reads and 3 052 640 reads from the 454 GS FLX+ platform. The reference contained 40 908 contigs with an N50 of 1044 bp and a total content of 30.06 megabases (Mb). Contigs were annotated against Mus musculus UniProt databases (39.96% annotated). We identified 104 655 high-quality single nucleotide polymorphisms (SNPs) and 65 simple sequence repeats (SSRs) with flanking primers. We also used normalized read counts to identify putative gene expression differences in 10 genes between populations. There were 19 contigs significantly differentially expressed in urban populations compared to rural populations, with gene function annotations generally related to the translation and modification of proteins and those involved in immune responses. The individual transcriptomes generated in this study will be used to investigate evolutionary responses to urbanization. The reference transcriptome provides a valuable resource for the scientific community using North American Peromyscus species as emerging model systems for ecological genetics and adaptation.

  3. Triplexator: Detecting nucleic acid triple helices in genomic and transcriptomic data

    PubMed Central

    Buske, Fabian A.; Bauer, Denis C.; Mattick, John S.; Bailey, Timothy L.

    2012-01-01

    Double-stranded DNA is able to form triple-helical structures by accommodating a third nucleotide strand in its major groove. This sequence-specific process offers a potent mechanism for targeting genomic loci of interest that is of great value for biotechnological and gene-therapeutic applications. It is likely that nature has leveraged this addressing system for gene regulation, because computational studies have uncovered an abundance of putative triplex target sites in various genomes, with enrichment particularly in gene promoters. However, to draw a more complete picture of the in vivo role of triplexes, not only the putative targets but also the sequences acting as the third strand and their capability to pair with the predicted target sites need to be studied. Here we present Triplexator, the first computational framework that integrates all aspects of triplex formation, and showcase its potential by discussing research examples for which the different aspects of triplex formation are important. We find that chromatin-associated RNAs have a significantly higher fraction of sequence features able to form triplexes than expected at random, suggesting their involvement in gene regulation. We furthermore identify hundreds of human genes that contain sequence features in their promoter predicted to be able to form a triplex with a target within the same promoter, suggesting the involvement of triplexes in feedback-based gene regulation. With focus on biotechnological applications, we screen mammalian genomes for high-affinity triplex target sites that can be used to target genomic loci specifically and find that triplex formation offers a resolution of ∼1300 nt. PMID:22550012

  4. The genome and transcriptome of Phalaenopsis yield insights into floral organ development and flowering regulation.

    PubMed

    Huang, Jian-Zhi; Lin, Chih-Peng; Cheng, Ting-Chi; Huang, Ya-Wen; Tsai, Yi-Jung; Cheng, Shu-Yun; Chen, Yi-Wen; Lee, Chueh-Pai; Chung, Wan-Chia; Chang, Bill Chia-Han; Chin, Shih-Wen; Lee, Chen-Yu; Chen, Fure-Chyi

    2016-01-01

    The Phalaenopsis orchid is an important potted flower of high economic value around the world. We report the 3.1 Gb draft genome assembly of an important winter flowering Phalaenopsis 'KHM190' cultivar. We generated 89.5 Gb of RNA-seq data and 113 million sRNA-seq reads and used these data to identify 41,153 protein-coding genes and 188 miRNA families. We also generated a draft genome for Phalaenopsis pulcherrima 'B8802,' a summer flowering species, via resequencing. Comparison of genome data between the two Phalaenopsis cultivars allowed the identification of 691,532 single-nucleotide polymorphisms. In this study, we reveal that the key role of PhAGL6b in the regulation of labellum organ development involves alternative splicing in the big lip mutant. Overexpression of PhAGL6b in petals or sepals leads to their conversion into a lip-like structure. We also discovered that the gibberellin pathway that regulates the expression of flowering time genes during the reproductive phase change is induced by cool temperature. Our work thus provides a valuable resource for flowering control, flower architecture development, and breeding of Phalaenopsis orchids. PMID:27190718

  5. A whole-genome sequence and transcriptome perspective on HER2-positive breast cancers

    PubMed Central

    Ferrari, Anthony; Vincent-Salomon, Anne; Pivot, Xavier; Sertier, Anne-Sophie; Thomas, Emilie; Tonon, Laurie; Boyault, Sandrine; Mulugeta, Eskeatnaf; Treilleux, Isabelle; MacGrogan, Gaëtan; Arnould, Laurent; Kielbassa, Janice; Le Texier, Vincent; Blanché, Hélène; Deleuze, Jean-François; Jacquemier, Jocelyne; Mathieu, Marie-Christine; Penault-Llorca, Frédérique; Bibeau, Frédéric; Mariani, Odette; Mannina, Cécile; Pierga, Jean-Yves; Trédan, Olivier; Bachelot, Thomas; Bonnefoi, Hervé; Romieu, Gilles; Fumoleau, Pierre; Delaloge, Suzette; Rios, Maria; Ferrero, Jean-Marc; Tarpin, Carole; Bouteille, Catherine; Calvo, Fabien; Gut, Ivo Glynne; Gut, Marta; Martin, Sancha; Nik-Zainal, Serena; Stratton, Michael R.; Pauporté, Iris; Saintigny, Pierre; Birnbaum, Daniel; Viari, Alain; Thomas, Gilles

    2016-01-01

    HER2-positive breast cancer has long proven to be a clinically distinct class of breast cancers for which several targeted therapies are now available. However, resistance to the treatment associated with specific gene expressions or mutations has been observed, revealing the underlying diversity of these cancers. Therefore, understanding the full extent of the HER2-positive disease heterogeneity still remains challenging. Here we carry out an in-depth genomic characterization of 64 HER2-positive breast tumour genomes that exhibit four subgroups, based on the expression data, with distinctive genomic features in terms of somatic mutations, copy-number changes or structural variations. The results suggest that, despite being clinically defined by a specific gene amplification, HER2-positive tumours melt into the whole luminal–basal breast cancer spectrum rather than standing apart. The results also lead to a refined ERBB2 amplicon of 106 kb and show that several cases of amplifications are compatible with a breakage–fusion–bridge mechanism. PMID:27406316

  6. A whole-genome sequence and transcriptome perspective on HER2-positive breast cancers.

    PubMed

    Ferrari, Anthony; Vincent-Salomon, Anne; Pivot, Xavier; Sertier, Anne-Sophie; Thomas, Emilie; Tonon, Laurie; Boyault, Sandrine; Mulugeta, Eskeatnaf; Treilleux, Isabelle; MacGrogan, Gaëtan; Arnould, Laurent; Kielbassa, Janice; Le Texier, Vincent; Blanché, Hélène; Deleuze, Jean-François; Jacquemier, Jocelyne; Mathieu, Marie-Christine; Penault-Llorca, Frédérique; Bibeau, Frédéric; Mariani, Odette; Mannina, Cécile; Pierga, Jean-Yves; Trédan, Olivier; Bachelot, Thomas; Bonnefoi, Hervé; Romieu, Gilles; Fumoleau, Pierre; Delaloge, Suzette; Rios, Maria; Ferrero, Jean-Marc; Tarpin, Carole; Bouteille, Catherine; Calvo, Fabien; Gut, Ivo Glynne; Gut, Marta; Martin, Sancha; Nik-Zainal, Serena; Stratton, Michael R; Pauporté, Iris; Saintigny, Pierre; Birnbaum, Daniel; Viari, Alain; Thomas, Gilles

    2016-07-13

    HER2-positive breast cancer has long proven to be a clinically distinct class of breast cancers for which several targeted therapies are now available. However, resistance to the treatment associated with specific gene expressions or mutations has been observed, revealing the underlying diversity of these cancers. Therefore, understanding the full extent of the HER2-positive disease heterogeneity still remains challenging. Here we carry out an in-depth genomic characterization of 64 HER2-positive breast tumour genomes that exhibit four subgroups, based on the expression data, with distinctive genomic features in terms of somatic mutations, copy-number changes or structural variations. The results suggest that, despite being clinically defined by a specific gene amplification, HER2-positive tumours melt into the whole luminal-basal breast cancer spectrum rather than standing apart. The results also lead to a refined ERBB2 amplicon of 106 kb and show that several cases of amplifications are compatible with a breakage-fusion-bridge mechanism.

  7. The genome and transcriptome of Phalaenopsis yield insights into floral organ development and flowering regulation

    PubMed Central

    Cheng, Ting-Chi; Huang, Ya-Wen; Tsai, Yi-Jung; Chen, Yi-Wen; Lee, Chueh-Pai; Chung, Wan-Chia

    2016-01-01

    The Phalaenopsis orchid is an important potted flower of high economic value around the world. We report the 3.1 Gb draft genome assembly of an important winter flowering Phalaenopsis ‘KHM190’ cultivar. We generated 89.5 Gb of RNA-seq data and 113 million sRNA-seq reads and used these data to identify 41,153 protein-coding genes and 188 miRNA families. We also generated a draft genome for Phalaenopsis pulcherrima ‘B8802,’ a summer flowering species, via resequencing. Comparison of genome data between the two Phalaenopsis cultivars allowed the identification of 691,532 single-nucleotide polymorphisms. In this study, we reveal that the key role of PhAGL6b in the regulation of labellum organ development involves alternative splicing in the big lip mutant. Overexpression of PhAGL6b in petals or sepals leads to their conversion into a lip-like structure. We also discovered that the gibberellin pathway that regulates the expression of flowering time genes during the reproductive phase change is induced by cool temperature. Our work thus provides a valuable resource for flowering control, flower architecture development, and breeding of Phalaenopsis orchids. PMID:27190718

  8. A whole-genome sequence and transcriptome perspective on HER2-positive breast cancers.

    PubMed

    Ferrari, Anthony; Vincent-Salomon, Anne; Pivot, Xavier; Sertier, Anne-Sophie; Thomas, Emilie; Tonon, Laurie; Boyault, Sandrine; Mulugeta, Eskeatnaf; Treilleux, Isabelle; MacGrogan, Gaëtan; Arnould, Laurent; Kielbassa, Janice; Le Texier, Vincent; Blanché, Hélène; Deleuze, Jean-François; Jacquemier, Jocelyne; Mathieu, Marie-Christine; Penault-Llorca, Frédérique; Bibeau, Frédéric; Mariani, Odette; Mannina, Cécile; Pierga, Jean-Yves; Trédan, Olivier; Bachelot, Thomas; Bonnefoi, Hervé; Romieu, Gilles; Fumoleau, Pierre; Delaloge, Suzette; Rios, Maria; Ferrero, Jean-Marc; Tarpin, Carole; Bouteille, Catherine; Calvo, Fabien; Gut, Ivo Glynne; Gut, Marta; Martin, Sancha; Nik-Zainal, Serena; Stratton, Michael R; Pauporté, Iris; Saintigny, Pierre; Birnbaum, Daniel; Viari, Alain; Thomas, Gilles

    2016-01-01

    HER2-positive breast cancer has long proven to be a clinically distinct class of breast cancers for which several targeted therapies are now available. However, resistance to the treatment associated with specific gene expressions or mutations has been observed, revealing the underlying diversity of these cancers. Therefore, understanding the full extent of the HER2-positive disease heterogeneity still remains challenging. Here we carry out an in-depth genomic characterization of 64 HER2-positive breast tumour genomes that exhibit four subgroups, based on the expression data, with distinctive genomic features in terms of somatic mutations, copy-number changes or structural variations. The results suggest that, despite being clinically defined by a specific gene amplification, HER2-positive tumours melt into the whole luminal-basal breast cancer spectrum rather than standing apart. The results also lead to a refined ERBB2 amplicon of 106 kb and show that several cases of amplifications are compatible with a breakage-fusion-bridge mechanism. PMID:27406316

  9. 13C metabolic flux analysis at a genome-scale.

    PubMed

    Gopalakrishnan, Saratram; Maranas, Costas D

    2015-11-01

    Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions, primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling up mapping models to genome scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM; 697 reactions and 595 metabolites) is constructed using the iAF1260 model as a basis, after eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon were used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both the topology and the estimated values of the metabolic fluxes remain largely consistent between the core model and the GSMM. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expands by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux is essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM, the unused ATP decreased drastically, with the lower bound matching the maintenance ATP requirement. A non
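
    The least-squares criterion described in this abstract can be illustrated on a deliberately tiny problem. The sketch below is not the EMU decomposition and not the authors' model: it assumes a single metabolite fed by two hypothetical routes with known labeling patterns, and fits the one unknown flux split by minimizing the sum of squared differences between predicted and measured labeling. All names and numbers are illustrative.

```python
def sum_squared_residuals(split, route_a, route_b, measured):
    """Sum of squared differences between predicted and measured labeling.

    The predicted pattern is a flux-weighted mix of the two routes:
    split * route_a + (1 - split) * route_b.
    """
    total = 0.0
    for a, b, m in zip(route_a, route_b, measured):
        predicted = split * a + (1.0 - split) * b
        total += (m - predicted) ** 2
    return total


def fit_split(route_a, route_b, measured):
    """Closed-form least-squares estimate of the split, clamped to [0, 1]."""
    numerator = sum((m - b) * (a - b) for a, b, m in zip(route_a, route_b, measured))
    denominator = sum((a - b) ** 2 for a, b in zip(route_a, route_b))
    if denominator == 0:
        # Both routes predict the same labeling, so the data cannot
        # distinguish them and the split is not identifiable.
        raise ValueError("routes give identical labeling; split is not identifiable")
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical labeling patterns (fractions of unlabeled and singly labeled product).
route_a = [1.0, 0.0]
route_b = [0.0, 1.0]
measured = [0.7, 0.3]

best_split = fit_split(route_a, route_b, measured)
print(best_split, sum_squared_residuals(best_split, route_a, route_b, measured))
```

    The identifiability check mirrors, in miniature, why adding alternative routes in a larger model can widen flux ranges: routes that produce the same labeling cannot be separated by the data.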

  10. Genome and transcriptome delineation of two major oncogenic pathways governing invasive ductal breast cancer development

    PubMed Central

    Aswad, Luay; Yenamandra, Surya Pavan; Ow, Ghim Siong; Grinchuk, Oleg; Ivshina, Anna V.; Kuznetsov, Vladimir A.

    2015-01-01

    Invasive ductal carcinoma (IDC) is a major histo-morphologic type of breast cancer. Histological grading (HG) of IDC is widely adopted by oncologists as a prognostic factor. However, HG evaluation is highly subjective with only 50%–85% inter-observer agreements. Specifically, the subjectivity in the assignment of the intermediate grade (histologic grade 2, HG2) breast cancers (comprising ~50% of IDC cases) results in uncertain disease outcome prediction and sub-optimal systemic therapy. Despite several attempts to identify the mechanisms underlying the HG classification, their molecular bases are poorly understood. We performed integrative bioinformatics analysis of TCGA and several other cohorts (total 1246 patients). We identified a 22-gene tumor aggressiveness grading classifier (22g-TAG) that reflects global bifurcation in the IDC transcriptomes and reclassified patients with HG2 tumors into two genetically and clinically distinct subclasses: histological grade 1-like (HG1-like) and histological grade 3-like (HG3-like). The expression profiles and clinical outcomes of these subclasses were similar to the HG1 and HG3 tumors, respectively. We further reclassified IDC into low genetic grade (LGG = HG1+HG1-like) and high genetic grade (HGG = HG3-like+HG3) subclasses. For the HG1-like and HG3-like IDCs we found subclass-specific DNA alterations, somatic mutations, oncogenic pathways, cell cycle/mitosis and stem cell-like expression signatures that discriminate between these tumors. We found similar molecular patterns in the LGG and HGG tumor classes respectively. Our results suggest the existence of two genetically-predefined IDC classes, LGG and HGG, driven by distinct oncogenic pathways. They provide novel prognostic and therapeutic biomarkers and could open unique opportunities for personalized systemic therapies of IDC patients. PMID:26474389

  11. Toward Whole-Transcriptome Editing with CRISPR-Cas9.

    PubMed

    Heckl, Dirk; Charpentier, Emmanuelle

    2015-05-21

    Targeted regulation of gene expression holds huge promise for biomedical research. In a series of recent publications (Gilbert et al., 2014; Konermann et al., 2015; Zalatan et al., 2015), sophisticated, multiplex-compatible transcriptional activator systems based on the CRISPR-Cas9 technology and genome-scale libraries advance the field toward whole-transcriptome control. PMID:26000839

  12. Comprehensive Genome-Wide Transcriptomic Analysis of Immature Articular Cartilage following Ischemic Osteonecrosis of the Femoral Head in Piglets

    PubMed Central

    Adapala, Naga Suresh; Kim, Harry K. W.

    2016-01-01

    Objective: Ischemic osteonecrosis of the femoral head (ONFH) in piglets results in an ischemic injury to the immature articular cartilage. The molecular changes in the articular cartilage in response to ONFH have not been investigated using a transcriptomic approach. The purpose of this study was to perform a genome-wide transcriptomic analysis to identify genes that are upregulated in the immature articular cartilage following ONFH. Methods: ONFH was induced in the right femoral head of 6-week-old piglets. The unoperated femoral head was used as the normal control. At 24 hours (acute ischemic-hypoxic injury), 2 weeks (avascular necrosis in the femoral head) and 4 weeks (early repair) after surgery (n = 4 piglets/time point), RNA was isolated from the articular cartilage of the femoral head. A microarray analysis was performed using Affymetrix Porcine GeneChip Array. An enrichment analysis and functional clustering of the genes upregulated due to ONFH were performed using DAVID and STRING software, respectively. The increased expression of selected genes was confirmed by a real-time qRT-PCR analysis. Results: Induction of ONFH resulted in the upregulation of 383 genes at 24 hours, 122 genes at 2 weeks and 124 genes at 4 weeks compared to the normal controls. At 24 hours, the genes involved in oxidoreductive, cell-survival, and angiogenic responses were significantly enriched among the upregulated genes. These genes were involved in HIF-1, PI3K-Akt, and MAPK signaling pathways. At 2 weeks, secretory and signaling proteins involved in angiogenic and inflammatory responses, PI3K-Akt and matrix-remodeling pathways were significantly enriched. At 4 weeks, genes that represent inflammatory cytokines and chemokine signaling pathways were significantly enriched. Several index genes (genes that are upregulated at more than one time point following ONFH and are known to be important in various biological processes) including HIF-1A, VEGFA, IL-6, IL6R, IL-8, CCL2, FGF2, TGFB2

  13. Fish-T1K (Transcriptomes of 1,000 Fishes) Project: large-scale transcriptome data for fish evolution studies.

    PubMed

    Sun, Ying; Huang, Yu; Li, Xiaofeng; Baldwin, Carole C; Zhou, Zhuocheng; Yan, Zhixiang; Crandall, Keith A; Zhang, Yong; Zhao, Xiaomeng; Wang, Min; Wong, Alex; Fang, Chao; Zhang, Xinhui; Huang, Hai; Lopez, Jose V; Kilfoyle, Kirk; Zhang, Yong; Ortí, Guillermo; Venkatesh, Byrappa; Shi, Qiong

    2016-01-01

    Ray-finned fishes (Actinopterygii) represent more than 50% of extant vertebrates and are of great evolutionary, ecologic and economic significance, but they are relatively underrepresented in 'omics studies. Increased availability of transcriptome data for these species will allow researchers to better understand changes in gene expression, and to carry out functional analyses. An international project known as the "Transcriptomes of 1,000 Fishes" (Fish-T1K) project has been established to generate RNA-seq transcriptome sequences for 1,000 diverse species of ray-finned fishes. The first phase of this project has produced transcriptomes from more than 180 ray-finned fishes, representing 142 species and covering 51 orders and 109 families. Here we provide an overview of the goals of this project and the work done so far. PMID:27144000

  15. Using EMOTE to map the exact 5'-ends of processed RNA on a transcriptome-wide scale.

    PubMed

    Redder, Peter

    2015-01-01

    The presence or absence of structure in an RNA is often crucial to its function. This is evident for highly structured RNAs such as rRNA, tRNA, or riboswitches, but it is also the case for many mRNAs, where secondary structures in the 5' or 3' UTR can determine the efficiency of translation or the half-life of the RNA. There are three paths to modify such secondary structures: (1) by the action of a helicase that allows an alternative RNA structure to form, (2) by the formation of a duplex with another RNA, or (3) by cleavage of the RNA in a way that favors a different secondary structure. None of the three excludes the others, and in vivo it is common that two or all three work together to remodel an RNA to the desired form. However, while the first two solutions can be reversible, the cleavage of RNA is final, and there is no chance to go back. In this chapter, a method for tracking the 5' end created by RNA processing on a transcriptome-wide scale is presented. The Exact Mapping Of Transcriptome Ends (EMOTE) allows the large-scale identification of mono-phosphorylated RNA 5'-ends and provides the exact processing sites.

  16. Genome-wide genetic and transcriptomic investigation of variation in antibody response to dietary antigens.

    PubMed

    Rubicz, Rohina; Yolken, Robert; Alaedini, Armin; Drigalenko, Eugene; Charlesworth, Jac C; Carless, Melanie A; Severance, Emily G; Krivogorsky, Bogdana; Dyer, Thomas D; Kent, Jack W; Curran, Joanne E; Johnson, Matthew P; Cole, Shelley A; Almasy, Laura; Moses, Eric K; Blangero, John; Göring, Harald H H

    2014-07-01

    Increased immunoglobulin G (IgG) response to dietary antigens can be associated with gastrointestinal dysfunction and autoimmunity. The underlying processes contributing to these adverse reactions remain largely unknown, and it is likely that genetic factors play a role. Here, we estimate heritability and attempt to localize genetic factors influencing IgG antibody levels against food-derived antigens using an integrative genomics approach. IgG antibody levels were determined by ELISA in >1,300 Mexican Americans for the following food antigens: wheat gliadin; bovine casein; and two forms of bovine serum albumin (BSA-a and BSA-b). Pedigree-based variance components methods were used to estimate additive genetic heritability (h²), perform genome-wide association analyses, and identify transcriptional signatures (based on 19,858 transcripts from peripheral blood lymphocytes). Heritability estimates were significant for all traits (0.15-0.53), and shared environment (based on shared residency among study participants) was significant for casein (0.09) and BSA-a (0.33). Genome-wide significant evidence of association was obtained only for antibody to gliadin (P = 8.57 × 10⁻⁸), mapping to the human leukocyte antigen II region, with HLA-DRA and BTNL2 as the best candidate genes. Lack of association of known celiac disease risk alleles HLA-DQ2.5 and -DQ8 with antigliadin antibodies in the studied population suggests a separate genetic etiology. Significant transcriptional signatures were found for all IgG levels except BSA-b. These results demonstrate that individual genetic differences contribute to food antigen antibody measures in this population. Further investigations may elucidate the underlying immunological processes involved.
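
    As a reading aid for the heritability and shared-environment figures above, a standard variance-components decomposition is sketched below. The abstract does not spell out the model, so the symbols are assumptions: σ²_P is the total phenotypic variance, σ²_A the additive genetic variance, σ²_C the shared-environment variance, and σ²_E the residual variance.

```latex
\[
\sigma_P^2 = \sigma_A^2 + \sigma_C^2 + \sigma_E^2,
\qquad
h^2 = \frac{\sigma_A^2}{\sigma_P^2},
\qquad
c^2 = \frac{\sigma_C^2}{\sigma_P^2}
\]
```

    Under this reading, a reported heritability of 0.15 would mean that about 15% of the trait variance is attributed to additive genetic effects, and the shared-environment estimate of 0.09 for casein would correspond to c².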

  17. The Dynamic Genome and Transcriptome of the Human Fungal Pathogen Blastomyces and Close Relative Emmonsia

    PubMed Central

    Gallo, Juan E.; Holder, Jason; Sullivan, Thomas D.; Marty, Amber J.; Carmen, John C.; Chen, Zehua; Ding, Li; Gujja, Sharvari; Magrini, Vincent; Misas, Elizabeth; Mitreva, Makedonka; Priest, Margaret; Saif, Sakina; Whiston, Emily A.; Young, Sarah; Zeng, Qiandong; Goldman, William E.; Mardis, Elaine R.; Taylor, John W.; McEwen, Juan G.; Clay, Oliver K.; Klein, Bruce S.; Cuomo, Christina A.

    2015-01-01

    Three closely related thermally dimorphic pathogens are causal agents of major fungal diseases affecting humans in the Americas: blastomycosis, histoplasmosis and paracoccidioidomycosis. Here we report the genome sequence and analysis of four strains of the etiological agent of blastomycosis, Blastomyces, and two species of the related genus Emmonsia, typically pathogens of small mammals. Compared to related species, Blastomyces genomes are highly expanded, with long, often sharply demarcated tracts of low GC-content sequence. These GC-poor isochore-like regions are enriched for gypsy elements, are variable in total size between isolates, and are least expanded in the avirulent B. dermatitidis strain ER-3 as compared with the virulent B. gilchristii strain SLH14081. The lack of similar regions in related species suggests these isochore-like regions originated recently in the ancestor of the Blastomyces lineage. While gene content is highly conserved between Blastomyces and related fungi, we identified changes in copy number of genes potentially involved in host interaction, including proteases and characterized antigens. In addition, we studied gene expression changes of B. dermatitidis during the interaction of the infectious yeast form with macrophages and in a mouse model. Both experiments highlight a strong antioxidant defense response in Blastomyces, and upregulation of dioxygenases in vivo suggests that dioxide produced by antioxidants may be further utilized for amino acid metabolism. We identify a number of functional categories upregulated exclusively in vivo, such as secreted proteins, zinc acquisition proteins, and cysteine and tryptophan metabolism, which may include critical virulence factors missed before in in vitro studies. Across the dimorphic fungi, loss of certain zinc acquisition genes and differences in amino acid metabolism suggest unique adaptations of Blastomyces to its host environment. These results reveal the dynamics of genome evolution

  18. Genome-wide genetic and transcriptomic investigation of variation in antibody response to dietary antigens

    PubMed Central

    Rubicz, Rohina; Yolken, Robert; Alaedini, Armin; Drigalenko, Eugene; Charlesworth, Jac C.; Carless, Melanie A.; Severance, Emily G.; Krivogorsky, Bogdana; Dyer, Thomas D.; Kent, Jack W.; Curran, Joanne E.; Johnson, Matthew P.; Cole, Shelley A.; Almasy, Laura; Moses, Eric K.; Blangero, John; Göring, Harald H.H.

    2014-01-01

    Increased immunoglobulin G (IgG) response to dietary antigens can be associated with gastrointestinal dysfunction and autoimmunity. The underlying processes contributing to these adverse reactions remain largely unknown, and it is likely that genetic factors play a role. Here we estimate heritability and attempt to localize genetic factors influencing IgG antibody levels against food-derived antigens using an integrative genomics approach. IgG antibody levels were determined by ELISA in >1300 Mexican Americans for the following food antigens: wheat gliadin; bovine casein; and two forms of bovine serum albumin (BSA-a and BSA-b). Pedigree-based variance components methods were used to estimate additive genetic heritability (h²), perform genome-wide association analyses, and identify transcriptional signatures (based on 19,858 transcripts from peripheral blood lymphocytes). Heritability estimates were significant for all traits (0.15-0.53), and shared environment (based on shared residency among study participants) was significant for casein (0.09) and BSA-a (0.33). Genome-wide significant evidence of association was obtained only for antibody to gliadin (p = 8.57 × 10⁻⁸), mapping to the human leukocyte antigen II region, with HLA-DRA and BTNL2 as the best candidate genes. Lack of association of known celiac disease risk alleles HLA-DQ2.5 and -DQ8 with anti-gliadin antibodies in the studied population suggests a separate genetic etiology. Significant transcriptional signatures were found for all IgG levels except BSA-b. These results demonstrate that individual genetic differences contribute to food antigen antibody measures in this population. Further investigations may elucidate the underlying immunological processes involved. PMID:24962563

  19. Genome and transcriptome analysis of the grapevine (Vitis vinifera L.) WRKY gene family

    PubMed Central

    Wang, Min; Vannozzi, Alessandro; Wang, Gang; Liang, Ying-Hai; Tornielli, Giovanni Battista; Zenoni, Sara; Cavallini, Erika; Pezzotti, Mario; Cheng, Zong-Ming (Max)

    2014-01-01

    The plant WRKY gene family represents an ancient and complex class of zinc-finger transcription factors (TFs) that are involved in the regulation of various physiological processes, such as development and senescence, and in plant response to many biotic and abiotic stresses. Despite the growing number of studies on the genomic organisation of the WRKY gene family in different species, little information is available about this family in grapevine (Vitis vinifera L.). In the present study, a total of 59 putative grapevine WRKY transcription factors (VvWRKYs) were identified based on the analysis of various genomic and proteomic grapevine databases. According to their structural and phylogenetic features, the identified grapevine WRKY transcription factors were classified into three main groups. In order to shed light on their regulatory roles in growth and development as well as in response to biotic and abiotic stress in grapevine, the VvWRKY expression profiles were examined in publicly available microarray data. Bioinformatics analysis of these data revealed distinct temporal and spatial expression patterns of VvWRKYs in various tissues, organs and developmental stages, as well as in response to biotic and abiotic stresses. To extend our analysis to situations not covered by the arrays and to validate our results, the expression profiles of selected VvWRKYs in response to drought stress, Erysiphe necator (powdery mildew) infection, and hormone treatments (salicylic acid and ethylene) were investigated by quantitative real-time reverse transcription PCR (qRT-PCR). The present study provides a foundation for further comparative genomics and functional studies of this important class of transcriptional regulators in grapevine. PMID:26504535

  20. Beyond the transcriptome: completion of act one of the Immunological Genome Project.

    PubMed

    Kim, Charles C; Lanier, Lewis L

    2013-10-01

    The Immunological Genome Consortium has generated a public resource (www.immgen.org) that provides a compendium of gene expression profiles of ∼270 leukocyte subsets in the mouse. This effort established carefully standardized operating procedures that resulted in a transcriptional dataset of unprecedented comprehensiveness and quality. The findings have been detailed recently in a series of publications providing molecular insights into the development, heterogeneity, and/or function of these cellular lineages and distinct subpopulations. Here, we review the key findings of these studies, highlighting what has been gained and how the knowledge can be used to accelerate progress toward a comprehensive understanding of the immune system.

  1. The somatic mutation profiles of 2,433 breast cancers refines their genomic and transcriptomic landscapes

    PubMed Central

    Pereira, Bernard; Chin, Suet-Feung; Rueda, Oscar M.; Vollan, Hans-Kristian Moen; Provenzano, Elena; Bardwell, Helen A.; Pugh, Michelle; Jones, Linda; Russell, Roslin; Sammut, Stephen-John; Tsui, Dana W. Y.; Liu, Bin; Dawson, Sarah-Jane; Abraham, Jean; Northen, Helen; Peden, John F.; Mukherjee, Abhik; Turashvili, Gulisa; Green, Andrew R.; McKinney, Steve; Oloumi, Arusha; Shah, Sohrab; Rosenfeld, Nitzan; Murphy, Leigh; Bentley, David R.; Ellis, Ian O.; Purushotham, Arnie; Pinder, Sarah E.; Børresen-Dale, Anne-Lise; Earl, Helena M.; Pharoah, Paul D.; Ross, Mark T.; Aparicio, Samuel; Caldas, Carlos

    2016-01-01

    The genomic landscape of breast cancer is complex, and inter- and intra-tumour heterogeneity are important challenges in treating the disease. In this study, we sequence 173 genes in 2,433 primary breast tumours that have copy number aberration (CNA), gene expression and long-term clinical follow-up data. We identify 40 mutation-driver (Mut-driver) genes, and determine associations between mutations, driver CNA profiles, clinical-pathological parameters and survival. We assess the clonal states of Mut-driver mutations, and estimate levels of intra-tumour heterogeneity using mutant-allele fractions. Associations between PIK3CA mutations and reduced survival are identified in three subgroups of ER-positive cancer (defined by amplification of 17q23, 11q13–14 or 8q24). High levels of intra-tumour heterogeneity are in general associated with a worse outcome, but highly aggressive tumours with 11q13–14 amplification have low levels of intra-tumour heterogeneity. These results emphasize the importance of genome-based stratification of breast cancer, and have important implications for designing therapeutic strategies. PMID:27161491
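
    The abstract mentions estimating intra-tumour heterogeneity from mutant-allele fractions but gives no formula. A minimal sketch follows, assuming the common definition of a mutant-allele fraction as the share of reads at a position that carry the mutant allele; the function name and the read counts in the usage note are hypothetical, and the study's own heterogeneity measure is not reproduced.

```python
def mutant_allele_fraction(alt_reads, ref_reads):
    """Fraction of reads at one position that support the mutant allele.

    Assumed definition: alt / (alt + ref). Not taken from the paper.
    """
    if alt_reads < 0 or ref_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total
```

    With 30 mutant and 70 reference reads, for instance, the fraction is 30 / 100.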

  2. Tissue-Specific Transcriptomic Profiling of Sorghum propinquum using a Rice Genome Array

    PubMed Central

    Zhang, Ting; Zhao, Xiuqin; Huang, Liyu; Liu, Xiaoyue; Zong, Ying; Zhu, Linghua; Yang, Daichang; Fu, Binying

    2013-01-01

    Sorghum (Sorghum bicolor) is one of the world's most important cereal crops. S. propinquum is a perennial wild relative of S. bicolor with well-developed rhizomes. Functional genomics analysis of S. propinquum, especially with respect to molecular mechanisms related to rhizome growth and development, can contribute to the development of more sustainable grain, forage, and bioenergy cropping systems. In this study, we used a whole rice genome oligonucleotide microarray to obtain tissue-specific gene expression profiles of S. propinquum with special emphasis on rhizome development. A total of 548 tissue-enriched genes were detected, including 31 and 114 unique genes that were expressed predominantly in the rhizome tips (RT) and internodes (RI), respectively. Further GO analysis indicated that the functions of these tissue-enriched genes corresponded to their characteristic biological processes. A few distinct cis-elements, including ABA-responsive RY repeat CATGCA, sugar-repressive TTATCC, and GA-responsive TAACAA, were found to be prevalent in RT-enriched genes, implying an important role in rhizome growth and development. Comprehensive comparative analysis of these rhizome-enriched genes and rhizome-specific genes previously identified in Oryza longistaminata and S. propinquum indicated that phytohormones, including ABA, GA, and SA, are key regulators of gene expression during rhizome development. Co-localization of rhizome-enriched genes with rhizome-related QTLs in rice and sorghum generated functional candidates for future cloning of genes associated with rhizome growth and development. PMID:23536906

  3. Genome-scale study reveals reduced metabolic adaptability in patients with non-alcoholic fatty liver disease.

    PubMed

    Hyötyläinen, Tuulia; Jerby, Livnat; Petäjä, Elina M; Mattila, Ismo; Jäntti, Sirkku; Auvinen, Petri; Gastaldelli, Amalia; Yki-Järvinen, Hannele; Ruppin, Eytan; Orešič, Matej

    2016-01-01

    Non-alcoholic fatty liver disease (NAFLD) is a major risk factor leading to chronic liver disease and type 2 diabetes. Here we chart liver metabolic activity and functionality in NAFLD by integrating global transcriptomic data, from human liver biopsies, and metabolic flux data, measured across the human splanchnic vascular bed, within a genome-scale model of human metabolism. We show that an increased amount of liver fat induces mitochondrial metabolism, lipolysis, glyceroneogenesis and a switch from lactate to glycerol as substrate for gluconeogenesis, indicating an intricate balance of exacerbated opposite metabolic processes in glycemic regulation. These changes were associated with reduced metabolic adaptability on a network level in the sense that liver fat accumulation puts increasing demands on the liver to adaptively regulate metabolic responses to maintain basic liver functions. We propose that failure to meet excessive metabolic challenges coupled with reduced metabolic adaptability may lead to a vicious pathogenic cycle leading to the co-morbidities of NAFLD. PMID:26839171

  4. Genome-scale study reveals reduced metabolic adaptability in patients with non-alcoholic fatty liver disease

    PubMed Central

    Hyötyläinen, Tuulia; Jerby, Livnat; Petäjä, Elina M.; Mattila, Ismo; Jäntti, Sirkku; Auvinen, Petri; Gastaldelli, Amalia; Yki-Järvinen, Hannele; Ruppin, Eytan; Orešič, Matej

    2016-01-01

    Non-alcoholic fatty liver disease (NAFLD) is a major risk factor leading to chronic liver disease and type 2 diabetes. Here we chart liver metabolic activity and functionality in NAFLD by integrating global transcriptomic data, from human liver biopsies, and metabolic flux data, measured across the human splanchnic vascular bed, within a genome-scale model of human metabolism. We show that an increased amount of liver fat induces mitochondrial metabolism, lipolysis, glyceroneogenesis and a switch from lactate to glycerol as substrate for gluconeogenesis, indicating an intricate balance of exacerbated opposite metabolic processes in glycemic regulation. These changes were associated with reduced metabolic adaptability on a network level in the sense that liver fat accumulation puts increasing demands on the liver to adaptively regulate metabolic responses to maintain basic liver functions. We propose that failure to meet excessive metabolic challenges coupled with reduced metabolic adaptability may lead to a vicious pathogenic cycle leading to the co-morbidities of NAFLD. PMID:26839171

  5. Human whole genome genotype and transcriptome data for Alzheimer’s and other neurodegenerative diseases

    PubMed Central

    Allen, Mariet; Carrasquillo, Minerva M.; Funk, Cory; Heavner, Benjamin D.; Zou, Fanggeng; Younkin, Curtis S.; Burgess, Jeremy D.; Chai, High-Seng; Crook, Julia; Eddy, James A.; Li, Hongdong; Logsdon, Ben; Peters, Mette A.; Dang, Kristen K.; Wang, Xue; Serie, Daniel; Wang, Chen; Nguyen, Thuy; Lincoln, Sarah; Malphrus, Kimberly; Bisceglio, Gina; Li, Ma; Golde, Todd E.; Mangravite, Lara M.; Asmann, Yan; Price, Nathan D.; Petersen, Ronald C.; Graff-Radford, Neill R.; Dickson, Dennis W.; Younkin, Steven G.; Ertekin-Taner, Nilüfer

    2016-01-01

    Previous genome-wide association studies (GWAS), conducted by our group and others, have identified loci that harbor risk variants for neurodegenerative diseases, including Alzheimer's disease (AD). Human disease variants are enriched for polymorphisms that affect gene expression, including some that are known to associate with expression changes in the brain. Postulating that many variants confer risk to neurodegenerative disease via transcriptional regulatory mechanisms, we have analyzed gene expression levels in the brain tissue of subjects with AD and related diseases. Herein, we describe our collective datasets comprised of GWAS data from 2,099 subjects; microarray gene expression data from 773 brain samples, 186 of which also have RNAseq; and an independent cohort of 556 brain samples with RNAseq. We expect that these datasets, which are available to all qualified researchers, will enable investigators to explore and identify transcriptional mechanisms contributing to neurodegenerative diseases. PMID:27727239

  6. Comparison of the Mitochondrial Genomes and Steady State Transcriptomes of Two Strains of the Trypanosomatid Parasite, Leishmania tarentolae

    PubMed Central

    Simpson, Larry; Douglass, Stephen M.; Lake, James A.; Pellegrini, Matteo; Li, Feng

    2015-01-01

    U-insertion/deletion RNA editing is a post-transcriptional mitochondrial RNA modification phenomenon required for viability of trypanosomatid parasites. Small guide RNAs encoded mainly by the thousands of catenated minicircles contain the information for this editing. We analyzed by NGS technology the mitochondrial genomes and transcriptomes of two strains, the old lab UC strain and the recently isolated LEM125 strain. PacBio sequencing provided complete minicircle sequences which avoided the assembly problem of short reads caused by the conserved regions. Minicircles were identified by a characteristic size, the presence of three short conserved sequences, a region of inherently bent DNA and the presence of single gRNA genes at a fairly defined location. The LEM125 strain contained over 114 minicircles encoding different gRNAs and the UC strain only ~24 minicircles. Some LEM125 minicircles contained no identifiable gRNAs. Approximate copy numbers of the different minicircle classes in the network were determined by the number of PacBio CCS reads that assembled to each class. Mitochondrial RNA libraries from both strains were mapped against the minicircle and maxicircle sequences. Small RNA reads mapped to the putative gRNA genes but also to multiple regions outside the genes on both strands and large RNA reads mapped in many cases over almost the entire minicircle on both strands. These data suggest that minicircle transcription is complete and bidirectional, with 3’ processing yielding the mature gRNAs. Steady state RNAs in varying abundances are derived from all maxicircle genes, including portions of the repetitive divergent region. The relative extents of editing in both strains correlated with the presence of a cascade of cognate gRNAs. These data should provide the foundation for a deeper understanding of this dynamic genetic system as well as the evolutionary variation of editing in different strains. PMID:26204118

  7. Comparison of the Mitochondrial Genomes and Steady State Transcriptomes of Two Strains of the Trypanosomatid Parasite, Leishmania tarentolae.

    PubMed

    Simpson, Larry; Douglass, Stephen M; Lake, James A; Pellegrini, Matteo; Li, Feng

    2015-01-01

    U-insertion/deletion RNA editing is a post-transcriptional mitochondrial RNA modification phenomenon required for viability of trypanosomatid parasites. Small guide RNAs encoded mainly by the thousands of catenated minicircles contain the information for this editing. We analyzed by NGS technology the mitochondrial genomes and transcriptomes of two strains, the old lab UC strain and the recently isolated LEM125 strain. PacBio sequencing provided complete minicircle sequences which avoided the assembly problem of short reads caused by the conserved regions. Minicircles were identified by a characteristic size, the presence of three short conserved sequences, a region of inherently bent DNA and the presence of single gRNA genes at a fairly defined location. The LEM125 strain contained over 114 minicircles encoding different gRNAs and the UC strain only ~24 minicircles. Some LEM125 minicircles contained no identifiable gRNAs. Approximate copy numbers of the different minicircle classes in the network were determined by the number of PacBio CCS reads that assembled to each class. Mitochondrial RNA libraries from both strains were mapped against the minicircle and maxicircle sequences. Small RNA reads mapped to the putative gRNA genes but also to multiple regions outside the genes on both strands and large RNA reads mapped in many cases over almost the entire minicircle on both strands. These data suggest that minicircle transcription is complete and bidirectional, with 3' processing yielding the mature gRNAs. Steady state RNAs in varying abundances are derived from all maxicircle genes, including portions of the repetitive divergent region. The relative extents of editing in both strains correlated with the presence of a cascade of cognate gRNAs. These data should provide the foundation for a deeper understanding of this dynamic genetic system as well as the evolutionary variation of editing in different strains.

  8. Transcriptomics and functional genomics of ROS-induced cell death regulation by RADICAL-INDUCED CELL DEATH1.

    PubMed

    Brosché, Mikael; Blomster, Tiina; Salojärvi, Jarkko; Cui, Fuqiang; Sipari, Nina; Leppälä, Johanna; Lamminmäki, Airi; Tomai, Gloria; Narayanasamy, Shaman; Reddy, Ramesha A; Keinänen, Markku; Overmyer, Kirk; Kangasjärvi, Jaakko

    2014-02-01

    Plant responses to changes in environmental conditions are mediated by a network of signaling events leading to downstream responses, including changes in gene expression and activation of cell death programs. Arabidopsis thaliana RADICAL-INDUCED CELL DEATH1 (RCD1) has been proposed to regulate plant stress responses by protein-protein interactions with transcription factors. Furthermore, the rcd1 mutant has defective control of cell death in response to apoplastic reactive oxygen species (ROS). Combining transcriptomic and functional genomics approaches we first used microarray analysis in a time series to study changes in gene expression after apoplastic ROS treatment in rcd1. To identify a core set of cell death regulated genes, RCD1-regulated genes were clustered together with other array experiments from plants undergoing cell death or treated with various pathogens, plant hormones or other chemicals. Subsequently, selected rcd1 double mutants were constructed to further define the genetic requirements for the execution of apoplastic ROS induced cell death. Through the genetic analysis we identified WRKY70 and SGT1b as cell death regulators functioning downstream of RCD1 and show that quantitative rather than qualitative differences in gene expression related to cell death appeared to better explain the outcome. Allocation of plant energy to defenses diverts resources from growth. Recently, a plant response termed stress-induced morphogenic response (SIMR) was proposed to regulate the balance between defense and growth. Using a rcd1 double mutant collection we show that SIMR is mostly independent of the classical plant defense signaling pathways and that the redox balance is involved in development of SIMR. PMID:24550736

  9. RNA-Seq Analysis of Cocos nucifera: Transcriptome Sequencing and De Novo Assembly for Subsequent Functional Genomics Approaches

    PubMed Central

    Xia, Wei; Mason, Annaliese S.; Xia, Zhihui; Qiao, Fei; Zhao, Songlin; Tang, Haoru

    2013-01-01

    Background: Cocos nucifera (coconut), a member of the Arecaceae family, is an economically important woody palm grown in tropical regions. Despite its agronomic importance, previous germplasm assessment studies have relied solely on morphological and agronomical traits. Molecular biology techniques have been scarcely used in assessment of genetic resources and for improvement of important agronomic and quality traits in Cocos nucifera, mostly due to the absence of available sequence information. Methodology/Principal Findings: To provide basic information for molecular breeding and further molecular biological analysis in Cocos nucifera, we applied RNA-seq technology and de novo assembly to gain a global overview of the Cocos nucifera transcriptome from mixed tissue samples. Using Illumina sequencing, we obtained 54.9 million short reads and conducted de novo assembly to obtain 57,304 unigenes with an average length of 752 base pairs. Sequence comparison between assembled unigenes and released cDNA sequences of Cocos nucifera and Elaeis guineensis indicated that the assembled sequences were of high quality. Approximately 99.9% of unigenes were novel compared to the released coconut EST sequences. Using BLASTX, 68.2% of unigenes were successfully annotated based on the GenBank non-redundant (Nr) protein database. The annotated unigenes were then further classified using the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Conclusions/Significance: Our study provides a large quantity of novel genetic information for Cocos nucifera. This information will act as a valuable resource for further molecular genetic studies and breeding in coconut, as well as for isolation and characterization of functional genes involved in different biochemical pathways in this important tropical crop species. PMID:23555859
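
    Summary figures such as the unigene count and average length quoted above are typically derived from the list of assembled sequence lengths. The sketch below shows one way to compute them, along with N50, which the abstract does not report and which is included only as a commonly used companion statistic. The function name and the toy lengths in the usage note are hypothetical.

```python
def assembly_summary(lengths):
    """Summarise assembled sequence lengths (in base pairs).

    Returns the count, total length, mean length and N50, where N50 is the
    length L such that sequences of length >= L hold at least half of all bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if running * 2 >= total:  # reached half of the assembled bases
            n50 = length
            break
    return {
        "count": len(ordered),
        "total_bp": total,
        "mean_bp": total / len(ordered),
        "n50_bp": n50,
    }
```

    For the toy input `[100, 200, 300, 400]`, the mean length is 250 bp and the N50 is 300 bp.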

  10. Understanding PRRSV Infection in Porcine Lung Based on Genome-Wide Transcriptome Response Identified by Deep Sequencing

    PubMed Central

    Xiao, Shuqi; Jia, Jianyu; Mo, Delin; Wang, Qiwei; Qin, Limei; He, Zuyong; Zhao, Xiao; Huang, Yuankai; Li, Anning; Yu, Jingwei; Niu, Yuna; Liu, Xiaohong; Chen, Yaosheng

    2010-01-01

    Porcine reproductive and respiratory syndrome (PRRS) has been one of the most economically important diseases affecting the swine industry worldwide and causes great economic losses each year. PRRS virus (PRRSV) replicates mainly in porcine alveolar macrophages (PAMs) and dendritic cells (DCs) and develops persistent infections, antibody-dependent enhancement (ADE), interstitial pneumonia and immunosuppression. However, the molecular mechanisms of PRRSV infection are still poorly understood. Here we report on the first genome-wide host transcriptional responses to classical North American type PRRSV (N-PRRSV) strain CH 1a infection using Solexa/Illumina's digital gene expression (DGE) system, a tag-based high-throughput transcriptome sequencing method, and systematically analyse the relationship between pulmonary gene expression profiles after N-PRRSV infection and infection pathology. Our results suggest that N-PRRSV appeared to utilize multiple strategies for its replication and spread in infected pigs, including subverting the host innate immune response, inducing an anti-apoptotic and anti-inflammatory state, and developing ADE. Upregulated expression of virus-induced pro-inflammatory cytokines, chemokines, adhesion molecules and inflammatory enzymes, together with inflammatory cells, antibodies and complement activation, was likely to result in the development of inflammatory responses during N-PRRSV infection. N-PRRSV-induced immunosuppression might be mediated by apoptosis of infected cells, which caused depletion of immune cells and induced an anti-inflammatory cytokine response in which they were unable to eradicate the primary infection. Our systems analysis will aid in better understanding the molecular pathogenesis of N-PRRSV infection, developing novel antiviral therapies and identifying genetic components for swine resistance/susceptibility to PRRS. PMID:20614006

  11. Transcriptomics and Functional Genomics of ROS-Induced Cell Death Regulation by RADICAL-INDUCED CELL DEATH1

    PubMed Central

    Salojärvi, Jarkko; Cui, Fuqiang; Sipari, Nina; Leppälä, Johanna; Lamminmäki, Airi; Tomai, Gloria; Narayanasamy, Shaman; Reddy, Ramesha A.; Keinänen, Markku; Overmyer, Kirk; Kangasjärvi, Jaakko

    2014-01-01

    Plant responses to changes in environmental conditions are mediated by a network of signaling events leading to downstream responses, including changes in gene expression and activation of cell death programs. Arabidopsis thaliana RADICAL-INDUCED CELL DEATH1 (RCD1) has been proposed to regulate plant stress responses by protein-protein interactions with transcription factors. Furthermore, the rcd1 mutant has defective control of cell death in response to apoplastic reactive oxygen species (ROS). Combining transcriptomic and functional genomics approaches we first used microarray analysis in a time series to study changes in gene expression after apoplastic ROS treatment in rcd1. To identify a core set of cell death regulated genes, RCD1-regulated genes were clustered together with other array experiments from plants undergoing cell death or treated with various pathogens, plant hormones or other chemicals. Subsequently, selected rcd1 double mutants were constructed to further define the genetic requirements for the execution of apoplastic ROS induced cell death. Through the genetic analysis we identified WRKY70 and SGT1b as cell death regulators functioning downstream of RCD1 and show that quantitative rather than qualitative differences in gene expression related to cell death appeared to better explain the outcome. Allocation of plant energy to defenses diverts resources from growth. Recently, a plant response termed stress-induced morphogenic response (SIMR) was proposed to regulate the balance between defense and growth. Using a rcd1 double mutant collection we show that SIMR is mostly independent of the classical plant defense signaling pathways and that the redox balance is involved in development of SIMR. PMID:24550736

  12. Flux Coupling Analysis of Genome-Scale Metabolic Network Reconstructions

    PubMed Central

    Burgard, Anthony P.; Nikolaev, Evgeni V.; Schilling, Christophe H.; Maranas, Costas D.

    2004-01-01

    In this paper, we introduce the Flux Coupling Finder (FCF) framework for elucidating the topological and flux connectivity features of genome-scale metabolic networks. The framework is demonstrated on genome-scale metabolic reconstructions of Helicobacter pylori, Escherichia coli, and Saccharomyces cerevisiae. The analysis allows one to determine whether any two metabolic fluxes, v1 and v2, are (1) directionally coupled, if a non-zero flux for v1 implies a non-zero flux for v2 but not necessarily the reverse; (2) partially coupled, if a non-zero flux for v1 implies a non-zero, though variable, flux for v2 and vice versa; or (3) fully coupled, if a non-zero flux for v1 implies not only a non-zero but also a fixed flux for v2 and vice versa. Flux coupling analysis also enables the global identification of blocked reactions, which are all reactions incapable of carrying flux under a certain condition; equivalent knockouts, defined as the set of all possible reactions whose deletion forces the flux through a particular reaction to zero; and sets of affected reactions denoting all reactions whose fluxes are forced to zero if a particular reaction is deleted. The FCF approach thus provides a novel and versatile tool for aiding metabolic reconstructions and guiding genetic manipulations. PMID:14718379
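    One way to read the three coupling definitions in the abstract above is in terms of bounds on the flux ratio v1/v2. The sketch below assumes that the minimum and maximum of that ratio over all admissible steady states have already been computed (for example by linear programming) and that both fluxes are non-negative. The function name, its inputs and the tolerance are illustrative assumptions, not part of the FCF software.

```python
import math


def classify_coupling(r_min, r_max, tol=1e-9):
    """Classify the coupling of fluxes v1 and v2 from bounds on v1/v2.

    r_min and r_max are assumed to be the minimum and maximum of the
    ratio v1/v2 over all admissible steady states (hypothetical inputs).
    Both fluxes are assumed to be non-negative.
    """
    if r_max <= tol:
        # The ratio can never exceed zero, so v1 carries no flux at all.
        return "v1 blocked"

    lower_is_zero = r_min <= tol
    upper_is_unbounded = math.isinf(r_max)

    if lower_is_zero and upper_is_unbounded:
        return "uncoupled"
    if lower_is_zero:
        # v1 <= c * v2: a non-zero v1 forces a non-zero v2.
        return "directionally coupled (v1 -> v2)"
    if upper_is_unbounded:
        # v1 >= c * v2: a non-zero v2 forces a non-zero v1.
        return "directionally coupled (v2 -> v1)"
    if abs(r_max - r_min) <= tol:
        # The ratio is pinned to a single constant.
        return "fully coupled"
    # The ratio is bounded on both sides but still variable.
    return "partially coupled"


if __name__ == "__main__":
    # Made-up ratio bounds, for illustration only.
    for bounds in [(0.0, math.inf), (0.0, 2.0), (0.5, 2.0), (1.5, 1.5)]:
        print(bounds, classify_coupling(*bounds))
```

    The classification depends only on which of the two bounds are finite and non-zero, which is why a pair of ratio bounds is enough to separate the three cases.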

  13. Predicting novel pathways in genome-scale metabolic networks.

    PubMed

    Schuster, Stefan; de Figueiredo, Luís F; Kaleta, Christoph

    2010-10-01

    Elementary-modes analysis has become a well-established theoretical tool in metabolic pathway analysis. It allows one to decompose complex metabolic networks into the smallest functional entities, which can be interpreted as biochemical pathways. This analysis has, in medium-size metabolic networks, led to the successful theoretical prediction of hitherto unknown pathways. For illustration, we discuss the example of the phosphoenolpyruvate-glyoxylate cycle in Escherichia coli. Elementary-modes analysis meets with the problem of combinatorial explosion in the number of pathways with increasing system size, which has hampered scaling it up to genome-wide models. We present a novel approach to overcoming this obstacle. That approach is based on elementary flux patterns, which are defined as sets of reactions representing the basic routes through a particular subsystem that are compatible with admissible fluxes in a (possibly) much larger metabolic network. The subsystem can be made up of reactions in which we are interested, for example, reactions producing a certain metabolite. This allows one to predict novel metabolic pathways in genome-scale networks.

  14. Omic data from evolved E. coli are consistent with computed optimal growth from genome-scale models

    SciTech Connect

    Lewis, Nathan E.; Hixson, Kim K.; Conrad, Tom M.; Lerman, Joshua A.; Charusanti, Pep; Polpitiya, Ashoka D.; Adkins, Joshua N.; Schramm, Gunnar; Purvine, Samuel O.; Lopez-Ferrer, Daniel; Weitz, Karl K.; Eils, Roland; Konig, Rainer; Smith, Richard D.; Palsson, Bernhard O.

    2010-07-27

    After hundreds of generations of mid log phase growth, Escherichia coli acquires a higher growth rate as predicted using flux balance analysis (FBA) on genome-scale metabolic models (GEMs). FBA solutions contain hundreds of variables that can be examined using omics methods. We report that 99% of active reactions from FBA optimal growth solutions are supported by transcriptomic and proteomic data. Moreover, when E. coli adapts to growth rate selective pressure, the resulting evolved strains reinforce the optimal growth predictions. Specifically, through constraint-based analysis of the proteomic and transcriptomic data, we find: 1) selective pressure for the predicted optimal growth states and a minimization of network flux; 2) suppression of genes outside of the optimal growth solutions; and 3) a trend towards usage of more efficient metabolic pathways. For processes not in GEMs, we find 4) an increase in the transcription/translation machinery and stringent response suppression, and 5) that established regulons are significantly down-regulated. Thus, differential expression supports observed growth phenotype changes, and observed expression in evolved strains is consistent with GEM computed optimal growth states.

  15. SeqFold: genome-scale reconstruction of RNA secondary structure integrating high-throughput sequencing data.

    PubMed

    Ouyang, Zhengqing; Snyder, Michael P; Chang, Howard Y

    2013-02-01

    We present an integrative approach, SeqFold, that combines high-throughput RNA structure profiling data with computational prediction for genome-scale reconstruction of RNA secondary structures. SeqFold transforms experimental RNA structure information into a structure preference profile (SPP) and uses it to select stable RNA structure candidates representing the structure ensemble. Under a high-dimensional classification framework, SeqFold efficiently matches a given SPP to the most likely cluster of structures sampled from the Boltzmann-weighted ensemble. SeqFold is able to incorporate diverse types of RNA structure profiling data, including parallel analysis of RNA structure (PARS), selective 2'-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq), fragmentation sequencing (FragSeq) data generated by deep sequencing, and conventional SHAPE data. Using the known structures of a wide range of mRNAs and noncoding RNAs as benchmarks, we demonstrate that SeqFold outperforms or matches existing approaches in accuracy and is more robust to noise in experimental data. Application of SeqFold to reconstruct the secondary structures of the yeast transcriptome reveals the diverse impact of RNA secondary structure on gene regulation, including translation efficiency, transcription initiation, and protein-RNA interactions. SeqFold can be easily adapted to incorporate any new types of high-throughput RNA structure profiling data and is widely applicable to analyze RNA structures in any transcriptome.

  16. Transcriptomic and genomic evidence for Streptococcus agalactiae adaptation to the bovine environment

    PubMed Central

    2013-01-01

    Background: Streptococcus agalactiae is a major cause of bovine mastitis, which is the dominant health disorder affecting milk production within the dairy industry and is responsible for substantial financial losses to the industry worldwide. However, there is considerable evidence for host adaptation (ecotypes) within S. agalactiae, with both bovine and human sourced isolates showing a high degree of distinctiveness, suggesting differing ability to cause mastitis. Here, we (i) generate RNAseq data from three S. agalactiae isolates (two putative bovine adapted and one human) and (ii) compare publicly available whole genome shotgun sequence data from an additional 202 isolates, obtained from six host species, to elucidate possible genetic factors/adaptations likely important for S. agalactiae growth and survival in the bovine mammary gland. Results: Tests for differential expression showed distinct expression profiles for the three isolates when grown in bovine milk. A key finding for the two putatively bovine adapted isolates was the up regulation of a lactose metabolism operon (Lac.2) that was strongly correlated with the bovine environment (all 36 bovine sourced isolates on GenBank possessed the operon, in contrast to only 8/151 human sourced isolates). Multi locus sequence typing of all genome sequences and phylogenetic analysis using conserved operon genes from 44 S. agalactiae isolates and 16 additional Streptococcus species provided strong evidence for acquisition of the operon via multiple lateral gene transfer events, with all Streptococcus species known to be major causes of mastitis, identified as possible donors. Furthermore, lactose fermentation tests were only positive for isolates possessing Lac.2. Combined, these findings suggest that lactose metabolism is likely an important adaptation to the bovine environment. Additional up regulation in the bovine adapted isolates included genes involved in copper homeostasis, metabolism of purine, pyrimidine

  17. Transcriptomic and genomic analysis of cellulose fermentation by Clostridium thermocellum ATCC 27405

    SciTech Connect

    Raman, Babu; McKeown, Catherine K; Rodriguez, Jr., Miguel; Brown, Steven D; Mielenz, Jonathan R

    2011-01-01

    The ability of Clostridium thermocellum ATCC 27405 wild-type strain to hydrolyze cellulose and ferment the degradation products directly to ethanol and other metabolic byproducts makes it an attractive candidate for consolidated bioprocessing of cellulosic biomass to biofuels. In this study, whole-genome microarrays were used to investigate the expression of C. thermocellum mRNA during growth on crystalline cellulose in controlled replicate batch fermentations. A time-series analysis of gene expression revealed changes in transcript levels of ~40% of genes (~1300 out of 3198 ORFs encoded in the genome) during transition from early-exponential to late-stationary phase. K-means clustering of genes with statistically significant changes in transcript levels identified six distinct clusters of temporal expression. Broadly, genes involved in energy production, translation, glycolysis and amino acid, nucleotide and coenzyme metabolism displayed a decreasing trend in gene expression as cells entered stationary phase. In comparison, genes involved in cell structure and motility, chemotaxis, signal transduction and transcription showed an increasing trend in gene expression. Hierarchical clustering of cellulosome-related genes highlighted temporal changes in composition of this multi-enzyme complex during batch growth on crystalline cellulose, with increased expression of several genes encoding hydrolytic enzymes involved in degradation of non-cellulosic substrates in stationary phase. Overall, the results suggest that under low substrate availability, growth slows due to decreased metabolic potential and C. thermocellum alters its gene expression to (i) modulate the composition of cellulosomes that are released into the environment with an increased proportion of enzymes that can efficiently degrade plant polysaccharides other than cellulose, (ii) enhance signal transduction and chemotaxis mechanisms perhaps to sense the oligosaccharide hydrolysis products

  18. De novo transcriptome sequencing facilitates genomic resource generation in Tinospora cordifolia.

    PubMed

    Singh, Rakesh; Kumar, Rajesh; Mahato, Ajay Kumar; Paliwal, Ritu; Singh, Amit Kumar; Kumar, Sundeep; Marla, Soma S; Kumar, Ashok; Singh, Nagendra K

    2016-09-01

    Tinospora cordifolia is known for its medicinal properties owing to the presence of useful constituents of secondary metabolic origin such as terpenes, glycosides, steroids, alkaloids, and flavonoids. However, there is little information available pertaining to critical genomic elements (ESTs, molecular markers) necessary for judicious exploitation of its germplasm. We employed 454 GS-FLX pyrosequencing of entire transcripts, and altogether ∼25 K assembled transcripts or expressed sequence tags (ESTs) were identified. As the interest in T. cordifolia is primarily due to its secondary metabolite constituents, the ESTs pertaining to the terpenoid biosynthetic pathway were identified in the present study. Additionally, several ESTs were assigned to different transcription factor families. To validate our transcript dataset, novel EST-SSR markers were generated to assess the genetic diversity among germplasm of T. cordifolia. These EST-SSR markers were found to be polymorphic, and the dendrogram based on the Dice similarity index revealed three distinct clusters of accessions. The present study demonstrates the effectiveness of using both NEWBLER and MIRA sequence read assembler software for enriching the transcript dataset and thus enables better exploitation of EST resources for mining candidate genes and designing molecular markers. PMID:27465295
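    The Dice similarity index mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each accession is scored as a presence/absence (1/0) profile across the same set of marker bands; the function name and the example profiles are invented for illustration and are not data from the study.

```python
def dice_similarity(profile_a, profile_b):
    """Dice similarity between two presence/absence (1/0) marker profiles.

    Computed as 2a / (2a + b + c), where a counts bands present in both
    profiles and b, c count bands present in only one of them.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of markers")
    shared = sum(1 for x, y in zip(profile_a, profile_b) if x and y)
    present_total = sum(1 for x in profile_a if x) + sum(1 for y in profile_b if y)
    # Two profiles with no bands at all are treated as having zero similarity.
    return 2 * shared / present_total if present_total else 0.0


if __name__ == "__main__":
    # Hypothetical band scores for two accessions across four markers.
    accession_1 = [1, 1, 0, 1]
    accession_2 = [1, 0, 0, 1]
    print(dice_similarity(accession_1, accession_2))  # 2*2 / (3+2) = 0.8
```

    A dendrogram such as the one described would typically be built from the pairwise matrix of these values, converted to distances as 1 minus the similarity.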

  19. Transcriptomics and adaptive genomics of the asymptomatic bacteriuria Escherichia coli strain 83972

    PubMed Central

    Hancock, Viktoria; Seshasayee, Aswin S.; Ussery, David W.; Luscombe, Nicholas M.

    2008-01-01

    Escherichia coli strains are the major cause of urinary tract infections in humans. Such strains can be divided into virulent, UPEC strains causing symptomatic infections, and asymptomatic, commensal-like strains causing asymptomatic bacteriuria, ABU. The best-characterized ABU strain is strain 83972. Global gene expression profiling of strain 83972 has been carried out under seven different sets of environmental conditions ranging from laboratory minimal medium to human bladders. The data reveal highly specific gene expression responses to different conditions. A number of potential fitness factors for the human urinary tract could be identified. Also, presence/absence data of the gene expression was used as an adaptive genomics tool to model the gene pool of 83972 using primarily UPEC strain CFT073 as a scaffold. In our analysis, 96% of the transcripts filtered present in strain 83972 can be found in CFT073, and genes on six of the seven pathogenicity islands were expressed in 83972. Despite the very different patient symptom profiles, the two strains seem to be very similar. Genes expressed in CFT073 but not in 83972 were identified and can be considered as virulence factor candidates. Strain 83972 is a deconstructed pathogen rather than a commensal strain that has acquired fitness properties. PMID:18317809

  20. Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism

    PubMed Central

    Fadista, João; Vikman, Petter; Laakso, Emilia Ottosson; Mollet, Inês Guerra; Esguerra, Jonathan Lou; Taneera, Jalal; Storm, Petter; Osmark, Peter; Ladenvall, Claes; Prasad, Rashmi B.; Hansson, Karin B.; Finotello, Francesca; Uvebrant, Kristina; Ofori, Jones K.; Di Camillo, Barbara; Krus, Ulrika; Cilio, Corrado M.; Hansson, Ola; Eliasson, Lena; Rosengren, Anders H.; Renström, Erik; Wollheim, Claes B.; Groop, Leif

    2014-01-01

    Genetic variation can modulate gene expression, and thereby phenotypic variation and susceptibility to complex diseases such as type 2 diabetes (T2D). Here we harnessed the potential of DNA and RNA sequencing in human pancreatic islets from 89 deceased donors to identify genes of potential importance in the pathogenesis of T2D. We present a catalog of genetic variants regulating gene expression (eQTL) and exon use (sQTL), including many long noncoding RNAs, which are enriched in known T2D-associated loci. Of 35 eQTL genes, whose expression differed between normoglycemic and hyperglycemic individuals, siRNA of tetraspanin 33 (TSPAN33), 5′-nucleotidase, ecto (NT5E), transmembrane emp24 protein transport domain containing 6 (TMED6), and p21 protein activated kinase 7 (PAK7) in INS1 cells resulted in reduced glucose-stimulated insulin secretion. In addition, we provide a genome-wide catalog of allelic expression imbalance, which is also enriched in known T2D-associated loci. Notably, allelic imbalance in paternally expressed gene 3 (PEG3) was associated with its promoter methylation and T2D status. Finally, RNA editing events were less common in islets than previously suggested in other tissues. Taken together, this study provides new insights into the complexity of gene regulation in human pancreatic islets and better understanding of how genetic variation can influence glucose metabolism. PMID:25201977

  1. Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism.

    PubMed

    Fadista, João; Vikman, Petter; Laakso, Emilia Ottosson; Mollet, Inês Guerra; Esguerra, Jonathan Lou; Taneera, Jalal; Storm, Petter; Osmark, Peter; Ladenvall, Claes; Prasad, Rashmi B; Hansson, Karin B; Finotello, Francesca; Uvebrant, Kristina; Ofori, Jones K; Di Camillo, Barbara; Krus, Ulrika; Cilio, Corrado M; Hansson, Ola; Eliasson, Lena; Rosengren, Anders H; Renström, Erik; Wollheim, Claes B; Groop, Leif

    2014-09-23

    Genetic variation can modulate gene expression, and thereby phenotypic variation and susceptibility to complex diseases such as type 2 diabetes (T2D). Here we harnessed the potential of DNA and RNA sequencing in human pancreatic islets from 89 deceased donors to identify genes of potential importance in the pathogenesis of T2D. We present a catalog of genetic variants regulating gene expression (eQTL) and exon use (sQTL), including many long noncoding RNAs, which are enriched in known T2D-associated loci. Of 35 eQTL genes, whose expression differed between normoglycemic and hyperglycemic individuals, siRNA of tetraspanin 33 (TSPAN33), 5'-nucleotidase, ecto (NT5E), transmembrane emp24 protein transport domain containing 6 (TMED6), and p21 protein activated kinase 7 (PAK7) in INS1 cells resulted in reduced glucose-stimulated insulin secretion. In addition, we provide a genome-wide catalog of allelic expression imbalance, which is also enriched in known T2D-associated loci. Notably, allelic imbalance in paternally expressed gene 3 (PEG3) was associated with its promoter methylation and T2D status. Finally, RNA editing events were less common in islets than previously suggested in other tissues. Taken together, this study provides new insights into the complexity of gene regulation in human pancreatic islets and better understanding of how genetic variation can influence glucose metabolism.

  2. Combined Large-Scale Phenotyping and Transcriptomics in Maize Reveals a Robust Growth Regulatory Network

    PubMed Central

    Herman, Dorota; Slabbinck, Bram; Pè, Mario Enrico

    2016-01-01

    Leaves are vital organs for biomass and seed production because of their role in the generation of metabolic energy and organic compounds. A better understanding of the molecular networks underlying leaf development is crucial to sustain global requirements for food and renewable energy. Here, we combined transcriptome profiling of proliferative leaf tissue with in-depth phenotyping of the fourth leaf at later stages of development in 197 recombinant inbred lines of two different maize (Zea mays) populations. Previously, correlation analysis in a classical biparental mapping population identified 1,740 genes correlated with at least one of 14 traits. Here, we extended these results with data from a multiparent advanced generation intercross population. As expected, the phenotypic variability was found to be larger in the latter population than in the biparental population, although general conclusions on the correlations among the traits are comparable. Data integration from the two diverse populations allowed us to identify a set of 226 genes that are robustly associated with diverse leaf traits. This set of genes is enriched for transcriptional regulators and genes involved in protein synthesis and cell wall metabolism. In order to investigate the molecular network context of the candidate gene set, we integrated our data with publicly available functional genomics data and identified a growth regulatory network of 185 genes. Our results illustrate the power of combining in-depth phenotyping with transcriptomics in mapping populations to dissect the genetic control of complex traits and present a set of candidate genes for use in biomass improvement. PMID:26754667

  3. The Tetraodon nigroviridis reference transcriptome: developmental transition, length retention and microsynteny of long non-coding RNAs in a compact vertebrate genome

    PubMed Central

    Basu, Swaraj; Hadzhiev, Yavor; Petrosino, Giuseppe; Nepal, Chirag; Gehrig, Jochen; Armant, Olivier; Ferg, Marco; Strahle, Uwe; Sanges, Remo; Müller, Ferenc

    2016-01-01

    Pufferfish such as fugu and tetraodon carry the smallest genomes among all vertebrates and are ideal for studying genome evolution. However, comparative genomics using these species is hindered by the poor annotation of their genomes. We performed RNA sequencing during key stages of maternal to zygotic transition of Tetraodon nigroviridis and report its first developmental transcriptome. We assembled 61,033 transcripts (23,837 loci) representing 80% of the annotated gene models and 3816 novel coding transcripts from 2667 loci. We demonstrate the similarities of gene expression profiles between pufferfish and zebrafish during maternal to zygotic transition and annotated 1120 long non-coding RNAs (lncRNAs), many of which are differentially expressed during development. The promoters for 60% of the assembled transcripts were validated by CAGE-seq. Despite the extreme compaction of the tetraodon genome and the dramatic loss of transposons, the length of lncRNA exons remains comparable to that of other vertebrates and a small set of lncRNAs appears enriched for transposable elements, suggesting a selective pressure acting on lncRNA length and composition. Finally, a set of lncRNAs are microsyntenic between teleost and vertebrates, which indicates potential regulatory interactions between lncRNAs and their flanking coding genes. Our work provides a fundamental molecular resource for vertebrate comparative genomics and embryogenesis studies. PMID:27628538

  4. The Tetraodon nigroviridis reference transcriptome: developmental transition, length retention and microsynteny of long non-coding RNAs in a compact vertebrate genome.

    PubMed

    Basu, Swaraj; Hadzhiev, Yavor; Petrosino, Giuseppe; Nepal, Chirag; Gehrig, Jochen; Armant, Olivier; Ferg, Marco; Strahle, Uwe; Sanges, Remo; Müller, Ferenc

    2016-01-01

    Pufferfish such as fugu and tetraodon carry the smallest genomes among all vertebrates and are ideal for studying genome evolution. However, comparative genomics using these species is hindered by the poor annotation of their genomes. We performed RNA sequencing during key stages of maternal to zygotic transition of Tetraodon nigroviridis and report its first developmental transcriptome. We assembled 61,033 transcripts (23,837 loci) representing 80% of the annotated gene models and 3816 novel coding transcripts from 2667 loci. We demonstrate the similarities of gene expression profiles between pufferfish and zebrafish during maternal to zygotic transition and annotated 1120 long non-coding RNAs (lncRNAs), many of which are differentially expressed during development. The promoters for 60% of the assembled transcripts were validated by CAGE-seq. Despite the extreme compaction of the tetraodon genome and the dramatic loss of transposons, the length of lncRNA exons remains comparable to that of other vertebrates and a small set of lncRNAs appears enriched for transposable elements, suggesting a selective pressure acting on lncRNA length and composition. Finally, a set of lncRNAs are microsyntenic between teleost and vertebrates, which indicates potential regulatory interactions between lncRNAs and their flanking coding genes. Our work provides a fundamental molecular resource for vertebrate comparative genomics and embryogenesis studies. PMID:27628538

  5. MicroRNAs shape circadian hepatic gene expression on a transcriptome-wide scale

    PubMed Central

    Du, Ngoc-Hien; Arpat, Alaaddin Bulak; De Matos, Mara; Gatfield, David

    2014-01-01

    A considerable proportion of mammalian gene expression undergoes circadian oscillations. Post-transcriptional mechanisms likely make important contributions to mRNA abundance rhythms. We have investigated how microRNAs (miRNAs) contribute to core clock and clock-controlled gene expression using mice in which miRNA biogenesis can be inactivated in the liver. While the hepatic core clock was surprisingly resilient to miRNA loss, whole transcriptome sequencing uncovered widespread effects on clock output gene expression. Cyclic transcription paired with miRNA-mediated regulation was thus identified as a frequent phenomenon that affected up to 30% of the rhythmic transcriptome and served to post-transcriptionally adjust the phases and amplitudes of rhythmic mRNA accumulation. However, only few mRNA rhythms were actually generated by miRNAs. Overall, our study suggests that miRNAs function to adapt clock-driven gene expression to tissue-specific requirements. Finally, we pinpoint several miRNAs predicted to act as modulators of rhythmic transcripts, and identify rhythmic pathways particularly prone to miRNA regulation. DOI: http://dx.doi.org/10.7554/eLife.02510.001 PMID:24867642

  6. Dissection of the Inflammatory Bowel Disease Transcriptome Using Genome-Wide cDNA Microarrays

    PubMed Central

    2005-01-01

    Background: The differential pathophysiologic mechanisms that trigger and maintain the two forms of inflammatory bowel disease (IBD), Crohn disease (CD), and ulcerative colitis (UC) are only partially understood. cDNA microarrays can be used to decipher gene regulation events at a genome-wide level and to identify novel unknown genes that might be involved in perpetuating inflammatory disease progression. Methods and Findings: High-density cDNA microarrays representing 33,792 UniGene clusters were prepared. Biopsies were taken from the sigmoid colon of normal controls (n = 11), CD patients (n = 10) and UC patients (n = 10). 33P-radiolabeled cDNA from purified poly(A)+ RNA extracted from biopsies (unpooled) was hybridized to the arrays. We identified 500 and 272 transcripts differentially regulated in CD and UC, respectively. Interesting hits were independently verified by real-time PCR in a second sample of 100 individuals, and immunohistochemistry was used for exemplary localization. The main findings point to novel molecules important in abnormal immune regulation and the highly disturbed cell biology of colonic epithelial cells in IBD pathogenesis, e.g., CYLD (cylindromatosis, turban tumor syndrome) and CDH11 (cadherin 11, type 2). By the nature of the array setup, many of the genes identified were to our knowledge previously uncharacterized, and prediction of the putative function of a subsection of these genes indicates that some could be involved in early events in disease pathophysiology. Conclusion: A comprehensive set of candidate genes not previously associated with IBD was revealed, which underlines the polygenic and complex nature of the disease. It points out substantial differences in pathophysiology between CD and UC. The multiple unknown genes identified may stimulate new research in the fields of barrier mechanisms and cell signalling in the context of IBD, and ultimately new therapeutic approaches. PMID:16107186

  7. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    PubMed

    Lin, Ching-Ping; Ko, Chia-Yun; Kuo, Ching-I; Liu, Mao-Sen; Schafleitner, Roland; Chen, Long-Fang Oliver

    2015-01-01

    We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R² = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.
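    The correlation between slippage frequency and repeat length reported in the abstract above is summarized by a coefficient of determination. As a minimal sketch, and assuming it is the squared Pearson correlation of paired observations (the abstract does not state how it was computed), it could be calculated as follows. The function name and the example values are hypothetical and are not data from the study.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Made-up repeat lengths (nt) and slippage frequencies (%), illustration only.
    repeat_lengths = [8, 9, 10, 11, 12]
    slippage_freqs = [0.5, 1.1, 2.0, 2.8, 3.9]
    print(round(r_squared(repeat_lengths, slippage_freqs), 4))
```

    A value close to 1 indicates that a straight line through the paired observations explains nearly all of the variation in slippage frequency.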

  8. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome

    PubMed Central

    Lin, Ching-Ping; Ko, Chia-Yun; Kuo, Ching-I; Liu, Mao-Sen; Schafleitner, Roland; Chen, Long-Fang Oliver

    2015-01-01

    We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R² = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants. PMID:26076132

  9. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    PubMed

    Lin, Ching-Ping; Ko, Chia-Yun; Kuo, Ching-I; Liu, Mao-Sen; Schafleitner, Roland; Chen, Long-Fang Oliver

    2015-01-01

    We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R² = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants. PMID:26076132

  10. Genome-wide transcriptomic analysis reveals correlation between higher WRKY61 expression and reduced symptom severity in Turnip crinkle virus infe