Science.gov

Sample records for genome scale transcriptome

  1. Genome scale transcriptomics of baculovirus-insect interactions.

    PubMed

    Nguyen, Quan; Nielsen, Lars K; Reid, Steven

    2013-11-12

    Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.

  2. Inferring the choreography of parental genomes during fertilization from ultralarge-scale whole-transcriptome analysis.

    PubMed

    Park, Sung-Joon; Komata, Makiko; Inoue, Fukashi; Yamada, Kaori; Nakai, Kenta; Ohsugi, Miho; Shirahige, Katsuhiko

    2013-12-15

    Fertilization precisely choreographs parental genomes by using gamete-derived cellular factors and activating genome regulatory programs. However, the mechanism remains elusive owing to the technical difficulties of preparing large numbers of high-quality preimplantation cells. Here, we collected >14 × 10^4 high-quality mouse metaphase II oocytes and used these to establish detailed transcriptional profiles for four early embryo stages and parthenogenetic development. By combining these profiles with other public resources, we found evidence that gene silencing appeared to be mediated in part by noncoding RNAs and that this was a prerequisite for post-fertilization development. Notably, we identified 817 genes that were differentially expressed in embryos after fertilization compared with parthenotes. The regulation of these genes was distinctly different from those expressed in parthenotes, suggesting functional specialization of particular transcription factors prior to first cell cleavage. We identified five transcription factors that were potentially necessary for developmental progression: Foxd1, Nkx2-5, Sox18, Myod1, and Runx1. Our very large-scale whole-transcriptome profile of early mouse embryos yielded a novel and valuable resource for studies in developmental biology and stem cell research. The database is available at http://dbtmee.hgc.jp.

  3. Genome-Scale Transcriptome Analysis in Response to Nitric Oxide in Birch Cells: Implications of the Triterpene Biosynthetic Pathway

    PubMed Central

    Zeng, Fansuo; Sun, Fengkun; Li, Leilei; Liu, Kun; Zhan, Yaguang

    2014-01-01

    Evidence supporting nitric oxide (NO) as a mediator of plant biochemistry continues to grow, but its functions at the molecular level remain poorly understood and, in some cases, controversial. To study the role of NO at the transcriptional level in Betula platyphylla cells, we conducted a genome-scale transcriptome analysis of these cells. The transcriptomes of untreated birch cells and of cells treated with sodium nitroprusside (SNP) were analyzed using Solexa sequencing. Data were collected by sequencing cDNA libraries of birch cells, which had a long period to adapt to the suspension culture conditions before SNP-treated cells and untreated cells were sampled. Among the 34,100 UniGenes detected, a BLASTX search revealed that 20,631 genes showed significant (E-values ≤ 10^-5) sequence similarity with proteins from the NR database. Numerous expressed sequence tags (i.e., 1374) were identified as differentially expressed between the 12 h SNP-treated cells and control cell samples: 403 up-regulated and 971 down-regulated. From this, we specifically examined a core set of NO-related transcripts. The altered expression levels of several transcripts, as determined by transcriptome analysis, were confirmed by qRT-PCR. The results of transcriptome analysis, gene expression quantification, triterpenoid content and activities of defensive enzymes indicated that NO has a significant effect on many processes, including triterpenoid production, carbohydrate metabolism and cell wall biosynthesis. PMID:25551661

  4. Genome-scale transcriptomic insights into early-stage fruit development in woodland strawberry Fragaria vesca.

    PubMed

    Kang, Chunying; Darwish, Omar; Geretz, Aviva; Shahan, Rachel; Alkharouf, Nadim; Liu, Zhongchi

    2013-06-01

    Fragaria vesca, a diploid woodland strawberry with a small and sequenced genome, is an excellent model for studying fruit development. The strawberry fruit is unique in that the edible flesh is actually enlarged receptacle tissue. The true fruit are the numerous dry achenes dotting the receptacle's surface. Auxin produced from the achene is essential for the receptacle fruit set, a paradigm for studying crosstalk between hormone signaling and development. To investigate the molecular mechanism underlying strawberry fruit set, next-generation sequencing was employed to profile early-stage fruit development with five fruit tissue types and five developmental stages from floral anthesis to enlarged fruits. This two-dimensional data set provides a systems-level view of molecular events with precise spatial and temporal resolution. The data suggest that the endosperm and seed coat may play a more prominent role than the embryo in auxin and gibberellin biosynthesis for fruit set. A model is proposed to illustrate how hormonal signals produced in the endosperm and seed coat coordinate seed, ovary wall, and receptacle fruit development. The comprehensive fruit transcriptome data set provides a wealth of genomic resources for the strawberry and Rosaceae communities as well as unprecedented molecular insight into fruit set and early stage fruit development.

  5. Genome-Scale Transcriptome Analysis of the Desert Shrub Artemisia sphaerocephala

    PubMed Central

    Zhang, Lijing; Hu, Xiaowei; Miao, Xiumei; Chen, Xiaolong; Nan, Shuzhen; Fu, Hua

    2016-01-01

    Background: Artemisia sphaerocephala, a semi-shrub belonging to the Artemisia genus of the Compositae family, is an important pioneer plant that inhabits moving and semi-stable sand dunes in the deserts and steppes of northwest and north-central China. It is very resilient in extreme environments. Additionally, its seeds have excellent nutritional value, and the abundant lipids and polysaccharides in the seeds make this plant a potential valuable source of bio-energy. However, partly due to the scarcity of genetic information, the genetic mechanisms controlling the traits and environmental adaptation capacity of A. sphaerocephala are unknown. Results: Here, we present the first in-depth transcriptomic analysis of A. sphaerocephala. To maximize the representation of conditional transcripts, mRNA was obtained from 17 samples, including living tissues of desert-growing A. sphaerocephala, seeds germinated in the laboratory, and calli subjected to no stress (control) and high and low temperature, high and low osmotic, and salt stresses. De novo transcriptome assembly performed using an Illumina HiSeq 2500 platform resulted in the generation of 68,373 unigenes. We analyzed the key genes involved in the unsaturated fatty acid synthesis pathway and identified 26 A. sphaerocephala fad2 genes, which is the largest fad2 gene family reported to date. Furthermore, a set of genes responsible for resistance to extreme temperatures, salt, drought and a combination of stresses was identified. Conclusion: The present work provides abundant genomic information for functional dissection of the important traits of A. sphaerocephala and contributes to the current understanding of molecular adaptive mechanisms of A. sphaerocephala in the desert environment. Identification of the key genes in the unsaturated fatty acid synthesis pathway could increase understanding of the biological regulatory mechanisms of fatty acid composition traits in plants and facilitate genetic manipulation of the

  6. Genome-Scale Transcriptome Analysis of the Desert Shrub Artemisia sphaerocephala.

    PubMed

    Zhang, Lijing; Hu, Xiaowei; Miao, Xiumei; Chen, Xiaolong; Nan, Shuzhen; Fu, Hua

    2016-01-01

    Artemisia sphaerocephala, a semi-shrub belonging to the Artemisia genus of the Compositae family, is an important pioneer plant that inhabits moving and semi-stable sand dunes in the deserts and steppes of northwest and north-central China. It is very resilient in extreme environments. Additionally, its seeds have excellent nutritional value, and the abundant lipids and polysaccharides in the seeds make this plant a potential valuable source of bio-energy. However, partly due to the scarcity of genetic information, the genetic mechanisms controlling the traits and environmental adaptation capacity of A. sphaerocephala are unknown. Here, we present the first in-depth transcriptomic analysis of A. sphaerocephala. To maximize the representation of conditional transcripts, mRNA was obtained from 17 samples, including living tissues of desert-growing A. sphaerocephala, seeds germinated in the laboratory, and calli subjected to no stress (control) and high and low temperature, high and low osmotic, and salt stresses. De novo transcriptome assembly performed using an Illumina HiSeq 2500 platform resulted in the generation of 68,373 unigenes. We analyzed the key genes involved in the unsaturated fatty acid synthesis pathway and identified 26 A. sphaerocephala fad2 genes, which is the largest fad2 gene family reported to date. Furthermore, a set of genes responsible for resistance to extreme temperatures, salt, drought and a combination of stresses was identified. The present work provides abundant genomic information for functional dissection of the important traits of A. sphaerocephala and contributes to the current understanding of molecular adaptive mechanisms of A. sphaerocephala in the desert environment. Identification of the key genes in the unsaturated fatty acid synthesis pathway could increase understanding of the biological regulatory mechanisms of fatty acid composition traits in plants and facilitate genetic manipulation of the fatty acid composition of oil

  7. Genome-scale transcriptome analysis of the desert poplar, Populus euphratica.

    PubMed

    Qiu, Qiang; Ma, Tao; Hu, Quanjun; Liu, Bingbing; Wu, Yuxia; Zhou, Haihong; Wang, Qian; Wang, Juan; Liu, Jianquan

    2011-04-01

    Populus euphratica is well-adapted to extreme desert environments and is an important model species for studying the effects of abiotic stresses on trees. Here we present the first deep transcriptomic analysis of this species. To maximize representation of conditional transcripts, mRNA was obtained from living tissues of desert-grown trees and two types of callus (salt-stressed and unstressed). De novo assembly generated 86,777 Unigenes using Solexa sequence data. These sequences covered 92% of previously reported P. euphratica expressed sequence tags (ESTs) and 90% of the TIGR poplar ESTs, and a total of 58,499 high-quality unique sequences were annotated by BLAST similarity searches against public databases. We found that 27% of the total Unigenes were differentially expressed (up- or down-regulated) in response to salt stress in P. euphratica callus. These differentially expressed genes are mainly involved in transport, transcription, cellular communication and metabolism. In addition, we found that numerous putative genes involved in ABA regulation and biosynthesis were also differentially regulated. This study represents the deepest transcriptomic and gene-annotation analysis of P. euphratica to date. The genetic knowledge acquired should be very useful for future studies of the molecular adaptation of this tree species to abiotic stress and facilitate genetic manipulation of other poplar species.

  8. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales

    PubMed Central

    2012-01-01

    Background: To date, exon capture has largely been restricted to species with fully sequenced genomes, which has precluded its application to lineages that lack high quality genomic resources. We developed a novel strategy for designing array-based exon capture in chipmunks (Tamias) based on de novo transcriptome assemblies. We evaluated the performance of our approach across specimens from four chipmunk species. Results: We selectively targeted 11,975 exons (~4 Mb) on custom capture arrays, and enriched over 99% of the targets in all libraries. The percentage of aligned reads was highly consistent (24.4-29.1%) across all specimens, including when multiplexing up to 20 barcoded individuals on a single array. Base coverage among specimens and within targets in each species library was uniform, and the performance of targets among independent exon captures was highly reproducible. There was no decrease in coverage among chipmunk species, which showed up to 1.5% sequence divergence in coding regions. We did observe a decline in capture performance of a subset of targets designed from a much more divergent ground squirrel genome (30 My); however, over 90% of these targets were also recovered. Final assemblies yielded over ten thousand orthologous loci (~3.6 Mb) with thousands of fixed and polymorphic SNPs among species identified. Conclusions: Our study demonstrates the potential of a transcriptome-enabled, multiplexed, exon capture method to create thousands of informative markers for population genomic and phylogenetic studies in non-model species across the tree of life. PMID:22900609

  9. Genome-Scale Transcriptomic Insights into Early-Stage Fruit Development in Woodland Strawberry Fragaria vesca

    PubMed Central

    Kang, Chunying; Darwish, Omar; Geretz, Aviva; Shahan, Rachel; Alkharouf, Nadim; Liu, Zhongchi

    2013-01-01

    Fragaria vesca, a diploid woodland strawberry with a small and sequenced genome, is an excellent model for studying fruit development. The strawberry fruit is unique in that the edible flesh is actually enlarged receptacle tissue. The true fruit are the numerous dry achenes dotting the receptacle’s surface. Auxin produced from the achene is essential for the receptacle fruit set, a paradigm for studying crosstalk between hormone signaling and development. To investigate the molecular mechanism underlying strawberry fruit set, next-generation sequencing was employed to profile early-stage fruit development with five fruit tissue types and five developmental stages from floral anthesis to enlarged fruits. This two-dimensional data set provides a systems-level view of molecular events with precise spatial and temporal resolution. The data suggest that the endosperm and seed coat may play a more prominent role than the embryo in auxin and gibberellin biosynthesis for fruit set. A model is proposed to illustrate how hormonal signals produced in the endosperm and seed coat coordinate seed, ovary wall, and receptacle fruit development. The comprehensive fruit transcriptome data set provides a wealth of genomic resources for the strawberry and Rosaceae communities as well as unprecedented molecular insight into fruit set and early stage fruit development. PMID:23898027

  10. Illuminating the Transcriptome through the Genome.

    PubMed

    Elliott, David J

    2014-03-14

    Sequencing the human genome was a huge milestone in genetic research that revealed almost the total DNA sequence required to create a human being. However, in order to function, the DNA genome needs to be expressed as an RNA transcriptome. This article reviews how knowledge of genome sequence information has led to fundamental discoveries in how the transcriptome is processed, with a focus on new system-wide insights into how pre-mRNAs that are encoded by split genes in the genome are rearranged by splicing into functional mRNAs. These advances have been made possible by the development of new post-genome technologies to probe splicing patterns. Transcriptome-wide approaches have characterised a "splicing code" that is embedded within and has a significant role in deciphering the genome, and is deciphered by RNA binding proteins. These analyses have also found that most human genes encode multiple mRNA isoforms, and in some cases proteins, leading in turn to a re-assessment of what exactly a gene is. Analysis of the transcriptome has given insights into how the genome is packaged and transcribed, and is helping to explain important aspects of genome evolution.

  11. Identification of candidate network hubs involved in metabolic adjustments of rice under drought stress by integrating transcriptome data and genome-scale metabolic network.

    PubMed

    Mohanty, Bijayalaxmi; Kitazumi, Ai; Cheung, C Y Maurice; Lakshmanan, Meiyappan; de los Reyes, Benildo G; Jang, In-Cheol; Lee, Dong-Yup

    2016-01-01

    In this study, we have integrated a rice genome-scale metabolic network and the transcriptome of a drought-tolerant rice line, DK151, to identify the major transcriptional regulators involved in metabolic adjustments necessary for adaptation to drought. This was achieved by examining the differential expressions of transcription factors and metabolic genes in leaf, root and young panicle of rice plants subjected to drought stress during tillering, booting and panicle elongation stages. Critical transcription factors such as AP2/ERF, bZIP, MYB and NAC that control the important nodes in the gene regulatory pathway were identified through correlative analysis of the patterns of spatio-temporal expression and cis-element enrichment. We showed that many of the candidate transcription factors involved in metabolic adjustments were previously linked to phenotypic variation for drought tolerance. This approach represents the first attempt to integrate models of transcriptional regulation and metabolic pathways for the identification of candidate regulatory genes for targeted selection in rice breeding.

  12. Optimal Scaling of Digital Transcriptomes

    PubMed Central

    Glusman, Gustavo; Caballero, Juan; Robinson, Max; Kutlu, Burak; Hood, Leroy

    2013-01-01

    Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two novel and mutually independent metrics: (1) the number of “uniform” genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) low Spearman correlation between normalized expression profiles of gene pairs. We also define four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The two most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain ‘housekeeping’ genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of “ubiquitous” genes: genes expressed in all the samples studied, but never at very high or very low levels. We demonstrate that these include a “core” of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, which is a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers. PMID:24223126
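
    The first metric described in this abstract, the number of "uniform" genes, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy counts, the target total of 100, and the coefficient-of-variation threshold of 0.25 are all assumptions chosen for illustration. The sketch uses scaling to a fixed total only as a simple baseline (the abstract reports that this method performs poorly on real data) and then counts genes whose normalized values vary little across samples.

```python
from statistics import mean, pstdev


def scale_to_fixed_total(counts, target_total=1_000_000):
    """Scale one sample's raw counts so that they sum to target_total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no counts")
    factor = target_total / total
    return [c * factor for c in counts]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        return float("inf")  # treat all-zero genes as non-uniform
    return pstdev(values) / m


def count_uniform_genes(samples, cv_threshold=0.25):
    """Count genes whose values across samples have CV <= cv_threshold.

    samples: list of per-sample value lists, all in the same gene order.
    cv_threshold: hypothetical cutoff, not taken from the paper.
    """
    per_gene = zip(*samples)  # regroup values gene by gene
    return sum(
        1 for gene in per_gene
        if coefficient_of_variation(gene) <= cv_threshold
    )


# Toy data: 3 samples x 4 genes, identical composition, different depth.
raw = [
    [10, 20, 30, 40],
    [20, 40, 60, 80],
    [30, 60, 90, 120],
]
normalized = [scale_to_fixed_total(s, target_total=100) for s in raw]

uniform_before = count_uniform_genes(raw)
uniform_after = count_uniform_genes(normalized)
print(uniform_before, uniform_after)
```

    In this toy case the samples differ only in sequencing depth, so any sensible scaling makes every gene uniform. On real data the count of uniform genes differs between normalization methods, which is what makes it usable as a comparison metric.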

  13. International Standards for Genomes, Transcriptomes, and Metagenomes

    PubMed Central

    Mason, Christopher E.; Afshinnekoo, Ebrahim; Tighe, Scott; Wu, Shixiu; Levy, Shawn

    2017-01-01

    Challenges and biases in preparing, characterizing, and sequencing DNA and RNA can have significant impacts on research in genomics across all kingdoms of life, including experiments in single-cells, RNA profiling, and metagenomics (across multiple genomes). Technical artifacts and contamination can arise at each point of sample manipulation, extraction, sequencing, and analysis. Thus, the measurement and benchmarking of these potential sources of error are of paramount importance as next-generation sequencing (NGS) projects become more global and ubiquitous. Fortunately, a variety of methods, standards, and technologies have recently emerged that improve measurements in genomics and sequencing, from the initial input material to the computational pipelines that process and annotate the data. Here we review current standards and their applications in genomics, including whole genomes, transcriptomes, mixed genomic samples (metagenomes), and the modified bases within each (epigenomes and epitranscriptomes). These standards, tools, and metrics are critical for quantifying the accuracy of NGS methods, which will be essential for robust approaches in clinical genomics and precision medicine. PMID:28337071

  14. International Standards for Genomes, Transcriptomes, and Metagenomes.

    PubMed

    Mason, Christopher E; Afshinnekoo, Ebrahim; Tighe, Scott; Wu, Shixiu; Levy, Shawn

    2017-04-01

    Challenges and biases in preparing, characterizing, and sequencing DNA and RNA can have significant impacts on research in genomics across all kingdoms of life, including experiments in single-cells, RNA profiling, and metagenomics (across multiple genomes). Technical artifacts and contamination can arise at each point of sample manipulation, extraction, sequencing, and analysis. Thus, the measurement and benchmarking of these potential sources of error are of paramount importance as next-generation sequencing (NGS) projects become more global and ubiquitous. Fortunately, a variety of methods, standards, and technologies have recently emerged that improve measurements in genomics and sequencing, from the initial input material to the computational pipelines that process and annotate the data. Here we review current standards and their applications in genomics, including whole genomes, transcriptomes, mixed genomic samples (metagenomes), and the modified bases within each (epigenomes and epitranscriptomes). These standards, tools, and metrics are critical for quantifying the accuracy of NGS methods, which will be essential for robust approaches in clinical genomics and precision medicine.

  15. Understanding Haemonchus contortus Better Through Genomics and Transcriptomics.

    PubMed

    Gasser, R B; Schwarz, E M; Korhonen, P K; Young, N D

    2016-01-01

    Parasitic roundworms (nematodes) cause substantial mortality and morbidity in animals globally. The barber's pole worm, Haemonchus contortus, is one of the most economically significant parasitic nematodes of small ruminants worldwide. Although this and related nematodes can be controlled relatively well using anthelmintics, resistance against most drugs in common use has become a major problem. Until recently, almost nothing was known about the molecular biology of H. contortus on a global scale. This chapter gives a brief background on H. contortus and haemonchosis, immune responses, vaccine research, chemotherapeutics and current problems associated with drug resistance. It also describes progress in transcriptomics before the availability of H. contortus genomes and the challenges associated with such work. It then reviews major progress on the two draft genomes and developmental transcriptomes of H. contortus, and summarizes their implications for the molecular biology of this worm in both the free-living and the parasitic stages of its life cycle. The chapter concludes by considering how genomics and transcriptomics can accelerate research on Haemonchus and related parasites, and can enable the development of new interventions against haemonchosis. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Multi-Scale Genomic, Transcriptomic and Proteomic Analysis of Colorectal Cancer Cell Lines to Identify Novel Biomarkers

    PubMed Central

    Briffa, Romina; Um, Inhwa; Faratian, Dana; Zhou, Ying; Turnbull, Arran K.; Langdon, Simon P.; Harrison, David J.

    2015-01-01

    Selecting colorectal cancer (CRC) patients likely to respond to therapy remains a clinical challenge. The objectives of this study were to establish which genes were differentially expressed with respect to treatment sensitivity and relate this to copy number in a panel of 15 CRC cell lines. Copy number variations of the identified genes were assessed in a cohort of CRCs. IC50’s were measured for 5-fluorouracil, oxaliplatin, and BEZ-235, a PI3K/mTOR inhibitor. Cell lines were profiled using array comparative genomic hybridisation, Illumina gene expression analysis, reverse phase protein arrays, and targeted sequencing of KRAS hotspot mutations. Frequent gains were observed at 2p, 3q, 5p, 7p, 7q, 8q, 12p, 13q, 14q, and 17q and losses at 2q, 3p, 5q, 8p, 9p, 9q, 14q, 18q, and 20p. Frequently gained regions contained EGFR, PIK3CA, MYC, SMO, TRIB1, FZD1, and BRCA2, while frequently lost regions contained FHIT and MACROD2. TRIB1 was selected for further study. Gene enrichment analysis showed that differentially expressed genes with respect to treatment response were involved in Wnt signalling, EGF receptor signalling, apoptosis, cell cycle, and angiogenesis. Stepwise integration of copy number and gene expression data yielded 47 candidate genes that were significantly correlated. PDCD6 was differentially expressed in all three treatment responses. Tissue microarrays were constructed for a cohort of 118 CRC patients and TRIB1 and MYC amplifications were measured using fluorescence in situ hybridisation. TRIB1 and MYC were amplified in 14.5% and 7.4% of the cohort, respectively, and these amplifications were significantly correlated (p≤0.0001). TRIB1 protein expression in the patient cohort was significantly correlated with pERK, Akt, and Caspase 3 expression. In conclusion, a set of candidate predictive biomarkers for 5-fluorouracil, oxaliplatin, and BEZ235 are described that warrant further study. Amplification of the putative oncogene TRIB1 has been described for

  17. Genomic and Transcriptomic Analyses of Foodborne Bacterial Pathogens

    NASA Astrophysics Data System (ADS)

    Zhang, Wei; Dudley, Edward G.; Wade, Joseph T.

    DNA microarrays (often interchangeably called DNA chips or DNA arrays) are among the most popular analytical tools for high-throughput comparative genomic and transcriptomic analyses of foodborne bacterial pathogens. A typical DNA microarray contains hundreds to millions of small DNA probes that are chemically attached (or "printed") onto the surface of a microscopic glass slide. Depending on the specific "printing" and probe synthesis technologies for different microarray platforms, such DNA probes can be PCR amplicons or in situ synthesized short oligonucleotides. DNA microarray technologies have revolutionized the way that we investigate the biology of foodborne bacterial pathogens. The major advantage of these technologies is that DNA microarrays allow comparison of subtle genomic or transcriptomic variations between two bacterial samples, such as genomic variations between two different bacterial strains or transcriptomic alterations of the same bacterial strain under two different treatments. Some applications of comparative genomic hybridization microarrays and global gene expression microarrays have been covered in previous chapters of this book.

  18. Translating Cancer Genomes and Transcriptomes for Precision Oncology

    PubMed Central

    Roychowdhury, Sameek; Chinnaiyan, Arul M.

    2015-01-01

    Understanding the molecular landscape of cancer has facilitated the development of diagnostic, prognostic, and predictive biomarkers for clinical oncology. Developments in next generation DNA sequencing technologies have increased the speed and reduced the cost of sequencing the nucleic acids of cancer cells. This has unlocked opportunities to characterize the genomic and transcriptomic landscapes of cancer for basic science research through projects such as The Cancer Genome Atlas. The cancer genome includes DNA-based alterations such as point mutations or gene duplications. The cancer transcriptome involves RNA-based alterations including changes in messenger RNAs. Together the genome and transcriptome can provide a comprehensive view of an individual patient’s cancer and is beginning to impact real-time clinical decision-making. We discuss several opportunities for translating this basic science knowledge into clinical practice including a molecular classification of cancer, heritable risk of cancer, eligibility for targeted therapies, and the development of innovative genomic-based clinical trials. In this review, we outline key applications and new directions for translating the cancer genome and transcriptome into patient care in the clinic. PMID:26528881

  19. Translating cancer genomes and transcriptomes for precision oncology.

    PubMed

    Roychowdhury, Sameek; Chinnaiyan, Arul M

    2016-01-01

    Understanding the molecular landscape of cancer has facilitated the development of diagnostic, prognostic, and predictive biomarkers for clinical oncology. Developments in next-generation DNA sequencing technologies have increased the speed and reduced the cost of sequencing the nucleic acids of cancer cells. This has unlocked opportunities to characterize the genomic and transcriptomic landscapes of cancer for basic science research through projects like The Cancer Genome Atlas. The cancer genome includes DNA-based alterations, such as point mutations or gene duplications. The cancer transcriptome involves RNA-based alterations, including changes in messenger RNAs. Together, the genome and transcriptome can provide a comprehensive view of an individual patient's cancer that is beginning to impact real-time clinical decision-making. The authors discuss several opportunities for translating this basic science knowledge into clinical practice, including a molecular classification of cancer, heritable risk of cancer, eligibility for targeted therapies, and the development of innovative, genomic-based clinical trials. In this review, key applications and new directions are outlined for translating the cancer genome and transcriptome into patient care in the clinic. © 2015 American Cancer Society.

  20. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics

    PubMed Central

    Tzika, Athanasia C.; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C.

    2015-01-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the “Reptilian Transcriptomes Database 2.0,” which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. PMID:26133641

  1. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics.

    PubMed

    Tzika, Athanasia C; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C

    2015-07-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the "Reptilian Transcriptomes Database 2.0," which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  2. Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.

    PubMed

    Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong

    2014-05-01

    We characterized mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. Blast searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as the closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions. Out of 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed the cp genome of C. sinensis as the closest neighbor of mango. We found 51 short repeats in the mango cp genome that are thought to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.

  3. CarrotDB: a genomic and transcriptomic database for carrot

    PubMed Central

    Xu, Zhi-Sheng; Tan, Hua-Wei; Wang, Feng; Hou, Xi-Lin; Xiong, Ai-Sheng

    2014-01-01

    Carrot (Daucus carota L.) is an economically important vegetable worldwide and is the largest source of carotenoids and provitamin A in the human diet. Given the importance of this vegetable to humans, research and breeding communities on carrot should obtain useful genomic and transcriptomic information. The first whole-genome sequences of ‘DC-27’ carrot were de novo assembled and analyzed. Transcriptomic sequences of 14 carrot genotypes were downloaded from the Sequence Read Archive (SRA) database of National Center for Biotechnology Information (NCBI) and mapped to the whole-genome sequence before assembly. Based on these data sets, the first Web-based genomic and transcriptomic database for D. carota (CarrotDB) was developed (database homepage: http://apiaceae.njau.edu.cn/carrotdb). CarrotDB offers the tools of Genome Map and Basic Local Alignment Search Tool. Using these tools, users can search certain target genes and simple sequence repeats along with designed primers of ‘DC-27’. Assembled transcriptomic sequences along with fragments per kilobase of transcript sequence per million base pairs sequenced (FPKM) information of 14 carrot genotypes are also provided. Users can download de novo assembled whole-genome sequences, putative gene sequences and putative protein sequences of ‘DC-27’. Users can also download transcriptome sequence assemblies of 14 carrot genotypes along with their FPKM information. A total of 2826 transcription factor (TF) genes classified into 57 families were identified in the entire genome sequences. These TF genes were embedded in CarrotDB as an interface. The ‘GERMPLASM’ part of CarrotDB also offers taproot photos of 45 carrot genotypes and a table containing accession numbers, names, countries of origin and colors of cortex, phloem and xylem parts of taproots corresponding to each carrot genotype. CarrotDB will be continuously updated with new information. Database URL: http
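
    The FPKM values mentioned in the abstract above normalize a transcript's fragment count by transcript length and by sequencing depth. The following is a minimal sketch of the standard FPKM formula, not code from CarrotDB; the function name, parameter names, and the example numbers are hypothetical.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    All three inputs are illustrative assumptions, not values taken from CarrotDB.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical transcript: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
example_value = fpkm(500, 2_000, 10_000_000)
print(example_value)  # 25.0
```

    Dividing by length keeps long transcripts from looking more highly expressed only because they collect more fragments, and dividing by library size makes samples sequenced to different depths roughly comparable.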

  4. Genome interplay in the grain transcriptome of hexaploid bread wheat.

    PubMed

    Pfeifer, Matthias; Kugler, Karl G; Sandve, Simen R; Zhan, Bujie; Rudi, Heidi; Hvidsten, Torgeir R; Mayer, Klaus F X; Olsen, Odd-Arne

    2014-07-18

    Allohexaploid bread wheat (Triticum aestivum L.) provides approximately 20% of calories consumed by humans. Lack of genome sequence for the three homeologous and highly similar bread wheat genomes (A, B, and D) has impeded expression analysis of the grain transcriptome. We used previously unknown genome information to analyze the cell type-specific expression of homeologous genes in the developing wheat grain and identified distinct co-expression clusters reflecting the spatiotemporal progression during endosperm development. We observed no global but cell type- and stage-dependent genome dominance, organization of the wheat genome into transcriptionally active chromosomal regions, and asymmetric expression in gene families related to baking quality. Our findings give insight into the transcriptional dynamics and genome interplay among individual grain cell types in a polyploid cereal genome.

  5. InsectBase: a resource for insect genomes and transcriptomes.

    PubMed

    Yin, Chuanlin; Shen, Gengyu; Guo, Dianhao; Wang, Shuping; Ma, Xingzhou; Xiao, Huamei; Liu, Jinding; Zhang, Zan; Liu, Ying; Zhang, Yiqun; Yu, Kaixiang; Huang, Shuiqing; Li, Fei

    2016-01-04

    The genomes and transcriptomes of hundreds of insects have been sequenced. However, the insect community lacks an integrated, up-to-date collection of insect gene data. Here, we introduce the first release of InsectBase, available online at http://www.insect-genome.com. The database encompasses 138 insect genomes, 116 insect transcriptomes, 61 insect gene sets, 36 gene families of 60 insects, 7544 miRNAs of 69 insects, 96,925 piRNAs of Drosophila melanogaster and Chilo suppressalis, 2439 lncRNAs of Nilaparvata lugens, 22,536 pathways of 78 insects, 678,881 untranslated regions (UTR) of 84 insects and 160,905 coding sequences (CDS) of 70 insects. This release contains over 12 million sequences and provides search functionality, a BLAST server, GBrowse, insect pathway construction, a Facebook-like network for the insect community (iFacebook), and phylogenetic analysis of selected genes.

  6. InsectBase: a resource for insect genomes and transcriptomes

    PubMed Central

    Yin, Chuanlin; Shen, Gengyu; Guo, Dianhao; Wang, Shuping; Ma, Xingzhou; Xiao, Huamei; Liu, Jinding; Zhang, Zan; Liu, Ying; Zhang, Yiqun; Yu, Kaixiang; Huang, Shuiqing; Li, Fei

    2016-01-01

    The genomes and transcriptomes of hundreds of insects have been sequenced. However, the insect community lacks an integrated, up-to-date collection of insect gene data. Here, we introduce the first release of InsectBase, available online at http://www.insect-genome.com. The database encompasses 138 insect genomes, 116 insect transcriptomes, 61 insect gene sets, 36 gene families of 60 insects, 7544 miRNAs of 69 insects, 96,925 piRNAs of Drosophila melanogaster and Chilo suppressalis, 2439 lncRNAs of Nilaparvata lugens, 22,536 pathways of 78 insects, 678,881 untranslated regions (UTR) of 84 insects and 160,905 coding sequences (CDS) of 70 insects. This release contains over 12 million sequences and provides search functionality, a BLAST server, GBrowse, insect pathway construction, a Facebook-like network for the insect community (iFacebook), and phylogenetic analysis of selected genes. PMID:26578584

  7. Pichia stipitis genomics, transcriptomics, and gene clusters

    Treesearch

    Thomas W. Jeffries; Jennifer R. Headman Van Vleet

    2009-01-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...

  8. Genomic and transcriptomic resources for assassin flies including the complete genome sequence of Proctacanthus coquilletti (Insecta: Diptera: Asilidae) and 16 representative transcriptomes.

    PubMed

    Dikow, Rebecca B; Frandsen, Paul B; Turcatel, Mauren; Dikow, Torsten

    2017-01-01

    A high-quality draft genome for Proctacanthus coquilletti (Insecta: Diptera: Asilidae) is presented along with transcriptomes for 16 Diptera species from five families: Asilidae, Apioceridae, Bombyliidae, Mydidae, and Tabanidae. Genome sequencing reveals that P. coquilletti has a genome size of approximately 210 Mbp and remarkably low heterozygosity (0.47%) and few repeats (15%). These characteristics helped produce a highly contiguous (N50 = 862 kbp) assembly, particularly given that only a single 2 × 250 bp PCR-free Illumina library was sequenced. A phylogenomic hypothesis is presented based on thousands of putative orthologs across the 16 transcriptomes. Phylogenetic relationships support the sister group relationship of Apioceridae + Mydidae to Asilidae. A time-calibrated phylogeny is also presented, with seven fossil calibration points, which suggests an older age of the split among Apioceridae, Asilidae, and Mydidae (158 mya) and Apioceridae and Mydidae (135 mya) than proposed in the AToL FlyTree project. Future studies will be able to take advantage of the resources presented here in order to produce large scale phylogenomic and evolutionary studies of assassin fly phylogeny, life histories, or venom. The bioinformatics tools and workflow presented here will be useful to others wishing to generate de novo genomic resources in species-rich taxa without a closely-related reference genome.
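
    The abstract above reports assembly contiguity as N50 = 862 kbp. As a minimal sketch of how that statistic is conventionally defined (this is not the authors' workflow; the function name and the contig lengths are hypothetical), N50 is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths (hypothetical helper)."""
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First contig at which the cumulative length covers half the assembly
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical contig lengths, chosen only to illustrate the definition.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

    In the example the total length is 54, so half is 27; the three longest contigs (10, 9, 8) sum to exactly 27, which makes 8 the N50.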

  9. Genomic and transcriptomic resources for assassin flies including the complete genome sequence of Proctacanthus coquilletti (Insecta: Diptera: Asilidae) and 16 representative transcriptomes

    PubMed Central

    Frandsen, Paul B.; Turcatel, Mauren

    2017-01-01

    A high-quality draft genome for Proctacanthus coquilletti (Insecta: Diptera: Asilidae) is presented along with transcriptomes for 16 Diptera species from five families: Asilidae, Apioceridae, Bombyliidae, Mydidae, and Tabanidae. Genome sequencing reveals that P. coquilletti has a genome size of approximately 210 Mbp and remarkably low heterozygosity (0.47%) and few repeats (15%). These characteristics helped produce a highly contiguous (N50 = 862 kbp) assembly, particularly given that only a single 2 × 250 bp PCR-free Illumina library was sequenced. A phylogenomic hypothesis is presented based on thousands of putative orthologs across the 16 transcriptomes. Phylogenetic relationships support the sister group relationship of Apioceridae + Mydidae to Asilidae. A time-calibrated phylogeny is also presented, with seven fossil calibration points, which suggests an older age of the split among Apioceridae, Asilidae, and Mydidae (158 mya) and Apioceridae and Mydidae (135 mya) than proposed in the AToL FlyTree project. Future studies will be able to take advantage of the resources presented here in order to produce large scale phylogenomic and evolutionary studies of assassin fly phylogeny, life histories, or venom. The bioinformatics tools and workflow presented here will be useful to others wishing to generate de novo genomic resources in species-rich taxa without a closely-related reference genome. PMID:28168115

  10. Bacillus anthracis genome organization in light of whole transcriptome sequencing

    SciTech Connect

    Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.; Bergman, Nicholas; Borodovsky, Mark

    2010-03-22

    Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.

  11. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree

    PubMed Central

    Rangiah, Kannan; Mahesh, HB; Rajamani, Anantharamanan; Shirke, Meghana D.; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, BN

    2015-01-01

    Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of estimated size of neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. Neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. Ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in azadirachtin biosynthesis pathway. This study provides genomic and transcriptomic resources, along with quantification of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways. PMID:26290780

  12. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree.

    PubMed

    Kuravadi, Nagesh A; Yenagi, Vijay; Rangiah, Kannan; Mahesh, H B; Rajamani, Anantharamanan; Shirke, Meghana D; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, B N; Gowda, Malali

    2015-01-01

    Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC-600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of estimated size of neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. Neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. Ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in azadirachtin biosynthesis pathway. This study provides genomic and transcriptomic resources, along with quantification of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways.

  13. Status of duckweed genomics and transcriptomics.

    PubMed

    Wang, W; Messing, J

    2015-01-01

    Duckweeds belong to the smallest flowering plants that undergo fast vegetative growth in an aquatic environment. They are commonly used in wastewater treatment and animal feed. Whereas duckweeds have been studied at the biochemical level, their reduced morphology and wide environmental adaption had not been subjected to molecular analysis until recently. Here, we review the progress that has been made in using a DNA barcode system and the sequences of chloroplast and mitochondrial genomes to identify duckweed species at the species or population level. We also review analysis of the nuclear genome sequence of Spirodela that provides new insights into fundamental biological questions. Indeed, reduced gene families and missing genes are consistent with its compact morphogenesis, aquatic floating and suppression of juvenile-to-adult transition. Furthermore, deep RNA sequencing of Spirodela at the onset of dormancy and Landoltia in exposure of nutrient deficiency illustrate the molecular network for environmental adaption and stress response, constituting major progress towards a post-genome sequencing phase, where further functional genomic details can be explored. Rapid advances in sequencing technologies could continue to promote a proliferation of genome sequences for additional ecotypes as well as for other duckweed species. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.

  14. Genomics and transcriptomics in drug discovery.

    PubMed

    Dopazo, Joaquin

    2014-02-01

    The popularization of genomic high-throughput technologies is causing a revolution in biomedical research and, particularly, is transforming the field of drug discovery. Systems biology offers a framework to understand the extensive human genetic heterogeneity revealed by genomic sequencing in the context of the network of functional, regulatory and physical protein-drug interactions. Thus, approaches to find biomarkers and therapeutic targets will have to take into account the complex system nature of the relationships of the proteins with the disease. Pharmaceutical companies will have to reorient their drug discovery strategies considering the human genetic heterogeneity. Consequently, modeling and computational data analysis will have an increasingly important role in drug discovery.

  15. RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome

    PubMed Central

    2013-01-01

    Background: Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. Results: To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48’909 unique sequences including splice variants, representing approximately 24’450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10’597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11’270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. Conclusions: We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871

  16. RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome.

    PubMed

    Wenger, Yvan; Galliot, Brigitte

    2013-03-25

    Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.

  17. The past, present, and future of Leishmania genomics and transcriptomics

    PubMed Central

    Cantacessi, Cinzia; Dantas-Torres, Filipe; Nolan, Matthew J.; Otranto, Domenico

    2015-01-01

    It has been nearly 10 years since the completion of the first entire genome sequence of a Leishmania parasite. Genomic and transcriptomic analyses have advanced our understanding of the biology of Leishmania, and shed new light on the complex interactions occurring within the parasite–host–vector triangle. Here, we review these advances and examine potential avenues for translation of these discoveries into treatment and control programs. In addition, we argue for a strong need to explore how disease in dogs relates to that in humans, and how an improved understanding in line with the ‘One Health’ concept may open new avenues for the control of these devastating diseases. PMID:25638444

  18. The past, present, and future of Leishmania genomics and transcriptomics.

    PubMed

    Cantacessi, Cinzia; Dantas-Torres, Filipe; Nolan, Matthew J; Otranto, Domenico

    2015-03-01

    It has been nearly 10 years since the completion of the first entire genome sequence of a Leishmania parasite. Genomic and transcriptomic analyses have advanced our understanding of the biology of Leishmania, and shed new light on the complex interactions occurring within the parasite-host-vector triangle. Here, we review these advances and examine potential avenues for translation of these discoveries into treatment and control programs. In addition, we argue for a strong need to explore how disease in dogs relates to that in humans, and how an improved understanding in line with the 'One Health' concept may open new avenues for the control of these devastating diseases.

  19. Single Cell Genomics and Transcriptomics for Unicellular Eukaryotes

    SciTech Connect

    Ciobanu, Doina; Clum, Alicia; Singh, Vasanth; Salamov, Asaf; Han, James; Copeland, Alex; Grigoriev, Igor; James, Timothy; Singer, Steven; Woyke, Tanja; Malmstrom, Rex; Cheng, Jan-Fang

    2014-03-14

    Despite their small size, unicellular eukaryotes have complex genomes with a high degree of plasticity that allows them to adapt quickly to environmental changes. Unicellular eukaryotes live with prokaryotes and higher eukaryotes, frequently in symbiotic or parasitic niches. To this day their contribution to the dynamics of the environmental communities remains to be understood. Unfortunately, the vast majority of eukaryotic microorganisms are either uncultured or unculturable, making genome sequencing impossible using traditional approaches. We have developed an approach to isolate unicellular eukaryotes of interest from environmental samples, and to sequence and analyze their genomes and transcriptomes. We have tested our methods with six species: an uncharacterized protist from cellulose-enriched compost identified as Platyophrya, a close relative of P. vorax; the fungus Metschnikowia bicuspidata, a parasite of the water flea Daphnia; the mycoparasitic fungus Piptocephalis cylindrospora, a parasite of Cokeromyces and Mucor; Caulochytrium protosteloides, a parasite of Sordaria; Rozella allomycis, a parasite of the water mold Allomyces; and the microalga Chlamydomonas reinhardtii. Here, we present the four components of our approach: pre-sequencing methods, sequence analysis for single cell genome assembly, sequence analysis of single cell transcriptomes, and genome annotation. This technology has the potential to uncover the complexity of single cell eukaryotes and their role in environmental samples.

  20. Transcriptome and genome sequencing uncovers functional variation in humans.

    PubMed

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; 't Hoen, Peter A C; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk P J; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Angel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-09-26

    Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

  1. Introduction to Nematode Genome and Transcriptome Announcements in the Journal of Nematology.

    PubMed

    Denver, Dee R; Ragsdale, Erik J; Thomas, W Kelley; Zasada, Inga A

    2017-06-01

    The Journal of Nematology now offers publication of Nematode Genome Announcements (NGA) and Nematode Transcriptome Announcements (NTA). These brief reports announce the sequencing and assembly of a nematode genome or transcriptome resource, along with basic technical information on DNA sequencing and bioinformatic methods used. This publishing initiative offers a new avenue to openly and concisely communicate the availability and relevance of genome and transcriptome sequence resources to the broader scientific community.

  2. Bamboo Flowering from the Perspective of Comparative Genomics and Transcriptomics

    PubMed Central

    Biswas, Prasun; Chakraborty, Sukanya; Dutta, Smritikana; Pal, Amita; Das, Malay

    2016-01-01

    Bamboos are an important member of the subfamily Bambusoideae, family Poaceae. The plant group exhibits wide variation with respect to the timing (1–120 years) and nature (sporadic vs. gregarious) of flowering among species. Usually flowering in woody bamboos is synchronous across culms growing over a large area, known as gregarious flowering. In many monocarpic bamboos this is followed by mass death and seed setting. In sporadic flowering, by contrast, an isolated wild clump may flower, set little or no seed and remain alive. Such wide variation in flowering time and extent means that the plant group serves as a repository for genes and expression patterns that are unique to bamboo. Due to the dearth of available genomic and transcriptomic resources, limited studies have been undertaken to identify the potential molecular players in bamboo flowering. The public release of the first bamboo genome sequence, Phyllostachys heterocycla, and the availability of the related genomes of Brachypodium distachyon and Oryza sativa provide us the opportunity to study this long-standing biological problem in a comparative and functional genomics framework. We identified bamboo genes homologous to those of Oryza and Brachypodium that are involved in established pathways such as vernalization, photoperiod, autonomous, and hormonal regulation of flowering. Additionally, we investigated triggers like stress (drought), physiological maturity and micro RNAs that may play crucial roles in flowering. We also analyzed available transcriptome datasets of different bamboo species to identify genes and their involvement in bamboo flowering. Finally, we summarize potential research hurdles that need to be addressed in future research. PMID:28018419

  3. Bamboo Flowering from the Perspective of Comparative Genomics and Transcriptomics.

    PubMed

    Biswas, Prasun; Chakraborty, Sukanya; Dutta, Smritikana; Pal, Amita; Das, Malay

    2016-01-01

    Bamboos are an important member of the subfamily Bambusoideae, family Poaceae. The plant group exhibits wide variation with respect to the timing (1-120 years) and nature (sporadic vs. gregarious) of flowering among species. Usually flowering in woody bamboos is synchronous across culms growing over a large area, known as gregarious flowering. In many monocarpic bamboos this is followed by mass death and seed setting. In sporadic flowering, by contrast, an isolated wild clump may flower, set little or no seed and remain alive. Such wide variation in flowering time and extent means that the plant group serves as a repository for genes and expression patterns that are unique to bamboo. Due to the dearth of available genomic and transcriptomic resources, limited studies have been undertaken to identify the potential molecular players in bamboo flowering. The public release of the first bamboo genome sequence, Phyllostachys heterocycla, and the availability of the related genomes of Brachypodium distachyon and Oryza sativa provide us the opportunity to study this long-standing biological problem in a comparative and functional genomics framework. We identified bamboo genes homologous to those of Oryza and Brachypodium that are involved in established pathways such as vernalization, photoperiod, autonomous, and hormonal regulation of flowering. Additionally, we investigated triggers like stress (drought), physiological maturity and micro RNAs that may play crucial roles in flowering. We also analyzed available transcriptome datasets of different bamboo species to identify genes and their involvement in bamboo flowering. Finally, we summarize potential research hurdles that need to be addressed in future research.

  4. Metabonomic, transcriptomic, and genomic variation of a population cohort.

    PubMed

    Inouye, Michael; Kettunen, Johannes; Soininen, Pasi; Silander, Kaisa; Ripatti, Samuli; Kumpula, Linda S; Hämäläinen, Eija; Jousilahti, Pekka; Kangas, Antti J; Männistö, Satu; Savolainen, Markku J; Jula, Antti; Leiviskä, Jaana; Palotie, Aarno; Salomaa, Veikko; Perola, Markus; Ala-Korpela, Mika; Peltonen, Leena

    2010-12-21

    Comprehensive characterization of human tissues promises novel insights into the biological architecture of human diseases and traits. We assessed metabonomic, transcriptomic, and genomic variation for a large population-based cohort from the capital region of Finland. Network analyses identified a set of highly correlated genes, the lipid-leukocyte (LL) module, as having a prominent role in over 80 serum metabolites (of 134 measures quantified), including lipoprotein subclasses, lipids, and amino acids. Concurrent association with immune response markers suggested the LL module as a possible link between inflammation, metabolism, and adiposity. Further, genomic variation was used to generate a directed network and infer LL module's largely reactive nature to metabolites. Finally, gene co-expression in circulating leukocytes was shown to be dependent on serum metabolite concentrations, providing evidence for the hypothesis that the coherence of molecular networks themselves is conditional on environmental factors. These findings show the importance and opportunity of systematic molecular investigation of human population samples. To facilitate and encourage this investigation, the metabonomic, transcriptomic, and genomic data used in this study have been made available as a resource for the research community.

  5. Transcriptome and Genome Size Analysis of the Venus Flytrap

    PubMed Central

    Bressendorff, Simon; Seguin-Orlando, Andaine; Petersen, Morten; Sicheritz-Pontén, Thomas; Mundy, John

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin’s studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D. muscipula flowers and traps. Using the Oases transcriptome assembler, 79,165,657 quality-trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, assigned to Gene Ontology (GO) and classified into functional categories. A total of 15,547 full-length cDNA sequences were identified, from which open reading frames were detected in 10,941. Comparative GO analyses revealed that D. muscipula is highly represented in molecular functions related to catalytic, antioxidant, and electron carrier activities. Also, using a single copy sequence PCR-based method, we estimated that the genome size of D. muscipula is approx. 3 Gb. Our genome size estimate and transcriptome analyses will contribute to future research on this fascinating, monotypic species and its heterotrophic adaptations. PMID:25886597

  6. Transcriptome and genome size analysis of the Venus flytrap.

    PubMed

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon; Seguin-Orlando, Andaine; Petersen, Morten; Sicheritz-Pontén, Thomas; Mundy, John

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D. muscipula flowers and traps. Using the Oases transcriptome assembler, 79,165,657 quality-trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, assigned to Gene Ontology (GO) and classified into functional categories. A total of 15,547 full-length cDNA sequences were identified, from which open reading frames were detected in 10,941. Comparative GO analyses revealed that D. muscipula is highly represented in molecular functions related to catalytic, antioxidant, and electron carrier activities. Also, using a single copy sequence PCR-based method, we estimated that the genome size of D. muscipula is approx. 3 Gb. Our genome size estimate and transcriptome analyses will contribute to future research on this fascinating, monotypic species and its heterotrophic adaptations.
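
    The N50 length quoted in the abstract above is a standard assembly statistic: the contig length at which contigs of that length or longer hold at least half of the assembled bases. The following is a minimal Python sketch of that definition only. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study or from the Oases assembler.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Hypothetical helper for illustration: contigs are sorted from longest
    to shortest, and the N50 is the length of the contig at which the
    running total first reaches half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (made up, not data from the paper).
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))
```

    With the toy input, the total is 1,500 bases, so the running sum first reaches 750 at the 400 bp contig. N50 is weighted toward long contigs, which is why it can exceed the average contig length, as it does in the abstract.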

  7. The draft genome and transcriptome of Cannabis sativa

    PubMed Central

    2011-01-01

    Background: Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored. Results: We sequenced genomic DNA and RNA from the marijuana strain Purple Kush using short-read approaches. We report a draft haploid genome sequence of 534 Mb and a transcriptome of 30,000 genes. Comparison of the transcriptome of Purple Kush with that of the hemp cultivar 'Finola' revealed that many genes encoding proteins involved in cannabinoid and precursor pathways are more highly expressed in Purple Kush than in 'Finola'. The exclusive occurrence of Δ9-tetrahydrocannabinolic acid synthase in the Purple Kush transcriptome, and its replacement by cannabidiolic acid synthase in 'Finola', may explain why the psychoactive cannabinoid Δ9-tetrahydrocannabinol (THC) is produced in marijuana but not in hemp. Resequencing the hemp cultivars 'Finola' and 'USO-31' showed little difference in gene copy numbers of cannabinoid pathway enzymes. However, single nucleotide variant analysis uncovered a relatively high level of variation among four cannabis types, and supported a separation of marijuana and hemp. Conclusions: The availability of the Cannabis sativa genome enables the study of a multifunctional plant that occupies a unique role in human culture. Its availability will aid the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics. PMID:22014239

  8. The draft genome and transcriptome of Cannabis sativa.

    PubMed

    van Bakel, Harm; Stout, Jake M; Cote, Atina G; Tallon, Carling M; Sharpe, Andrew G; Hughes, Timothy R; Page, Jonathan E

    2011-10-20

    Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored. We sequenced genomic DNA and RNA from the marijuana strain Purple Kush using short-read approaches. We report a draft haploid genome sequence of 534 Mb and a transcriptome of 30,000 genes. Comparison of the transcriptome of Purple Kush with that of the hemp cultivar 'Finola' revealed that many genes encoding proteins involved in cannabinoid and precursor pathways are more highly expressed in Purple Kush than in 'Finola'. The exclusive occurrence of Δ9-tetrahydrocannabinolic acid synthase in the Purple Kush transcriptome, and its replacement by cannabidiolic acid synthase in 'Finola', may explain why the psychoactive cannabinoid Δ9-tetrahydrocannabinol (THC) is produced in marijuana but not in hemp. Resequencing the hemp cultivars 'Finola' and 'USO-31' showed little difference in gene copy numbers of cannabinoid pathway enzymes. However, single nucleotide variant analysis uncovered a relatively high level of variation among four cannabis types, and supported a separation of marijuana and hemp. The availability of the Cannabis sativa genome enables the study of a multifunctional plant that occupies a unique role in human culture. Its availability will aid the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics.

  9. Genomics and transcriptomics across the diversity of the Nematoda.

    PubMed

    Blaxter, M; Kumar, S; Kaur, G; Koutsovoulos, G; Elsworth, B

    2012-01-01

    The diversity of biology in nematodes is reflected in the diversity of their genomes. Parasitic species in particular have evolved mechanisms to invade and outwit their hosts, and these offer opportunities for the development of control measures. Genomic analyses can reveal the molecular underpinnings of phenotypes such as parasitism and thus initiate and support research programmes that explore the manipulation of host and parasite physiologies to achieve favourable outcomes. Wide sampling across nematode diversity allows phylogenetically informed formulation of research hypotheses, identification of core features shared by all species or important evolutionary novelties present in isolated clades. Many nematode species have been investigated through the use of the expressed sequence tag approach, which samples from the transcribed genome. Gene catalogues generated in this way can be explored to reveal the patterns of expression associated with parasitism and candidates for testing as drug targets or vaccine components. Analysis environments, such as NEMBASE, facilitate exploitation of these data. The development of new high-throughput DNA-sequencing technologies has facilitated transcriptomic and genomic approaches to parasite biology. Whole genome sequencing offers more complete catalogues of genes and assists a systems approach to phenotype dissection. These efforts are being coordinated through the 959 Nematode Genomes initiative. © 2011 Blackwell Publishing Ltd.

  10. Comparative genomics and transcriptomics of trait-gene association

    PubMed Central

    2012-01-01

    Background: The Order Rickettsiales includes important tick-borne pathogens, from Rickettsia rickettsii, which causes Rocky Mountain spotted fever, to Anaplasma marginale, the most prevalent vector-borne pathogen of cattle. Although most pathogens in this Order are transmitted by arthropod vectors, little is known about the microbial determinants of transmission. A. marginale provides unique tools for studying the determinants of transmission, with multiple strain sequences available that display distinct and reproducible transmission phenotypes. The closed core A. marginale genome suggests that any phenotypic differences are due to single nucleotide polymorphisms (SNPs). We combined DNA/RNA comparative genomic approaches using strains with different tick transmission phenotypes and identified genes that segregate with transmissibility. Results: Comparison of seven strains with different transmission phenotypes generated a list of SNPs affecting 18 genes and nine promoters. Transcriptional analysis found two candidate genes downstream from promoter SNPs that were differentially transcribed. To corroborate the comparative genomics approach we used three RNA-seq platforms to analyze the transcriptomes from two A. marginale strains with different transmission phenotypes. RNA-seq analysis confirmed the comparative genomics data and found 10 additional genes whose transcription between strains with distinct transmission efficiencies was significantly different. Six regions of the genome that contained no annotation were found to be transcriptionally active, and two of these newly identified transcripts were differentially transcribed. Conclusions: This approach identified 30 genes and two novel transcripts potentially involved in tick transmission. We describe the transcriptome of an obligate intracellular bacterium in depth, while employing massive parallel sequencing to dissect an important trait in bacterial pathogenesis. PMID:23181781

  11. Cajal body function in genome organization and transcriptome diversity.

    PubMed

    Sawyer, Iain A; Sturgill, David; Sung, Myong-Hee; Hager, Gordon L; Dundr, Miroslav

    2016-12-01

    Nuclear bodies contribute to non-random organization of the human genome and nuclear function. Using a major prototypical nuclear body, the Cajal body, as an example, we suggest that these structures assemble at specific gene loci located across the genome as a result of high transcriptional activity. Subsequently, target genes are physically clustered in close proximity in Cajal body-containing cells. However, Cajal bodies are observed in only a limited number of human cell types, including neuronal and cancer cells. Ultimately, Cajal body depletion perturbs splicing kinetics by reducing target small nuclear RNA (snRNA) transcription and limiting the levels of spliceosomal snRNPs, including their modification and turnover following each round of RNA splicing. As such, Cajal bodies are capable of shaping the chromatin interaction landscape and the transcriptome by influencing spliceosome kinetics. Future studies should concentrate on characterizing the direct influence of Cajal bodies upon snRNA gene transcriptional dynamics.

  12. The capsicum transcriptome DB: a "hot" tool for genomic research.

    PubMed

    Góngora-Castillo, Elsa; Fajardo-Jaime, Rubén; Fernández-Cortes, Araceli; Jofre-Garfias, Alba E; Lozoya-Gloria, Edmundo; Martínez, Octavio; Ochoa-Alejo, Neftalí; Rivera-Bustamante, Rafael

    2012-01-01

    Chili pepper (Capsicum annuum) is an economically important crop with no available public genome sequence. We describe a genomic resource to facilitate Capsicum annuum research. A collection of Expressed Sequence Tags (ESTs) derived from five C. annuum organs (root, stem, leaf, flower and fruit) was sequenced using the Sanger method, and multiple leaf transcriptomes were deeply sampled using GS-pyrosequencing. A hybrid assembly of 1,324,516 raw reads yielded 32,314 high quality contigs as validated by coverage and identity analysis with existing pepper sequences. Overall, 75.5% of the contigs had significant sequence similarity to entries in nucleic acid and protein databases; 23% of the sequences have not been previously reported for C. annuum and expand sequence resources for this species. A MySQL database and a user-friendly Web interface were constructed with search tools that permit queries of the ESTs including sequence, functional annotation, Gene Ontology classification, metabolic pathways, and assembly information. The Capsicum Transcriptome DB is freely available from http://www.bioingenios.ira.cinvestav.mx:81/Joomla/

  13. Computational analysis of conserved RNA secondary structure in transcriptomes and genomes

    PubMed Central

    Eddy, Sean R.

    2017-01-01

    Transcriptomics experiments and computational predictions both enable systematic discovery of new functional RNAs, but many putative noncoding transcripts arise instead from artifacts and biological noise, and current computational prediction methods have high false positive rates. I discuss prospects for improving computational methods for analyzing and identifying functional RNAs, with a focus on detecting signatures of conserved RNA secondary structure. An interesting new front is the application of chemical and enzymatic RNA structure probing experiments on a transcriptome-wide scale. I review several proposed approaches for incorporating structure probing data into computational RNA secondary structure prediction. Using probabilistic inference formalisms, I show how all these approaches can be unified in a well-principled framework. Using that framework, RNA probing data can easily be integrated into a wide range of different analyses that depend on RNA secondary structure inference, including homology search and genome-wide detection of new structural RNAs. PMID:24895857

  14. Genome sequence and transcriptome analyses of the thermophilic zygomycete fungus Rhizomucor miehei.

    PubMed

    Zhou, Peng; Zhang, Guoqiang; Chen, Shangwu; Jiang, Zhengqiang; Tang, Yanbin; Henrissat, Bernard; Yan, Qiaojuan; Yang, Shaoqing; Chen, Chin-Fu; Zhang, Bing; Du, Zhenglin

    2014-04-21

    Zygomycete fungi such as Rhizomucor miehei have been extensively exploited for the production of various enzymes. As a thermophilic fungus, R. miehei is capable of growing at temperatures that approach the upper limits for all eukaryotes. To date, hundreds of fungal genomes are publicly available. However, Zygomycetes have rarely been investigated either genetically or genomically. Here, we report the genome of R. miehei CAU432 to explore the thermostable enzymatic repertoire of this fungus. The assembled genome size is 27.6 million bases (Mb) with 10,345 predicted protein-coding genes. Although the fungus is thermophilic, the G + C contents of its whole genome (43.8%) and coding genes (47.4%) are less than 50%. Phylogenetically, R. miehei is more closely related to Phycomyces blakesleeanus than to Mucor circinelloides and Rhizopus oryzae. The genome of R. miehei harbors a large number of genes encoding secreted proteases, which is consistent with the characteristics of R. miehei being a rich producer of proteases. The transcriptome profile of R. miehei showed that the genes responsible for degrading starch, glucan, protein and lipid were highly expressed. The genome information of R. miehei will facilitate future studies to better understand the mechanisms of fungal thermophilic adaptation and to explore the potential of R. miehei in industrial-scale production of thermostable enzymes. Based on the existence of a large repertoire of amylolytic, proteolytic and lipolytic genes in the genome, R. miehei has potential in the production of a variety of such enzymes.

  15. LegumeIP: an integrative database for comparative genomics and transcriptomics of model legumes.

    PubMed

    Li, Jun; Dai, Xinbin; Liu, Tingsong; Zhao, Patrick Xuechun

    2012-01-01

    Legumes play a vital role in maintaining the nitrogen cycle of the biosphere. They conduct symbiotic nitrogen fixation through endosymbiotic relationships with bacteria in root nodules. However, this and other characteristics of legumes, including mycorrhization, compound leaf development and profuse secondary metabolism, are absent in the typical model plant Arabidopsis thaliana. We present LegumeIP (http://plantgrn.noble.org/LegumeIP/), an integrative database for comparative genomics and transcriptomics of model legumes, for studying gene function and genome evolution in legumes. LegumeIP compiles gene and gene family information, syntenic and phylogenetic context and tissue-specific transcriptomic profiles. The database holds the genomic sequences of three model legumes, Medicago truncatula, Glycine max and Lotus japonicus plus two reference plant species, A. thaliana and Populus trichocarpa, with annotations based on UniProt, InterProScan, Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes databases. LegumeIP also contains large-scale microarray and RNA-Seq-based gene expression data. Our new database is capable of systematic synteny analysis across M. truncatula, G. max, L. japonicus and A. thaliana, as well as construction and phylogenetic analysis of gene families across the five hosted species. Finally, LegumeIP provides comprehensive search and visualization tools that enable flexible queries based on gene annotation, gene family, synteny and relative gene expression.

  16. Comparative Genomics and Transcriptomics Analyses Reveal Divergent Lifestyle Features of Nematode Endoparasitic Fungus Hirsutella minnesotensis

    PubMed Central

    Lai, Yiling; Liu, Keke; Zhang, Xinyu; Zhang, Xiaoling; Li, Kuan; Wang, Niuniu; Shu, Chi; Wu, Yunpeng; Wang, Chengshu; Bushley, Kathryn E.; Xiang, Meichun; Liu, Xingzhong

    2014-01-01

    Hirsutella minnesotensis [Ophiocordycipitaceae (Hypocreales, Ascomycota)] is a dominant endoparasitic fungus whose conidia adhere to and penetrate the second-stage juveniles of the soybean cyst nematode. Its genome was de novo sequenced and compared with five entomopathogenic fungi in the Hypocreales and three nematode-trapping fungi in the Orbiliales (Ascomycota). The genome of H. minnesotensis is 51.4 Mb, encodes 12,702 genes, and is enriched with transposable elements (up to 32%). Phylogenomic analysis revealed that H. minnesotensis diverged from entomopathogenic fungi in Hypocreales. Like the genomes of entomopathogenic fungi, the genome of H. minnesotensis has fewer genes encoding lectins for adhesion and glycoside hydrolases for cellulose degradation; unlike those of nematode-trapping fungi, it possesses more genes for protein degradation, signal transduction, and secondary metabolism. These results indicate that H. minnesotensis has evolved a different mechanism for nematode endoparasitism compared with nematode-trapping fungi. Transcriptomic analyses over the time course of parasitism revealed the upregulation of lectins, secreted proteases and the genes for biosynthesis of secondary metabolites that could be putatively involved in host surface adhesion, cuticle degradation, and host manipulation. Genome and transcriptome analyses provided a comprehensive understanding of the evolution and lifestyle of nematode endoparasitism. PMID:25359922

  17. Transcriptome-guided characterization of genomic rearrangements in a breast cancer cell line.

    PubMed

    Zhao, Qi; Caballero, Otavia L; Levy, Samuel; Stevenson, Brian J; Iseli, Christian; de Souza, Sandro J; Galante, Pedro A; Busam, Dana; Leversha, Margaret A; Chadalavada, Kalyani; Rogers, Yu-Hui; Venter, J Craig; Simpson, Andrew J G; Strausberg, Robert L

    2009-02-10

    We have identified new genomic alterations in the breast cancer cell line HCC1954, using high-throughput transcriptome sequencing. With 120 Mb of cDNA sequences, we were able to identify genomic rearrangement events leading to fusions or truncations of genes including MRE11 and NSD1, genes already implicated in oncogenesis, and 7 rearrangements involving other additional genes. This approach demonstrates that high-throughput transcriptome sequencing is an effective strategy for the characterization of genomic rearrangements in cancers.

  18. Genome-wide transcriptome analysis of human epidermal melanocytes

    PubMed Central

    Haltaufderhyde, Kirk D.; Oancea, Elena

    2015-01-01

    Because human epidermal melanocytes (HEMs) provide critical protection against skin cancer, sunburn, and photoaging, a genome-wide perspective of gene expression in these cells is vital to understanding human skin physiology. In this study we performed high throughput sequencing of HEMs to obtain a complete data set of transcript sizes, abundances, and splicing. As expected, we found that melanocyte specific genes that function in pigmentation were among the highest expressed genes. We analyzed receptor, ion channel and transcription factor gene families to get a better understanding of the cell signalling pathways used by melanocytes. We also performed a comparative transcriptomic analysis of lightly versus darkly pigmented HEMs and found 16 genes differentially expressed in the two pigmentation phenotypes; of those, only one putative melanosomal transporter (SLC45A2) has known function in pigmentation. In addition, we found 166 genes with splice isoforms expressed exclusively in one pigmentation phenotype, 17 of which are genes involved in signal transduction. Our melanocyte transcriptome study provides a comprehensive view and may help identify novel pigmentation genes and potential pharmacological targets. PMID:25451175

  19. TraV: A Genome Context Sensitive Transcriptome Browser

    PubMed Central

    Dietrich, Sascha; Wiegand, Sandra; Liesegang, Heiko

    2014-01-01

    Next-generation sequencing (NGS) technologies like Illumina and ABI SOLiD enable the investigation of transcriptional activities of genomes. While read mapping tools have been continually improved to enable the processing of the increasing number of reads generated by NGS technologies, analysis and visualization tools are struggling with the amount of data they are presented with. Current tools are capable of handling at most two to three datasets simultaneously before they are limited by available memory or by processing overhead. In order to process fifteen transcriptome sequencing experiments of Bacillus licheniformis DSM13 obtained in a previous study, we developed TraV, an RNA-Seq analysis and visualization tool. The analytical methods are designed for prokaryotic RNA-seq experiments. TraV calculates single nucleotide activities from the mapping information to visualize and analyze multiple transcriptome sequencing experiments. The use of nucleotide activities instead of single read mapping information is highly memory efficient without incurring a processing overhead. TraV is available at http://appmibio.uni-goettingen.de/index.php?sec=serv. PMID:24709941

  20. TraV: a genome context sensitive transcriptome browser.

    PubMed

    Dietrich, Sascha; Wiegand, Sandra; Liesegang, Heiko

    2014-01-01

    Next-generation sequencing (NGS) technologies like Illumina and ABI SOLiD enable the investigation of transcriptional activities of genomes. While read mapping tools have been continually improved to enable the processing of the increasing number of reads generated by NGS technologies, analysis and visualization tools are struggling with the amount of data they are presented with. Current tools are capable of handling at most two to three datasets simultaneously before they are limited by available memory or by processing overhead. In order to process fifteen transcriptome sequencing experiments of Bacillus licheniformis DSM13 obtained in a previous study, we developed TraV, an RNA-Seq analysis and visualization tool. The analytical methods are designed for prokaryotic RNA-seq experiments. TraV calculates single nucleotide activities from the mapping information to visualize and analyze multiple transcriptome sequencing experiments. The use of nucleotide activities instead of single read mapping information is highly memory efficient without incurring a processing overhead. TraV is available at http://appmibio.uni-goettingen.de/index.php?sec=serv.
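
    The abstract above does not say how TraV computes its single nucleotide activities. As a rough illustration of the general idea only, the Python sketch below reduces a set of mapped reads to one depth value per genome position, so that memory use depends on genome length rather than on the number of reads. The function name `per_base_coverage`, the 0-based half-open read coordinates, and the toy reads are all assumptions made for this example and do not describe TraV's actual implementation.

```python
def per_base_coverage(genome_length, reads):
    """Return the read depth at every position of a linear genome.

    Hypothetical helper for illustration. `reads` is an iterable of
    (start, end) pairs in 0-based, half-open coordinates. A difference
    array records +1 where a read starts and -1 where it ends; a single
    cumulative pass then yields the depth at each base.
    """
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        if not 0 <= start < end <= genome_length:
            raise ValueError(f"read ({start}, {end}) is outside the genome")
        diff[start] += 1
        diff[end] -= 1

    coverage = []
    depth = 0
    for position in range(genome_length):
        depth += diff[position]
        coverage.append(depth)
    return coverage


# Two toy reads on a 6-base genome (made up for illustration).
print(per_base_coverage(6, [(0, 3), (2, 5)]))
```

    For the toy input, the two reads overlap only at position 2, so the depth there is 2 and drops to 0 at the last base. Once the per-base array is built, the individual reads no longer need to be kept in memory, which is the kind of saving the abstract attributes to nucleotide activities.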

  1. Transcriptome-wide investigation of genomic imprinting in chicken.

    PubMed

    Frésard, Laure; Leroux, Sophie; Servin, Bertrand; Gourichon, David; Dehais, Patrice; Cristobal, Magali San; Marsaud, Nathalie; Vignoles, Florence; Bed'hom, Bertrand; Coville, Jean-Luc; Hormozdiari, Farhad; Beaumont, Catherine; Zerjal, Tatiana; Vignal, Alain; Morisson, Mireille; Lagarrigue, Sandrine; Pitel, Frédérique

    2014-04-01

    Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken and did not demonstrate any evidence of imprinting in this species. However, several published observations such as imprinted-like QTL in poultry or reciprocal effects keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting from mouse data already published. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results come together, without a priori, with previous statements and phylogenetic considerations assessing the absence of genomic imprinting in chicken.

  2. Transcriptome-wide investigation of genomic imprinting in chicken

    PubMed Central

    Frésard, Laure; Leroux, Sophie; Servin, Bertrand; Gourichon, David; Dehais, Patrice; Cristobal, Magali San; Marsaud, Nathalie; Vignoles, Florence; Bed'hom, Bertrand; Coville, Jean-Luc; Hormozdiari, Farhad; Beaumont, Catherine; Zerjal, Tatiana; Vignal, Alain; Morisson, Mireille; Lagarrigue, Sandrine; Pitel, Frédérique

    2014-01-01

    Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken and did not demonstrate any evidence of imprinting in this species. However, several published observations such as imprinted-like QTL in poultry or reciprocal effects keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting from mouse data already published. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results come together, without a priori, with previous statements and phylogenetic considerations assessing the absence of genomic imprinting in chicken. PMID:24452801

  3. Comprehensive Annotation of the Parastagonospora nodorum Reference Genome Using Next-Generation Genomics, Transcriptomics and Proteogenomics.

    PubMed

    Syme, Robert A; Tan, Kar-Chun; Hane, James K; Dodhia, Kejal; Stoll, Thomas; Hastie, Marcus; Furuki, Eiko; Ellwood, Simon R; Williams, Angela H; Tan, Yew-Foon; Testa, Alison C; Gorman, Jeffrey J; Oliver, Richard P

    2016-01-01

    Parastagonospora nodorum, the causal agent of Septoria nodorum blotch (SNB), is an economically important pathogen of wheat (Triticum spp.), and a model for the study of necrotrophic pathology and genome evolution. The reference P. nodorum strain SN15 was the first Dothideomycete with a published genome sequence, and has been used as the basis for comparison within and between species. Here we present an updated reference genome assembly with corrections of SNP and indel errors in the underlying genome assembly from deep resequencing data as well as extensive manual annotation of gene models using transcriptomic and proteomic sources of evidence (https://github.com/robsyme/Parastagonospora_nodorum_SN15). The updated assembly and annotation includes 8,366 genes with modified protein sequence and 866 new genes. This study shows the benefits of using a wide variety of experimental methods allied to expert curation to generate a reliable set of gene models.

  4. IMPROVED PERFORMANCE OF GENE SET ANALYSIS ON GENOME-WIDE TRANSCRIPTOMICS DATA WHEN USING GENE ACTIVITY STATE ESTIMATES.

    PubMed

    Kamp, Thomas; Adams, Micah; Disselkoen, Craig; Tintle, Nathan

    2016-01-01

    Gene set analysis methods continue to be a popular and powerful method of evaluating genome-wide transcriptomics data. These approaches require a priori grouping of genes into biologically meaningful sets, and then conducting downstream analyses at the set (instead of gene) level of analysis. Gene set analysis methods have been shown to yield more powerful statistical conclusions than single-gene analyses due to both reduced multiple testing penalties and potentially larger observed effects due to the aggregation of effects across multiple genes in the set. Traditionally, gene set analysis methods have been applied directly to normalized, log-transformed, transcriptomics data. Recently, efforts have been made to transform transcriptomics data to scales yielding more biologically interpretable results. For example, recently proposed models transform log-transformed transcriptomics data to a confidence metric (ranging between 0 and 100%) that a gene is active (roughly speaking, that the gene product is part of an active cellular mechanism). In this manuscript, we demonstrate, on both real and simulated transcriptomics data, that tests for differential expression between sets of genes are typically more powerful when using gene activity state estimates as opposed to log-transformed gene expression data. Our analysis suggests further exploration of techniques to transform transcriptomics data to meaningful quantities for improved downstream inference.

  5. A comparative genomics approach to identifying the plasticity transcriptome

    PubMed Central

    Pfenning, Andreas R; Schwartz, Russell; Barth, Alison L

    2007-01-01

    Background Neuronal activity regulates gene expression to control learning and memory, homeostasis of neuronal function, and pathological disease states such as epilepsy. A great deal of experimental evidence supports the involvement of two particular transcription factors in shaping the genomic response to neuronal activity and mediating plasticity: CREB and zif268 (egr-1, krox24, NGFI-A). The gene targets of these two transcription factors are of considerable interest, since they may help develop hypotheses about how neural activity is coupled to changes in neural function. Results We have developed a computational approach for identifying binding sites for these transcription factors within the promoter regions of annotated genes in the mouse, rat, and human genomes. By combining a robust search algorithm to identify discrete binding sites, a comparison of targets across species, and an analysis of binding site locations within promoter regions, we have defined a group of candidate genes that are strong CREB- or zif268 targets and are thus regulated by neural activity. Our analysis revealed that CREB and zif268 share a disproportionate number of targets in common and that these common targets are dominated by transcription factors. Conclusion These observations may enable a more detailed understanding of the regulatory networks that are induced by neural activity and contribute to the plasticity transcriptome. The target genes identified in this study will be a valuable resource for investigators who hope to define the functions of specific genes that underlie activity-dependent changes in neuronal properties. PMID:17355637

  6. Transcriptome profiling reveals mosaic genomic origins of modern cultivated barley

    PubMed Central

    Dai, Fei; Chen, Zhong-Hua; Wang, Xiaolei; Li, Zefeng; Jin, Gulei; Wu, Dezhi; Cai, Shengguan; Wang, Ning; Wu, Feibo; Nevo, Eviatar; Zhang, Guoping

    2014-01-01

    The domestication of cultivated barley has been used as a model system for studying the origins and early spread of agrarian culture. Our previous results indicated that the Tibetan Plateau and its vicinity is one of the centers of domestication of cultivated barley. Here we reveal multiple origins of domesticated barley using transcriptome profiling of cultivated and wild-barley genotypes. Approximately 48-Gb of clean transcript sequences in 12 Hordeum spontaneum and 9 Hordeum vulgare accessions were generated. We reported 12,530 de novo assembled transcripts in all of the 21 samples. Population structure analysis showed that Tibetan hulless barley (qingke) might have existed in the early stage of domestication. Based on the large number of unique genomic regions showing the similarity between cultivated and wild-barley groups, we propose that the genomic origin of modern cultivated barley is derived from wild-barley genotypes in the Fertile Crescent (mainly in chromosomes 1H, 2H, and 3H) and Tibet (mainly in chromosomes 4H, 5H, 6H, and 7H). This study indicates that the domestication of barley may have occurred over time in geographically distinct regions. PMID:25197090

  7. Consequences of Normalizing Transcriptomic and Genomic Libraries of Plant Genomes Using a Duplex-Specific Nuclease and Tetramethylammonium Chloride

    PubMed Central

    Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

    2013-01-01

    Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce. PMID:23409088

  8. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708

  9. INTEGRATE: gene fusion discovery using whole genome and transcriptome data.

    PubMed

    Zhang, Jin; White, Nicole M; Schmidt, Heather K; Fulton, Robert S; Tomlinson, Chad; Warren, Wesley C; Wilson, Richard K; Maher, Christopher A

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use.
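
    The counts reported above can be checked with a short calculation. This is only a sketch of the arithmetic: it treats the 138 validated fusions as the full set of true positives, which is an assumption, and the variable names are illustrative.

```python
# Counts as reported in the abstract.
novel_fusions = 131
previously_reported = 7
missed_by_integrate = 6

validated = novel_fusions + previously_reported   # 138 validated fusions
detected = validated - missed_by_integrate        # fusions INTEGRATE recovered

# Fraction of validated fusions recovered (sensitivity relative to this set).
sensitivity = detected / validated

print(validated, detected, round(sensitivity, 3))  # 138 132 0.957
```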

  10. Reference genomes and transcriptomes of Nicotiana sylvestris and Nicotiana tomentosiformis

    PubMed Central

    2013-01-01

    Background Nicotiana sylvestris and Nicotiana tomentosiformis are members of the Solanaceae family that includes tomato, potato, eggplant and pepper. These two Nicotiana species originate from South America and exhibit different alkaloid and diterpenoid production. N. sylvestris is cultivated largely as an ornamental plant and it has been used as a diploid model system for studies of terpenoid production, plastid engineering, and resistance to biotic and abiotic stress. N. sylvestris and N. tomentosiformis are considered to be modern descendants of the maternal and paternal donors that formed Nicotiana tabacum about 200,000 years ago through interspecific hybridization. Here we report the first genome-wide analysis of these two Nicotiana species. Results Draft genomes of N. sylvestris and N. tomentosiformis were assembled to 82.9% and 71.6% of their expected size respectively, with N50 sizes of about 80 kb. The repeat content was 72-75%, with a higher proportion of retrotransposons and copia-like long terminal repeats in N. tomentosiformis. The transcriptome assemblies showed that 44,000-53,000 transcripts were expressed in the roots, leaves or flowers. The key genes involved in terpenoid metabolism, alkaloid metabolism and heavy metal transport showed differential expression in the leaves, roots and flowers of N. sylvestris and N. tomentosiformis. Conclusions The reference genomes of N. sylvestris and N. tomentosiformis represent a significant contribution to the SOL100 initiative because, as members of the Nicotiana genus of Solanaceae, they strengthen the value of the already existing resources by providing additional comparative information, thereby helping to improve our understanding of plant metabolism and evolution. PMID:23773524

  11. Reference genomes and transcriptomes of Nicotiana sylvestris and Nicotiana tomentosiformis.

    PubMed

    Sierro, Nicolas; Battey, James N D; Ouadi, Sonia; Bovet, Lucien; Goepfert, Simon; Bakaher, Nicolas; Peitsch, Manuel C; Ivanov, Nikolai V

    2013-06-17

    Nicotiana sylvestris and Nicotiana tomentosiformis are members of the Solanaceae family that includes tomato, potato, eggplant and pepper. These two Nicotiana species originate from South America and exhibit different alkaloid and diterpenoid production. N. sylvestris is cultivated largely as an ornamental plant and it has been used as a diploid model system for studies of terpenoid production, plastid engineering, and resistance to biotic and abiotic stress. N. sylvestris and N. tomentosiformis are considered to be modern descendants of the maternal and paternal donors that formed Nicotiana tabacum about 200,000 years ago through interspecific hybridization. Here we report the first genome-wide analysis of these two Nicotiana species. Draft genomes of N. sylvestris and N. tomentosiformis were assembled to 82.9% and 71.6% of their expected size respectively, with N50 sizes of about 80 kb. The repeat content was 72-75%, with a higher proportion of retrotransposons and copia-like long terminal repeats in N. tomentosiformis. The transcriptome assemblies showed that 44,000-53,000 transcripts were expressed in the roots, leaves or flowers. The key genes involved in terpenoid metabolism, alkaloid metabolism and heavy metal transport showed differential expression in the leaves, roots and flowers of N. sylvestris and N. tomentosiformis. The reference genomes of N. sylvestris and N. tomentosiformis represent a significant contribution to the SOL100 initiative because, as members of the Nicotiana genus of Solanaceae, they strengthen the value of the already existing resources by providing additional comparative information, thereby helping to improve our understanding of plant metabolism and evolution.
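
    The N50 statistic quoted for these assemblies is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that definition follows; the function name and the toy contig lengths are hypothetical and unrelated to the Nicotiana data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one sequence length")

    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for safety


toy_contigs = [100, 80, 60, 40, 20]   # total 300, half is 150
print(n50(toy_contigs))               # 80, since 100 + 80 = 180 >= 150
```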

  12. Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants

    PubMed Central

    2010-01-01

    Background Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. Results The xylem transcriptome is highly conserved in conifers, but considerably divergent in angiosperms. The functional domains of genes in the xylem transcriptome are moderately to highly conserved in vascular plants, suggesting the existence of a common ancestral xylem transcriptome. Compared to the total transcriptome derived from a range of tissues, the xylem transcriptome is relatively conserved in vascular plants. Of the xylem transcriptome, cell wall genes, ancestral xylem genes, known proteins and transcription factors are relatively more conserved in vascular plants. A total of 527 putative xylem orthologs were identified, which are unevenly distributed across the Arabidopsis chromosomes with eight hot spots observed. Phylogenetic analysis revealed that evolution of the xylem transcriptome has paralleled plant evolution. We also identified 274 conifer-specific xylem unigenes, all of which are of unknown function. These xylem orthologs and conifer-specific unigenes are likely to have played a crucial role in xylem evolution. Conclusions Conifers have highly conserved xylem transcriptomes, while angiosperm xylem transcriptomes are relatively diversified. Vascular plants share a common ancestral xylem transcriptome. The xylem transcriptomes of vascular plants are more conserved than the total transcriptomes. Evolution of the xylem transcriptome has largely followed the trend of plant evolution. PMID:20565927

  13. Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants.

    PubMed

    Li, Xinguo; Wu, Harry X; Southerton, Simon G

    2010-06-21

    Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. The xylem transcriptome is highly conserved in conifers, but considerably divergent in angiosperms. The functional domains of genes in the xylem transcriptome are moderately to highly conserved in vascular plants, suggesting the existence of a common ancestral xylem transcriptome. Compared to the total transcriptome derived from a range of tissues, the xylem transcriptome is relatively conserved in vascular plants. Of the xylem transcriptome, cell wall genes, ancestral xylem genes, known proteins and transcription factors are relatively more conserved in vascular plants. A total of 527 putative xylem orthologs were identified, which are unevenly distributed across the Arabidopsis chromosomes with eight hot spots observed. Phylogenetic analysis revealed that evolution of the xylem transcriptome has paralleled plant evolution. We also identified 274 conifer-specific xylem unigenes, all of which are of unknown function. These xylem orthologs and conifer-specific unigenes are likely to have played a crucial role in xylem evolution. Conifers have highly conserved xylem transcriptomes, while angiosperm xylem transcriptomes are relatively diversified. Vascular plants share a common ancestral xylem transcriptome. The xylem transcriptomes of vascular plants are more conserved than the total transcriptomes. Evolution of the xylem transcriptome has largely followed the trend of plant evolution.

  14. Genome sequence and transcriptome analyses of the thermophilic zygomycete fungus Rhizomucor miehei

    PubMed Central

    2014-01-01

    Background The zygomycete fungi like Rhizomucor miehei have been extensively exploited for the production of various enzymes. As a thermophilic fungus, R. miehei is capable of growing at temperatures that approach the upper limits for all eukaryotes. To date, hundreds of fungal genomes are publicly available. However, Zygomycetes have rarely been investigated either genetically or genomically. Results Here, we report the genome of R. miehei CAU432 to explore the thermostable enzymatic repertoire of this fungus. The assembled genome size is 27.6-million-base (Mb) with 10,345 predicted protein-coding genes. Although the fungus is thermophilic, the G + C contents of the whole genome (43.8%) and coding genes (47.4%) are less than 50%. Phylogenetically, R. miehei is more closely related to Phycomyces blakesleeanus than to Mucor circinelloides and Rhizopus oryzae. The genome of R. miehei harbors a large number of genes encoding secreted proteases, which is consistent with the characteristics of R. miehei being a rich producer of proteases. The transcriptome profile of R. miehei showed that the genes responsible for degrading starch, glucan, protein and lipid were highly expressed. Conclusions The genome information of R. miehei will facilitate future studies to better understand the mechanisms of fungal thermophilic adaptation and the exploration of the potential of R. miehei in industrial-scale production of thermostable enzymes. Based on the existence of a large repertoire of amylolytic, proteolytic and lipolytic genes in the genome, R. miehei has potential in the production of a variety of such enzymes. PMID:24746234

  15. A genome survey and postharvest transcriptome analysis in Lentinula edodes.

    PubMed

    Sakamoto, Yuichi; Nakade, Keiko; Sato, Shiho; Yoshida, Kentaro; Miyazaki, Kazuhiro; Natsume, Satoshi; Konno, Naotake

    2017-03-17

    Lentinula edodes is a popular cultivated edible and medicinal mushroom. Lentinula edodes is susceptible to postharvest problems such as gill browning, fruiting body softening, and lentinan degradation. We constructed a de novo assembly draft genome sequence and performed gene prediction of Lentinula edodes. De novo assembly was carried out using short reads from paired-end and mate-paired libraries and long reads by PacBio, resulting in a contig number of 1951 and an N50 of 1 Mb. Further, we predicted genes by Augustus using RNA-seq data from the whole life cycle of Lentinula edodes, resulting in 12,959 predicted genes. This analysis revealed that Lentinula edodes lacks lignin peroxidase. To reveal genes involved in Lentinula edodes postharvest fruiting body quality loss, transcriptome analysis was carried out using Super-SAGE. This analysis revealed that many cell wall-related enzymes are upregulated after harvest, such as β-1,3-1,6-glucan-degrading enzymes in glycoside hydrolase (GH) families 5, 16, 30, 55, 128, and thaumatin-like proteins. In addition, we found several chitin-related genes are upregulated, such as putative chitinases in GH family 18, exo-chitinases in GH 20, and a putative chitosanase in GH 75. The results suggest that cell wall-degrading enzymes synergistically cooperate for rapid fruiting body autolysis. Many putative transcription factor genes were upregulated postharvest, such as genes containing high mobility group (HMG) domains and zinc finger domains. Several cell death-related proteins were also upregulated postharvest. Importance Our data collectively suggest that there is a rapid fruiting body autolysis system in Lentinula edodes. The genes for postharvest quality loss newly found in this research will be targets for future breeding of strains that can keep freshness longer than present strains. De novo Lentinula edodes genome assembly data will be used for construction of the complete Lentinula edodes chromosome map for the future

  16. Lessons for livestock genomics from genome and transcriptome sequencing in cattle and other mammals.

    PubMed

    Taylor, Jeremy F; Whitacre, Lynsey K; Hoff, Jesse L; Tizioto, Polyana C; Kim, JaeWoo; Decker, Jared E; Schnabel, Robert D

    2016-08-17

    Decreasing sequencing costs and development of new protocols for characterizing global methylation, gene expression patterns and regulatory regions have stimulated the generation of large livestock datasets. Here, we discuss experiences in the analysis of whole-genome and transcriptome sequence data. We analyzed whole-genome sequence (WGS) data from 132 individuals from five canid species (Canis familiaris, C. latrans, C. dingo, C. aureus and C. lupus) and 61 breeds, three bison (Bison bison), 64 water buffalo (Bubalus bubalis) and 297 bovines from 17 breeds. By individual, data vary in extent of reference genome depth of coverage from 4.9X to 64.0X. We have also analyzed RNA-seq data for 580 samples representing 159 Bos taurus and Rattus norvegicus animals and 98 tissues. By aligning reads to a reference assembly and calling variants, we assessed effects of average depth of coverage on the actual coverage and on the number of called variants. We examined the identity of unmapped reads by assembling them and querying produced contigs against the non-redundant nucleic acids database. By imputing high-density single nucleotide polymorphism data on 4010 US registered Angus animals to WGS using Run4 of the 1000 Bull Genomes Project and assessing the accuracy of imputation, we identified misassembled reference sequence regions. We estimate that a 24X depth of coverage is required to achieve 99.5 % coverage of the reference assembly and identify 95 % of the variants within an individual's genome. Genomes sequenced to low average coverage (e.g., <10X) may fail to cover 10 % of the reference genome and identify <75 % of variants. About 10 % of genomic DNA or transcriptome sequence reads fail to align to the reference assembly. These reads include loci missing from the reference assembly and misassembled genes and interesting symbionts, commensal and pathogenic organisms. Assembly errors and a lack of annotation of functional elements significantly limit the utility of

  17. Metaplastic breast carcinomas display genomic and transcriptomic heterogeneity [corrected].

    PubMed

    Weigelt, Britta; Ng, Charlotte K Y; Shen, Ronglai; Popova, Tatiana; Schizas, Michail; Natrajan, Rachael; Mariani, Odette; Stern, Marc-Henri; Norton, Larry; Vincent-Salomon, Anne; Reis-Filho, Jorge S

    2015-03-01

    features of metaplastic breast carcinomas is reflected at the transcriptomic level, and an association between molecular subtypes and histology was observed. BRCA1-like genomic profiles were found only in a subset (31%) of metaplastic breast cancers, and were not associated with a specific molecular or histologic subtype.

  18. Development of genome- and transcriptome-derived microsatellites in related species of snapping shrimps with highly duplicated genomes.

    PubMed

    Gaynor, Kaitlyn M; Solomon, Joseph W; Siller, Stefanie; Jessell, Linnet; Duffy, J Emmett; Rubenstein, Dustin R

    2017-08-04

    Molecular markers are powerful tools for studying patterns of relatedness and parentage within populations and for making inferences about social evolution. However, the development of molecular markers for simultaneous study of multiple species presents challenges, particularly when species exhibit genome duplication or polyploidy. We developed microsatellite markers for Synalpheus shrimp, a genus in which species exhibit not only great variation in social organization, but also interspecific variation in genome size and partial genome duplication. From the four primary clades within Synalpheus, we identified microsatellites in the genomes of four species and in the consensus transcriptome of two species. Ultimately, we designed and tested primers for 143 microsatellite markers across 25 species. Although the majority of markers were disomic, many markers were polysomic for certain species. Surprisingly, we found no relationship between genome size and the number of polysomic markers. As expected, markers developed for a given species amplified better for closely related species than for more distant relatives. Finally, the markers developed from the transcriptome were more likely to work successfully and to be disomic than those developed from the genome, suggesting that consensus transcriptomes are likely to be conserved across species. Our findings suggest that the transcriptome, particularly consensus sequences from multiple species, can be a valuable source of molecular markers for taxa with complex, duplicated genomes. © 2017 John Wiley & Sons Ltd.

  19. Annotation of the zebrafish genome through an integrated transcriptomic and proteomic analysis.

    PubMed

    Kelkar, Dhanashree S; Provost, Elayne; Chaerkady, Raghothama; Muthusamy, Babylakshmi; Manda, Srikanth S; Subbannayya, Tejaswini; Selvan, Lakshmi Dhevi N; Wang, Chieh-Huei; Datta, Keshava K; Woo, Sunghee; Dwivedi, Sutopa B; Renuse, Santosh; Getnet, Derese; Huang, Tai-Chung; Kim, Min-Sik; Pinto, Sneha M; Mitchell, Christopher J; Madugundu, Anil K; Kumar, Praveen; Sharma, Jyoti; Advani, Jayshree; Dey, Gourav; Balakrishnan, Lavanya; Syed, Nazia; Nanjappa, Vishalakshi; Subbannayya, Yashwanth; Goel, Renu; Prasad, T S Keshava; Bafna, Vineet; Sirdeshmukh, Ravi; Gowda, Harsha; Wang, Charles; Leach, Steven D; Pandey, Akhilesh

    2014-11-01

    Accurate annotation of protein-coding genes is one of the primary tasks upon the completion of whole genome sequencing of any organism. In this study, we used an integrated transcriptomic and proteomic strategy to validate and improve the existing zebrafish genome annotation. We undertook high-resolution mass-spectrometry-based proteomic profiling of 10 adult organs, whole adult fish body, and two developmental stages of zebrafish (SAT line), in addition to transcriptomic profiling of six organs. More than 7,000 proteins were identified from proteomic analyses, and ∼ 69,000 high-confidence transcripts were assembled from the RNA sequencing data. Approximately 15% of the transcripts mapped to intergenic regions, the majority of which are likely long non-coding RNAs. These high-quality transcriptomic and proteomic data were used to manually reannotate the zebrafish genome. We report the identification of 157 novel protein-coding genes. In addition, our data led to modification of existing gene structures including novel exons, changes in exon coordinates, changes in frame of translation, translation in annotated UTRs, and joining of genes. Finally, we discovered four instances of genome assembly errors that were supported by both proteomic and transcriptomic data. Our study shows how an integrative analysis of the transcriptome and the proteome can extend our understanding of even well-annotated genomes.

  20. Novel genomic resources for a climate change sensitive mammal: characterization of the American pika transcriptome

    PubMed Central

    2013-01-01

    a positive match with the hemoglobin alpha chain from the plateau pika, a species restricted to high elevation steppes in Asia. Elevation-specific contigs may represent candidate regions subject to differential levels of gene expression along this elevation gradient. Conclusions: To our knowledge, this is the first broad-scale, transcriptome-level study conducted within the Ochotonidae, providing novel genomic resources for studying pika ecology, behaviour and population history. PMID:23663654

  1. Novel genomic resources for a climate change sensitive mammal: characterization of the American pika transcriptome.

    PubMed

    Lemay, Matthew A; Henry, Philippe; Lamb, Clayton T; Robson, Kelsey M; Russello, Michael A

    2013-05-10

    with the hemoglobin alpha chain from the plateau pika, a species restricted to high elevation steppes in Asia. Elevation-specific contigs may represent candidate regions subject to differential levels of gene expression along this elevation gradient. To our knowledge, this is the first broad-scale, transcriptome-level study conducted within the Ochotonidae, providing novel genomic resources for studying pika ecology, behaviour and population history.

  2. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica.

    PubMed

    Krishnan, Neeraja M; Pattnaik, Swetansu; Jain, Prachi; Gaur, Prakhar; Choudhary, Rakshit; Vaidyanathan, Srividya; Deepak, Sa; Hariharan, Arun K; Krishna, Pg Bharath; Nair, Jayalakshmi; Varghese, Linu; Valivarthi, Naveen K; Dhas, Kunal; Ramaswamy, Krishna; Panda, Binay

    2012-09-09

    The Azadirachta indica (neem) tree is a source of a wide range of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using a next-generation sequencing approach. The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family, validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. This study of the A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds and for comparative evolutionary studies among various Meliaceae family members, and will help annotate their genomes. A better understanding of the molecular pathways involved in azadirachtin synthesis in A. indica will pave the way for bulk production of environmentally friendly biopesticides.

  3. RNA-seq analysis of Rubus idaeus cv. Nova: transcriptome sequencing and de novo assembly for subsequent functional genomics approaches.

    PubMed

    Hyun, Tae Kyung; Lee, Sarah; Kumar, Dhinesh; Rim, Yeonggil; Kumar, Ritesh; Lee, Sang Yeol; Lee, Choong Hwan; Kim, Jae-Yean

    2014-10-01

    Using Illumina sequencing technology, we generated large-scale transcriptome sequencing data containing abundant information on genes involved in the metabolic pathways in R. idaeus cv. Nova fruits. Rubus idaeus (red raspberry) is an economically important crop that contains numerous nutrients, micronutrients and phytochemicals with essential health benefits for humans. The molecular mechanism underlying the ripening process and phytochemical biosynthesis in red raspberry is attributed to changes in gene expression, but very limited transcriptomic and genomic information is available in public databases. To address this issue, we generated more than 51 million sequencing reads from R. idaeus cv. Nova fruit using Illumina RNA-Seq technology. After de novo assembly, we obtained 42,604 unigenes with an average length of 812 bp. At the protein level, the Nova fruit transcriptome showed 77% and 68% sequence similarity with Rubus coreanus and Fragaria vesca, respectively, indicating the evolutionary relationship between them. In addition, 69% of assembled unigenes were annotated using public databases including NCBI non-redundant, Cluster of Orthologous Groups and Gene ontology database, suggesting that our transcriptome dataset provides a valuable resource for investigating metabolic processes in red raspberry. To analyze the relationship between several novel transcripts and the amounts of metabolites such as γ-aminobutyric acid and anthocyanins, real-time PCR and target metabolite analysis were performed on two different ripening stages of Nova. This is the first attempt to use the Illumina sequencing platform for RNA sequencing and de novo assembly of Nova fruit without a reference genome. Our data provide the most comprehensive transcriptome resource available for Rubus fruits, and will be useful for understanding the ripening process and for breeding R. idaeus cultivars with improved fruit quality.

  4. Genomic and transcriptomic alterations following hybridisation and genome doubling in trigenomic allohexaploid Brassica carinata × Brassica rapa.

    PubMed

    Xu, Y; Zhao, Q; Mei, S; Wang, J

    2012-09-01

    Allopolyploidisation is a prominent evolutionary force that involves two major events: interspecific hybridisation and genome doubling. Both events have important functional consequences in shaping the genomic architecture of the neo-allopolyploids. The respective effects of hybridisation and genome doubling upon genomic and transcriptomic changes in Brassica allopolyploids are unresolved. In this study, amplified fragment length polymorphism (AFLP), methylation-sensitive amplification polymorphism (MSAP) and cDNA-AFLP approaches were used to track genetic, epigenetic and transcriptional changes in both allohexaploid Brassica (ArArBcBcCcCc genome) and triploid hybrids (ArBcCc genome). Results from these groups were compared with each other and also to their parents Brassica carinata (BBCC genome) and Brassica rapa (AA genome). Rapid and dramatic genetic, DNA methylation and gene expression changes were detected in the triploid hybrids. During the shift from triploidy to allohexaploidy, some of the hybridisation-induced alterations underwent reversion. Additionally, novel genetic, epigenetic and transcriptional alterations were also detected. The proportions of A-genome-specific DNA methylation and gene expression alterations were significantly greater than those of BC-genome-specific alterations in the triploid hybrids. However, the two parental genomes were equally affected during the ploidy shift. Hemi-CCG methylation changes induced by hybridisation were recovered after genome doubling. Full-CG methylation changes were a more general process initiated in the hybrid and continued after genome doubling. These results indicate that genome doubling could ameliorate genomic and transcriptomic alterations induced by hybridisation and instigate additional alterations in trigenomic Brassica allohexaploids. Moreover, genome doubling also modified hybridisation-induced progenitor genome-biased alterations and epigenetic alteration characteristics.

  5. Whole-genome duplication and molecular evolution in Cornus L. (Cornaceae) - Insights from transcriptome sequences.

    PubMed

    Yu, Yan; Xiang, Qiuyun; Manos, Paul S; Soltis, Douglas E; Soltis, Pamela S; Song, Bao-Hua; Cheng, Shifeng; Liu, Xin; Wong, Gane

    2017-01-01

    The pattern and rate of genome evolution have profound consequences in organismal evolution. Whole-genome duplication (WGD), or polyploidy, has been recognized as an important evolutionary mechanism of plant diversification. However, in non-model plants the molecular signals of genome duplications have remained largely unexplored. High-throughput transcriptome data from next-generation sequencing have set the stage for novel investigations of genome evolution using new bioinformatic and methodological tools in a phylogenetic framework. Here we compare ten de novo-assembled transcriptomes representing the major lineages of the angiosperm genus Cornus (dogwood) and relevant outgroups using a customized pipeline for analyses. Using three distinct approaches (molecular dating of orthologous genes, analyses of the distribution of synonymous substitutions between paralogous genes, and examination of substitution rates through time), we detected a shared WGD event in the late Cretaceous across all taxa sampled. The inferred doubling event coincides temporally with the paleoclimatic changes associated with the initial divergence of the genus into three major lineages. Analyses also showed an acceleration of rates of molecular evolution after WGD. The highest rates of molecular evolution were observed in the transcriptome of the herbaceous lineage, C. canadensis, a species commonly found at higher latitudes, including the Arctic. Our study demonstrates the value of transcriptome data for understanding genome evolution in closely related species. The results suggest that a dramatic increase in sea surface temperature in the late Cretaceous may have contributed to the evolution and diversification of flowering plants.

  6. Whole-genome duplication and molecular evolution in Cornus L. (Cornaceae) – Insights from transcriptome sequences

    PubMed Central

    Yu, Yan; Xiang, Qiuyun; Manos, Paul S.; Soltis, Douglas E.; Soltis, Pamela S.; Song, Bao-Hua; Cheng, Shifeng; Liu, Xin; Wong, Gane

    2017-01-01

    The pattern and rate of genome evolution have profound consequences in organismal evolution. Whole-genome duplication (WGD), or polyploidy, has been recognized as an important evolutionary mechanism of plant diversification. However, in non-model plants the molecular signals of genome duplications have remained largely unexplored. High-throughput transcriptome data from next-generation sequencing have set the stage for novel investigations of genome evolution using new bioinformatic and methodological tools in a phylogenetic framework. Here we compare ten de novo-assembled transcriptomes representing the major lineages of the angiosperm genus Cornus (dogwood) and relevant outgroups using a customized pipeline for analyses. Using three distinct approaches (molecular dating of orthologous genes, analyses of the distribution of synonymous substitutions between paralogous genes, and examination of substitution rates through time), we detected a shared WGD event in the late Cretaceous across all taxa sampled. The inferred doubling event coincides temporally with the paleoclimatic changes associated with the initial divergence of the genus into three major lineages. Analyses also showed an acceleration of rates of molecular evolution after WGD. The highest rates of molecular evolution were observed in the transcriptome of the herbaceous lineage, C. canadensis, a species commonly found at higher latitudes, including the Arctic. Our study demonstrates the value of transcriptome data for understanding genome evolution in closely related species. The results suggest that a dramatic increase in sea surface temperature in the late Cretaceous may have contributed to the evolution and diversification of flowering plants. PMID:28225773

  7. A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome.

    PubMed

    Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

    2012-06-15

    The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development.

  8. Primary analysis of repeat elements of the Asian seabass (Lates calcarifer) transcriptome and genome

    PubMed Central

    Kuznetsova, Inna S.; Thevasagayam, Natascha M.; Sridatta, Prakki S. R.; Komissarov, Aleksey S.; Saju, Jolly M.; Ngoh, Si Y.; Jiang, Junhui; Shen, Xueyan; Orbán, László

    2014-01-01

    As part of our Asian seabass genome project, we are generating an inventory of repeat elements in the genome and transcriptome. The karyotype showed a diploid number of 2n = 24 chromosomes with a variable number of B-chromosomes. The transcriptome and genome of Asian seabass were searched for repetitive elements with experimental and bioinformatics tools. Six different types of repeats constituting 8–14% of the genome were characterized. Repetitive elements were clustered in the pericentromeric heterochromatin of all chromosomes, but some of them were preferentially accumulated in pretelomeric and pericentromeric regions of several chromosome pairs and had chromosome-specific arrangements. From the dispersed class of fish-specific non-LTR retrotransposon elements, Rex1 and MAUI-like repeats were analyzed. They were widespread in both the genome and transcriptome and accumulated in the pericentromeric and peritelomeric areas of all chromosomes. Every analyzed repeat was represented in the Asian seabass transcriptome, and some showed differential expression between the gonads. The other group of repeats analyzed belongs to the rRNA multigene family. FISH signal for 5S rDNA was located on a single pair of chromosomes, whereas that for 18S rDNA was found on two pairs. A BAC-derived contig containing rDNA was sequenced and assembled into a scaffold containing incomplete fragments of 18S rDNA. Their assembly and chromosomal position revealed that this part of the Asian seabass genome is extremely rich in repeats containing evolutionarily conserved and novel sequences. In summary, transcriptome assemblies and cDNA data are suitable for the identification of repetitive DNA from unknown genomes and for comparative investigation of conserved elements between teleosts and other vertebrates. PMID:25120555

  9. Identifying characteristic scales in the human genome

    NASA Astrophysics Data System (ADS)

    Carpena, P.; Bernaola-Galván, P.; Coronado, A. V.; Hackenberg, M.; Oliver, J. L.

    2007-03-01

    The scale-free, long-range correlations detected in DNA sequences contrast with characteristic lengths of genomic elements, being particularly incompatible with the isochores (long, homogeneous DNA segments). By computing the local behavior of the scaling exponent α of detrended fluctuation analysis (DFA), we discriminate between sequences with and without true scaling, and we find that no single scaling exists in the human genome. Instead, human chromosomes show a common compositional structure with two characteristic scales, the large one corresponding to the isochores and the other to small and medium scale genomic elements.
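    For orientation, the quantities named in this abstract can be written out using the standard textbook formulation of detrended fluctuation analysis. This is a generic sketch, not necessarily the authors' exact procedure: the numeric series $x_i$ derived from the DNA sequence, the window size $n$, and the local linear trend $y_n(k)$ fitted within each window are assumptions of the illustration.

```latex
% Integrated profile of a numeric series x_1, ..., x_N with mean \bar{x}
y(k) = \sum_{i=1}^{k} \left( x_i - \bar{x} \right)

% Fluctuation after removing the local trend y_n(k) fitted in windows of size n
F(n) = \sqrt{ \frac{1}{N} \sum_{k=1}^{N} \left[ y(k) - y_n(k) \right]^2 }

% True scaling means a single exponent \alpha over all window sizes
F(n) \propto n^{\alpha}

% Local scaling exponent: the slope of the log-log fluctuation curve at scale n
\alpha(n) = \frac{d \log F(n)}{d \log n}
```

    Under this formulation, a constant $\alpha(n)$ across scales corresponds to true scaling, whereas a scale-dependent $\alpha(n)$ points to characteristic lengths of the kind the abstract reports.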

  10. [Genomics and transcriptomics of the Chinese liver fluke Clonorchis sinensis (Opisthorchiidae, Trematoda)].

    PubMed

    Chelomina, G N

    2017-01-01

    The review summarizes the results of first genomic and transcriptomic investigations of the liver fluke Clonorchis sinensis (Opisthorchiidae, Trematoda). The studies mark the dawn of the genomic era for opisthorchiids, which cause severe hepatobiliary diseases in humans and animals. Their results aided in understanding the molecular mechanisms of adaptation to parasitism, parasite survival in mammalian biliary tracts, and genome dynamics in the individual development and the development of parasite-host relationships. Special attention is paid to the achievements in studying the codon usage bias and the roles of mobile genetic elements (MGEs) and small interfering RNAs (siRNAs). Interspecific comparisons at the genomic and transcriptomic levels revealed molecular differences, which may contribute to understanding the specialized niches and physiological needs of the respective species. The studies in C. sinensis provide a basis for further basic and applied research in liver flukes and, in particular, the development of efficient means to prevent, diagnose, and treat clonorchiasis.

  11. Simul-seq: combined DNA and RNA sequencing for whole-genome and transcriptome profiling.

    PubMed

    Reuter, Jason A; Spacek, Damek V; Pai, Reetesh K; Snyder, Michael P

    2016-11-01

    Paired DNA and RNA profiling is increasingly employed in genomics research to uncover molecular mechanisms of disease and to explore personal genotype and phenotype correlations. Here, we introduce Simul-seq, a technique for the production of high-quality whole-genome and transcriptome sequencing libraries from small quantities of cells or tissues. We apply the method to laser-capture-microdissected esophageal adenocarcinoma tissue, revealing a highly aneuploid tumor genome with extensive blocks of increased homozygosity and corresponding increases in allele-specific expression. Among this widespread allele-specific expression, we identify germline polymorphisms that are associated with response to cancer therapies. We further leverage this integrative data to uncover expressed mutations in several known cancer genes as well as a recurrent mutation in the motor domain of KIF3B that significantly affects kinesin-microtubule interactions. Simul-seq provides a new streamlined approach for generating comprehensive genome and transcriptome profiles from limited quantities of clinically relevant samples.

  12. VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

    SciTech Connect

    Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.; Jensen, Jeffrey L.; Walker, Julia; Kobold, Mark A.; Webb, Samantha R.; Payne, Samuel H.; Ansong, Charles; Adkins, Joshua N.; Cannon, William R.; Webb-Robertson, Bobbie-Jo M.

    2012-04-25

    Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify locations of interest within the genome and evaluate potential mis-annotations.

  13. De novo Transcriptome Assemblies of Rana (Lithobates) catesbeiana and Xenopus laevis Tadpole Livers for Comparative Genomics without Reference Genomes

    PubMed Central

    Birol, Inanc; Behsaz, Bahar; Hammond, S. Austin; Kucuk, Erdi; Veldhoen, Nik; Helbing, Caren C.

    2015-01-01

    In this work we studied the liver transcriptomes of two frog species, the American bullfrog (Rana (Lithobates) catesbeiana) and the African clawed frog (Xenopus laevis). We used high throughput RNA sequencing (RNA-seq) data to assemble and annotate these transcriptomes, and compared how their baseline expression profiles change when tadpoles of the two species are exposed to thyroid hormone. We generated more than 1.5 billion RNA-seq reads in total for the two species under two conditions as treatment/control pairs. We de novo assembled these reads using Trans-ABySS to reconstruct reference transcriptomes, obtaining over 350,000 and 130,000 putative transcripts for R. catesbeiana and X. laevis, respectively. Using available genomics resources for X. laevis, we annotated over 97% of our X. laevis transcriptome contigs, demonstrating the utility and efficacy of our methodology. Leveraging this validated analysis pipeline, we also annotated the assembled R. catesbeiana transcriptome. We used the expression profiles of the annotated genes of the two species to examine the similarities and differences between the tadpole liver transcriptomes. We also compared the gene ontology terms of expressed genes to measure how the animals react to a challenge by thyroid hormone. Our study reports three main conclusions. First, de novo assembly of RNA-seq data is a powerful method for annotating and establishing transcriptomes of non-model organisms. Second, the liver transcriptomes of the two frog species, R. catesbeiana and X. laevis, show many common features, and the distribution of their gene ontology profiles are statistically indistinguishable. Third, although they broadly respond the same way to the presence of thyroid hormone in their environment, their receptor/signal transduction pathways display marked differences. PMID:26121473

  14. Genomic, transcriptomic and phenomic variation reveals the complex adaptation to stress response of modern maize breeding

    USDA-ARS's Scientific Manuscript database

    Early maize adaptation to different agricultural environments was an important process associated with the creation of a stable food supply that allowed the evolution of human civilization in the Americas. To explore the mechanisms of maize adaptation, genomic, transcriptomic and phenomic data were ...

  15. Comparative genomics and transcriptomics of lineages I, II, and III strains of Listeria monocytogenes

    PubMed Central

    2012-01-01

    Background: Listeria monocytogenes is a food-borne pathogen that causes infections with a high mortality rate and has served as an invaluable model for intracellular parasitism. Here, we report complete genome sequences for two L. monocytogenes strains belonging to serotype 4a (L99) and 4b (CLIP80459), and transcriptomes of representative strains from lineages I, II, and III, thereby permitting in-depth comparison of genome- and transcriptome-based data from three lineages of L. monocytogenes. Lineage III, represented by the 4a L99 genome, is known to contain strains less virulent for humans. Results: The genome analysis of the weakly pathogenic L99 serotype 4a provides extensive evidence of virulence gene decay, including loss of several important surface proteins. The 4b CLIP80459 genome, unlike the previously sequenced 4b F2365 genome, harbours an intact inlB invasion gene. These lineage I strains are characterized by the lack of prophage genes, as they share only a single prophage locus with other L. monocytogenes genomes 1/2a EGD-e and 4a L99. Comparative transcriptome analysis during intracellular growth uncovered adaptive expression level differences in lineages I, II and III of Listeria, notable amongst which was a strong intracellular induction of flagellar genes in strain 4a L99 compared to the other lineages. Furthermore, extensive differences between strains are manifest at levels of metabolic flux control and phosphorylated sugar uptake. Intriguingly, prophage gene expression was found to be a hallmark of intracellular gene expression. Deletion mutants in the single shared prophage locus of lineage II strain EGD-e 1/2a, the lma operon, revealed severe attenuation of virulence in a murine infection model. Conclusion: Comparative genomics and transcriptome analysis of L. monocytogenes strains from three lineages implicate prophage genes in intracellular adaptation and indicate that gene loss and decay may have led to the emergence of attenuated lineages

  16. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes.

    PubMed

    Macaulay, Iain C; Haerty, Wilfried; Kumar, Parveen; Li, Yang I; Hu, Tim Xiaoming; Teng, Mabel J; Goolam, Mubeen; Saurat, Nathalie; Coupland, Paul; Shirley, Lesley M; Smith, Miriam; Van der Aa, Niels; Banerjee, Ruby; Ellis, Peter D; Quail, Michael A; Swerdlow, Harold P; Zernicka-Goetz, Magdalena; Livesey, Frederick J; Ponting, Chris P; Voet, Thierry

    2015-06-01

    The simultaneous sequencing of a single cell's genome and transcriptome offers a powerful means to dissect genetic variation and its effect on gene expression. Here we describe G&T-seq, a method for separating and sequencing genomic DNA and full-length mRNA from single cells. By applying G&T-seq to over 220 single cells from mice and humans, we discovered cellular properties that could not be inferred from DNA or RNA sequencing alone.

  17. Genomic resources for a model in adaptation and speciation research: characterization of the Poecilia mexicana transcriptome

    PubMed Central

    2012-01-01

    Background: Elucidating the genomic basis of adaptation and speciation is a major challenge in natural systems with large quantities of environmental and phenotypic data, mostly because of the scarcity of genomic resources for non-model organisms. The Atlantic molly (Poecilia mexicana, Poeciliidae) is a small livebearing fish that has been extensively studied for evolutionary ecology research, particularly because this species has repeatedly colonized extreme environments in the form of caves and toxic hydrogen sulfide containing springs. In such extreme environments, populations show strong patterns of adaptive trait divergence and the emergence of reproductive isolation. Here, we used RNA-sequencing to assemble and annotate the first transcriptome of P. mexicana to facilitate ecological genomics studies in the future and aid the identification of genes underlying adaptation and speciation in the system. Description: We provide the first annotated reference transcriptome of P. mexicana. Our transcriptome shows high congruence with other published fish transcriptomes, including that of the guppy, medaka, zebrafish, and stickleback. Transcriptome annotation uncovered the presence of candidate genes relevant in the study of adaptation to extreme environments. We describe general and oxidative stress response genes as well as genes involved in pathways induced by hypoxia or involved in sulfide metabolism. To facilitate future comparative analyses, we also conducted quantitative comparisons between P. mexicana from different river drainages. 106,524 single nucleotide polymorphisms were detected in our dataset, including potential markers that are putatively fixed across drainages. Furthermore, specimens from different drainages exhibited some consistent differences in gene regulation. Conclusions: Our study provides a valuable genomic resource to study the molecular underpinnings of adaptation to extreme environments in replicated sulfide spring and cave environments.

  18. Genome Annotation and Transcriptomics of Oil-Producing Algae

    DTIC Science & Technology

    2015-03-16

    diatoms). We had proposed to use whole transcriptome analyses to detail the changes in gene expression that occur during N-starvation induced TAG...Abstract Most algae accumulate triacylglycerols (TAGs) when they are starved for essential nutrients like N, S, P (or Si in the case of some diatoms). In the absence of such essential nutrients

  19. Transcriptome sequencing and whole genome expression profiling of chrysanthemum under dehydration stress

    PubMed Central

    2013-01-01

    Background: Chrysanthemum is one of the most important ornamental crops in the world and drought stress seriously limits its production and distribution. In order to generate a functional genomics resource and obtain a deeper understanding of the molecular mechanisms regarding chrysanthemum responses to dehydration stress, we performed large-scale transcriptome sequencing of chrysanthemum plants under dehydration stress using the Illumina sequencing technology. Results: Two cDNA libraries constructed from mRNAs of control and dehydration-treated seedlings were sequenced by Illumina technology. A total of more than 100 million reads were generated and de novo assembled into 98,180 unique transcripts, which were further extensively annotated by comparing their sequences to different protein databases. Biochemical pathways were predicted from these transcript sequences. Furthermore, we performed gene expression profiling analysis upon dehydration treatment in chrysanthemum and identified 8,558 dehydration-responsive unique transcripts, including 307 transcription factors, 229 protein kinases and many well-known stress responsive genes. Gene ontology (GO) term enrichment and biochemical pathway analyses showed that dehydration stress caused changes in hormone response, secondary and amino acid metabolism, and light and photoperiod response. These findings suggest that drought tolerance of chrysanthemum plants may be related to the regulation of hormone biosynthesis and signaling, reduction of oxidative damage, stabilization of cell proteins and structures, and maintenance of energy and carbon supply. Conclusions: Our transcriptome sequences can provide a valuable resource for chrysanthemum breeding and research, give novel insights into chrysanthemum responses to dehydration stress, and offer candidate genes or markers that can be used to guide future studies attempting to breed drought tolerant chrysanthemum cultivars. PMID:24074255

  20. Genome wide transcriptome profiling reveals differential gene expression in secondary metabolite pathway of Cymbopogon winterianus

    PubMed Central

    Devi, Kamalakshi; Mishra, Surajit K.; Sahu, Jagajjit; Panda, Debashis; Modi, Mahendra K.; Sen, Priyabrata

    2016-01-01

    Advances in transcriptome sequencing provide a fast, cost-effective and reliable approach to generate large expression datasets, especially suitable for non-model species, to identify putative genes, key pathways and regulatory mechanisms. Citronella (Cymbopogon winterianus) is an aromatic medicinal grass used for its anti-tumoral, antibacterial, anti-fungal, antiviral, detoxifying and natural insect repellent properties. Despite its many uses, the genes involved in the terpene biosynthetic pathway have not yet been clearly elucidated. The present study is a pioneering attempt to generate exhaustive molecular information on the secondary metabolite pathway and to increase genomic resources in Citronella. Using high-throughput RNA-Seq technology, the root and leaf transcriptome was analysed at an unprecedented depth (11.7 Gb). Targeted searches identified the majority of the genes associated with metabolic pathways and other natural product pathways, viz. antibiotic synthesis, along with many novel genes. Comparative expression results for terpenoid biosynthesis genes were validated for 15 unigenes by RT-PCR and qRT-PCR. Thus the coverage of this transcriptome is comprehensive enough to discover all known genes of major metabolic pathways. This transcriptome dataset can serve as important public information for gene expression, genomics and functional genomics studies in Citronella and shall act as a benchmark for future improvement of the crop. PMID:26877149

  1. Genome wide transcriptome profiling reveals differential gene expression in secondary metabolite pathway of Cymbopogon winterianus.

    PubMed

    Devi, Kamalakshi; Mishra, Surajit K; Sahu, Jagajjit; Panda, Debashis; Modi, Mahendra K; Sen, Priyabrata

    2016-02-15

    Advances in transcriptome sequencing provide a fast, cost-effective and reliable approach to generate large expression datasets, especially suitable for non-model species, to identify putative genes, key pathways and regulatory mechanisms. Citronella (Cymbopogon winterianus) is an aromatic medicinal grass used for its anti-tumoral, antibacterial, anti-fungal, antiviral, detoxifying and natural insect repellent properties. Despite its many uses, the genes involved in the terpene biosynthetic pathway have not yet been clearly elucidated. The present study is a pioneering attempt to generate exhaustive molecular information on the secondary metabolite pathway and to increase genomic resources in Citronella. Using high-throughput RNA-Seq technology, the root and leaf transcriptome was analysed at an unprecedented depth (11.7 Gb). Targeted searches identified the majority of the genes associated with metabolic pathways and other natural product pathways, viz. antibiotic synthesis, along with many novel genes. Comparative expression results for terpenoid biosynthesis genes were validated for 15 unigenes by RT-PCR and qRT-PCR. Thus the coverage of this transcriptome is comprehensive enough to discover all known genes of major metabolic pathways. This transcriptome dataset can serve as important public information for gene expression, genomics and functional genomics studies in Citronella and shall act as a benchmark for future improvement of the crop.

  2. Genome and transcriptome analysis of surfactin biosynthesis in Bacillus amyloliquefaciens MT45

    PubMed Central

    Zhi, Yan; Wu, Qun; Xu, Yan

    2017-01-01

    Natural Bacillus isolates generate limited amounts of surfactin (<10% of their biomass), which functions as an antibiotic or signalling molecule in inter-/intra-specific interactions. However, overproduction of surfactin in Bacillus amyloliquefaciens MT45 was observed at a titre of 2.93 g/l, which is equivalent to half of the maximum biomass. To systemically unravel this efficient biosynthetic process, the genome and transcriptome of this bacterium were compared with those of B. amyloliquefaciens type strain DSM7T. MT45 possesses a smaller genome while containing more unique transporters and resistance-associated genes. Comparative transcriptome analysis revealed notable enrichment of the surfactin synthesis pathway in MT45, including central carbon metabolism and fatty acid biosynthesis to provide sufficient quantities of building precursors. Most importantly, the modular surfactin synthase overexpressed (9 to 49-fold) in MT45 compared to DSM7T suggested efficient surfactin assembly and resulted in the overproduction of surfactin. Furthermore, based on the expression trends observed in the transcriptome, there are multiple potential regulatory genes mediating the expression of surfactin synthase. Thus, the results of the present study provide new insights regarding the synthesis and regulation of surfactin in high-producing strain and enrich the genomic and transcriptomic resources available for B. amyloliquefaciens. PMID:28112210

  3. Plant Aquaporins: Genome-Wide Identification, Transcriptomics, Proteomics, and Advanced Analytical Tools

    PubMed Central

    Deshmukh, Rupesh K.; Sonah, Humira; Bélanger, Richard R.

    2016-01-01

    Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. Finally, a detailed catalog of computational tools and analytical pipelines is

  4. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    PubMed Central

    2012-01-01

    Background: It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results: Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions: To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro. PMID:22901030

  5. Genome and Transcriptome Sequencing of the Ostreid herpesvirus 1 From Tomales Bay, California

    NASA Astrophysics Data System (ADS)

    Burge, C. A.; Langevin, S.; Closek, C. J.; Roberts, S. B.; Friedman, C. S.

    2016-02-01

    Mass mortalities of larval and seed bivalve molluscs attributed to the Ostreid herpesvirus 1 (OsHV-1) occur globally. OsHV-1 was fully sequenced and characterized as a member of the Family Malacoherpesviridae. Multiple strains of OsHV-1 exist and may vary in virulence, e.g. OsHV-1 µvar. For most global variants of OsHV-1, sequence data is limited to PCR-based sequencing of segments, including two recent genomes. In the United States, OsHV-1 is limited to detection in adjacent embayments in California, Tomales and Drakes bays. Limited DNA sequence data of OsHV-1 infecting oysters in Tomales Bay indicates that the virus detected in Tomales Bay is similar but not identical to any one global variant of OsHV-1. In order to better understand both strain variation and virulence of OsHV-1 infecting oysters in Tomales Bay, we used genomic and transcriptomic sequencing. Meta-genomic sequencing (Illumina MiSeq) was conducted from infected oysters (n=4 per year) collected in 2003, 2007, and 2014, where full OsHV-1 genome sequences and low overall microbial diversity were achieved from highly infected oysters. Increased microbial diversity was detected in three of four samples sequenced from 2003, where qPCR-based genome copy numbers of OsHV-1 were lower. Expression analysis (SOLiD RNA sequencing) of OsHV-1 genes expressed in oyster larvae at 24 hours post exposure revealed a nearly complete transcriptome, with several highly expressed genes, which are similar to recent transcriptomic analyses of other OsHV-1 variants. Taken together, our results indicate that genome and transcriptome sequencing may be powerful tools in understanding both strain variation and virulence of non-culturable marine viruses.

  6. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

    PubMed Central

    2012-01-01

    Background: The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using a next generation sequencing approach. Results: The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family, validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions: This study on the A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds and for comparative evolutionary studies among various Meliaceae family members, and will help annotate their genomes. A better understanding of molecular pathways involved in azadirachtin synthesis in A. indica will pave the way for bulk production of environmentally friendly biopesticides. PMID:22958331

  7. Draft genome of spinach and transcriptome diversity of 120 Spinacia accessions.

    PubMed

    Xu, Chenxi; Jiao, Chen; Sun, Honghe; Cai, Xiaofeng; Wang, Xiaoli; Ge, Chenhui; Zheng, Yi; Liu, Wenli; Sun, Xuepeng; Xu, Yimin; Deng, Jie; Zhang, Zhonghua; Huang, Sanwen; Dai, Shaojun; Mou, Beiquan; Wang, Quanxi; Fei, Zhangjun; Wang, Quanhua

    2017-05-24

    Spinach is an important leafy vegetable enriched with multiple necessary nutrients. Here we report the draft genome sequence of spinach (Spinacia oleracea, 2n=12), which contains 25,495 protein-coding genes. The spinach genome is highly repetitive with 74.4% of its content in the form of transposable elements. No recent whole genome duplication events are observed in spinach. Genome syntenic analysis between spinach and sugar beet suggests substantial inter- and intra-chromosome rearrangements during the Caryophyllales genome evolution. Transcriptome sequencing of 120 cultivated and wild spinach accessions reveals more than 420 K variants. Our data suggests that S. turkestanica is likely the direct progenitor of cultivated spinach and spinach domestication has a weak bottleneck. We identify 93 domestication sweeps in the spinach genome, some of which are associated with important agronomic traits including bolting, flowering and leaf numbers. This study offers insights into spinach evolution and domestication and provides resources for spinach research and improvement.

  8. Generation of an integrated Hieracium genomic and transcriptomic resource enables exploration of small RNA pathways during apomixis initiation.

    PubMed

    Rabiger, David S; Taylor, Jennifer M; Spriggs, Andrew; Hand, Melanie L; Henderson, Steven T; Johnson, Susan D; Oelkers, Karsten; Hrmova, Maria; Saito, Keisuke; Suzuki, Go; Mukai, Yasuhiko; Carroll, Bernard J; Koltunow, Anna M G

    2016-10-06

    Application of apomixis, or asexual seed formation, in crop breeding would allow rapid fixation of complex traits, economizing improved crop delivery. Identification of apomixis genes is confounded by the polyploid nature, high genome complexity and lack of genomic sequence integration with reproductive tissue transcriptomes in most apomicts. A genomic and transcriptomic resource was developed for Hieracium subgenus Pilosella (Asteraceae) which incorporates characterized sexual, apomictic and mutant apomict plants exhibiting reversion to sexual reproduction. Apomicts develop additional female gametogenic cells that suppress the sexual pathway in ovules. Disrupting small RNA pathways in sexual Arabidopsis also induces extra female gametogenic cells; therefore, the resource was used to examine if changes in small RNA pathways correlate with apomixis initiation. An initial characterization of small RNA pathway genes within Hieracium was undertaken, and ovary-expressed ARGONAUTE genes were identified and cloned. Comparisons of whole ovary transcriptomes from mutant apomicts, relative to the parental apomict, revealed that differentially expressed genes were enriched for processes involved in small RNA biogenesis and chromatin silencing. Small RNA profiles within mutant ovaries did not reveal large-scale alterations in composition or length distributions; however, a small number of differentially expressed, putative small RNA targets were identified. The established Hieracium resource represents a substantial contribution towards the investigation of early sexual and apomictic female gamete development, and the generation of new candidate genes and markers. Observed changes in small RNA targets and biogenesis pathways within sexual and apomictic ovaries will underlie future functional research into apomixis initiation in Hieracium.

  9. Genome and transcriptome sequencing identifies breeding targets in the orphan crop tef (Eragrostis tef).

    PubMed

    Cannarozzi, Gina; Plaza-Wüthrich, Sonia; Esfeld, Korinna; Larti, Stéphanie; Wilson, Yi Song; Girma, Dejene; de Castro, Edouard; Chanyalew, Solomon; Blösch, Regula; Farinelli, Laurent; Lyons, Eric; Schneider, Michel; Falquet, Laurent; Kuhlemeier, Cris; Assefa, Kebebew; Tadele, Zerihun

    2014-07-09

    Tef (Eragrostis tef), an indigenous cereal critical to food security in the Horn of Africa, is rich in minerals and protein, resistant to many biotic and abiotic stresses and safe for diabetics as well as sufferers of immune reactions to wheat gluten. We present the genome of tef, the first species in the grass subfamily Chloridoideae and the first allotetraploid assembled de novo. We sequenced the tef genome for marker-assisted breeding, to shed light on the molecular mechanisms conferring tef's desirable nutritional and agronomic properties, and to make its genome publicly available as a community resource. The draft genome contains 672 Mbp representing 87% of the genome size estimated from flow cytometry. We also sequenced two transcriptomes, one from a normalized RNA library and another from unnormalized RNASeq data. The normalized RNA library revealed around 38000 transcripts that were then annotated by the SwissProt group. The CoGe comparative genomics platform was used to compare the tef genome to other genomes, notably sorghum. Scaffolds comprising approximately half of the genome size were ordered by syntenic alignment to sorghum producing tef pseudo-chromosomes, which were sorted into A and B genomes as well as compared to the genetic map of tef. The draft genome was used to identify novel SSR markers, investigate target genes for abiotic stress resistance studies, and understand the evolution of the prolamin family of proteins that are responsible for the immune response to gluten. It is highly plausible that breeding targets previously identified in other cereal crops will also be valuable breeding targets in tef. The draft genome and transcriptome will be of great use for identifying these targets for genetic improvement of this orphan crop that is vital for feeding 50 million people in the Horn of Africa.

  10. Comprehensive identification and quantification of microbial transcriptomes by genome-wide unbiased methods.

    PubMed

    Mäder, Ulrike; Nicolas, Pierre; Richard, Hugues; Bessières, Philippe; Aymerich, Stéphane

    2011-02-01

    Genomic tiling array transcriptomics and RNA-seq are two powerful and rapidly developing approaches for unbiased transcriptome analysis. Providing comprehensive identification and quantification of transcripts with an unprecedented resolution, they are leading to major breakthroughs in systems biology. Here we review each step of the analysis from library preparation to the interpretation of the data, with particular attention paid to the possible sources of artifacts. Methodological requirements and statistical frameworks are often similar in both the approaches despite differences in the nature of the data. Tiling array analysis does not require rRNA depletion and benefits from a more mature computational workflow, whereas RNA-Seq has a clear lead in terms of background noise and dynamic range with a considerable potential for evolution with the improvements of sequencing technologies. Being independent of prior sequence knowledge, RNA-seq will boost metatranscriptomics and evolutionary transcriptomics applications.

  11. Applications of genome-scale metabolic reconstructions

    PubMed Central

    Oberhardt, Matthew A; Palsson, Bernhard Ø; Papin, Jason A

    2009-01-01

    The availability and utility of genome-scale metabolic reconstructions have exploded since the first genome-scale reconstruction was published a decade ago. Reconstructions have now been built for a wide variety of organisms, and have been used toward five major ends: (1) contextualization of high-throughput data, (2) guidance of metabolic engineering, (3) directing hypothesis-driven discovery, (4) interrogation of multi-species relationships, and (5) network property discovery. In this review, we examine the many uses and future directions of genome-scale metabolic reconstructions, and we highlight trends and opportunities in the field that will make the greatest impact on many fields of biology. PMID:19888215

  12. VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data

    PubMed Central

    2012-01-01

    Background: The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. Results: VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is demonstrated on two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to show the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone, or in combination with transcriptomic data. Conclusions: VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic

  13. Bioinformatics challenges in de novo transcriptome assembly using short read sequences in the absence of a reference genome sequence.

    PubMed

    Góngora-Castillo, Elsa; Buell, C Robin

    2013-04-01

    Plant natural product research can be facilitated through genome and transcriptome sequencing approaches that generate informative sequence and expression datasets that enable characterization of biochemical pathways of interest. As the overwhelming majority of plant-derived natural products are derived from species with little, if any, sequence and/or genomic resources, the ability to perform whole genome shotgun sequencing and assembly has been and will continue to be transformative, as access to a genome sequence provides molecular resources and a context for discovery and characterization of biosynthetic pathways. Due to the reduced size and complexity of the transcriptome relative to the genome, transcriptome sequencing provides a rapid, inexpensive approach to access gene sequences, gene expression abundances, and gene expression patterns in any species, including those that lack a reference genome sequence. To date, successful applications of RNA sequencing in conjunction with de novo transcriptome assembly have enabled identification of new genes in an array of biochemical pathways in plants. While sequencing technologies are well developed, challenges remain in the handling and analysis of transcriptome sequences. In this Highlight article, we provide an overview of the bioinformatics challenges associated with transcriptome analyses using short read sequences and how to address these issues in plant species that lack a reference genome.

  14. Chapter 4 genomics, transcriptomics, and epigenomics in traumatic brain injury research.

    PubMed

    Puccio, Ava M; Alexander, Sheila

    2015-01-01

    The long-term effects and significant impact of the full spectrum of traumatic brain injury (TBI) have received increased attention in recent years. Despite increased research efforts, there has been little movement toward improving outcomes for the survivors of TBI. TBI is a heterogeneous condition with a complex biological response, and significant variability in human recovery contributes to the difficulty in identifying therapeutics that improve outcomes. Personalized medicine, identifying the best course of treatment for a given individual based on individual characteristics, has great potential to improve recovery for TBI survivors. The advances in medical genetics and genomics over the past 20 years have increased our understanding of many biological processes. A substantial amount of research has focused on the genomic, transcriptomic, and epigenomic profiles in many health and disease states, including recovery from TBI. The focus of this review chapter is to describe the current state of the science in genomic, transcriptomic, and epigenomic research in the TBI population. There have been some advancements toward understanding the genomic, transcriptomic, and epigenomic processes in humans, but much of this work remains at the preclinical stage. This current evidence does improve our understanding of TBI recovery, but also serves as an excellent platform upon which to build further study toward improved outcomes for this population.

  15. Benchmarking next-generation transcriptome sequencing for functional and evolutionary genomics.

    PubMed

    Gibbons, John G; Janson, Eric M; Hittinger, Chris Todd; Johnston, Mark; Abbot, Patrick; Rokas, Antonis

    2009-12-01

    Next-generation sequencing has opened the door to genomic analysis of nonmodel organisms. Technologies generating long-sequence reads (200-400 bp) are increasingly used in evolutionary studies of nonmodel organisms, but the short-sequence reads (30-50 bp) that can be produced at lower cost are thought to be of limited utility for de novo sequencing applications. Here, we tested this assumption by short-read sequencing the transcriptomes of the tropical disease vectors Aedes aegypti and Anopheles gambiae, for which complete genome sequences are available. Comparison of our results to the reference genomes allowed us to accurately evaluate the quantity, quality, and functional and evolutionary information content of our "test" data. We produced more than 0.7 billion nucleotides of sequenced data per species that assembled into more than 21,000 test contigs larger than 100 bp per species and covered approximately 27% of the Aedes reference transcriptome. Remarkably, the substitution error rate in the test contigs was approximately 0.25% per site, with very few indels or assembly errors. Test contigs of both species were enriched for genes involved in energy production and protein synthesis and underrepresented in genes involved in transcription and differentiation. Ortholog prediction using the test contigs was accurate across hundreds of millions of years of evolution. Our results demonstrate the considerable utility of short-read transcriptome sequencing for genomic studies of nonmodel organisms and suggest an approach for assessing the information content of next-generation data for evolutionary studies.

  16. The genome and transcriptome of Haemonchus contortus, a key model parasite for drug and vaccine discovery

    PubMed Central

    2013-01-01

    Background: The small ruminant parasite Haemonchus contortus is the most widely used parasitic nematode in drug discovery, vaccine development and anthelmintic resistance research. Its remarkable propensity to develop resistance threatens the viability of the sheep industry in many regions of the world and provides a cautionary example of the effect of mass drug administration to control parasitic nematodes. Its phylogenetic position makes it particularly well placed for comparison with the free-living nematode Caenorhabditis elegans and the most economically important parasites of livestock and humans. Results: Here we report the detailed analysis of a draft genome assembly and extensive transcriptomic dataset for H. contortus. This represents the first genome to be published for a strongylid nematode and the most extensive transcriptomic dataset for any parasitic nematode reported to date. We show a general pattern of conservation of genome structure and gene content between H. contortus and C. elegans, but also a dramatic expansion of important parasite gene families. We identify genes involved in parasite-specific pathways such as blood feeding, neurological function, and drug metabolism. In particular, we describe complete gene repertoires for known drug target families, providing the most comprehensive understanding yet of the action of several important anthelmintics. Also, we identify a set of genes enriched in the parasitic stages of the lifecycle and the parasite gut that provide a rich source of vaccine and drug target candidates. Conclusions: The H. contortus genome and transcriptome provide an essential platform for postgenomic research in this and other important strongylid parasites. PMID:23985316

  17. A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome

    PubMed Central

    Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

    2012-01-01

    The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development. PMID:22508961

  18. A new RNASeq-based reference transcriptome for sugar beet and its application in transcriptome-scale analysis of vernalization and gibberellin responses

    PubMed Central

    2012-01-01

    Background Sugar beet (Beta vulgaris sp. vulgaris) crops account for about 30% of world sugar. Sugar yield is compromised by reproductive growth hence crops must remain vegetative until harvest. Prolonged exposure to cold temperature (vernalization) in the range 6°C to 12°C induces reproductive growth, leading to bolting (rapid elongation of the main stem) and flowering. Spring cultivation of crops in cool temperate climates makes them vulnerable to vernalization and hence bolting, which is initiated in the apical shoot meristem in processes involving interaction between gibberellin (GA) hormones and vernalization. The underlying mechanisms are unknown and genome scale next generation sequencing approaches now offer comprehensive strategies to investigate them; enabling the identification of novel targets for bolting control in sugar beet crops. In this study, we demonstrate the application of an mRNA-Seq based strategy for this purpose. Results There is no sugar beet reference genome, or public expression array platforms. We therefore used RNA-Seq to generate the first reference transcriptome. We next performed digital gene expression profiling using shoot apex mRNA from two sugar beet cultivars with and without applied GA, and also a vernalized cultivar with and without applied GA. Subsequent bioinformatics analyses identified transcriptional changes associated with genotypic difference and experimental treatments. Analysis of expression profiles in response to vernalization and GA treatment suggested previously unsuspected roles for a RAV1-like AP2/B3 domain protein in vernalization and efflux transporters in the GA response. Conclusions Next generation RNA-Seq enabled the generation of the first reference transcriptome for sugar beet and the study of global transcriptional responses in the shoot apex to vernalization and GA treatment, without the need for a reference genome or established array platforms. Comprehensive bioinformatic analysis identified

  19. Recent advances in genomics and transcriptomics of cnidarians.

    PubMed

    Technau, Ulrich; Schwaiger, Michaela

    2015-12-01

    The advent of the genomic era has provided important and surprising insights into the deduced genetic composition of the common ancestor of cnidarians and bilaterians. This has changed our view of how genomes of metazoans evolve and when crucial gene families arose and diverged in animal evolution. Sequencing of several cnidarian genomes showed that cnidarians share a great part of their gene repertoire as well as genome synteny with vertebrates, with fewer gene losses in the anthozoan cnidarian lineage than, for example, in ecdysozoans like Drosophila melanogaster or Caenorhabditis elegans. The Hydra genome, on the other hand, has evolved more rapidly, as indicated by more divergent sequences, more cases of gene loss and many taxonomically restricted genes. Cnidarian genomes also contain a rich repertoire of transcription factors, including those that in bilaterian model organisms regulate the development of key bilaterian traits such as mesoderm, nervous system development and bilaterality. The sea anemone Nematostella vectensis, and possibly cnidarians in general, shares with bilaterians not only its complex gene repertoire but also the regulation of crucial developmental regulatory genes via distal enhancer elements. In addition, epigenetic modifications on DNA and chromatin are shared among eumetazoans. This suggests that most conserved genes present in our genomes today, as well as the mechanisms guiding their expression, evolved before the divergence of cnidarians and bilaterians about 600 Myr ago. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. Soybean (Glycine max) SWEET gene family: insights through comparative genomics, transcriptome profiling and whole genome re-sequence analysis.

    PubMed

    Patil, Gunvant; Valliyodan, Babu; Deshmukh, Rupesh; Prince, Silvas; Nicander, Bjorn; Zhao, Mingzhe; Sonah, Humira; Song, Li; Lin, Li; Chaudhary, Juhi; Liu, Yang; Joshi, Trupti; Xu, Dong; Nguyen, Henry T

    2015-07-11

    SWEET (MtN3_saliva) domain proteins, a recently identified group of efflux transporters, play an indispensable role in sugar efflux, phloem loading, plant-pathogen interaction and reproductive tissue development. The SWEET gene family is predominantly studied in Arabidopsis and members of the family are being investigated in rice. To date, no transcriptome or genomics analysis of soybean SWEET genes has been reported. In the present investigation, we explored the evolutionary aspect of the SWEET gene family in diverse plant species, ranging from primitive single-celled algae to angiosperms, with a major emphasis on Glycine max. Evolutionary features showed expansion and duplication of the SWEET gene family in land plants. Homology searches with BLAST tools and Hidden Markov Model-directed sequence alignments identified 52 SWEET genes that were mapped to 15 chromosomes in the soybean genome as tandem duplication events. Soybean SWEET (GmSWEET) genes showed a wide range of expression profiles in different tissues and developmental stages. Analysis of public transcriptome data and expression profiling using quantitative real time PCR (qRT-PCR) showed that a majority of the GmSWEET genes were confined to reproductive tissue development. Several natural genetic variants (non-synonymous SNPs, premature stop codons and haplotype) were identified in the GmSWEET genes using whole genome re-sequencing data analysis of 106 soybean genotypes. A significant association was observed between SNP-haplogroup and seed sucrose content in three gene clusters on chromosome 6. The present investigation used comparative genomics, transcriptome profiling and whole genome re-sequencing approaches to provide a systematic description of soybean SWEET genes and to identify putative candidates with probable roles in reproductive tissue development. Gene expression profiling at different developmental stages and genomic variation data will serve as an important resource for the soybean research

  1. Comparative whole genome transcriptome and metabolome analyses of five Klebsiella pneumonia strains.

    PubMed

    Lee, Soojin; Kim, Borim; Yang, Jeongmo; Jeong, Daun; Park, Soohyun; Shin, Sang Heum; Kook, Jun Ho; Yang, Kap-Seok; Lee, Jinwon

    2015-11-01

    The integration of transcriptomics and metabolomics can provide precise information on gene-to-metabolite networks for identifying the function of novel genes. The goal of this study was to identify novel gene functions involved in 2,3-butanediol (2,3-BDO) biosynthesis by a comprehensive analysis of the transcriptome and metabolome of five mutated Klebsiella pneumoniae strains (∆wabG = SGSB100, ∆wabG∆budA = SGSB106, ∆wabG∆budB = SGSB107, ∆wabG∆budC = SGSB108, ∆wabG∆budABC = SGSB109). First, the transcriptomes of all five mutants were analyzed and the genes exhibiting reproducible changes in expression were determined. The transcriptome was well conserved among the five strains, and differences in gene expression occurred mainly in genes coding for 2,3-BDO biosynthesis (budA, budB, and budC) and the genes involved in the degradation of reactive oxygen, biosynthesis and transport of arginine, cysteine biosynthesis, sulfur metabolism, oxidoreductase reaction, and formate dehydrogenase reaction. Second, differences in the metabolome (estimated by carbon distribution, CO2 emission, and redox balance) among the five mutant strains due to gene alteration of the 2,3-BDO operon were detected. The functional genomics approach integrating metabolomics and transcriptomics in K. pneumoniae presented here provides an innovative means of identifying novel gene functions involved in 2,3-BDO biosynthesis metabolism and whole cell metabolism.

  2. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts

    PubMed Central

    Liu, Zhanjiang; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Yun; Jiang, Chen; Sun, Luyang; Wang, Ruijia; Zhang, Yu; Zhou, Tao; Zeng, Qifan; Fu, Qiang; Gao, Sen; Li, Ning; Koren, Sergey; Jiang, Yanliang; Zimin, Aleksey; Xu, Peng; Phillippy, Adam M.; Geng, Xin; Song, Lin; Sun, Fanyue; Li, Chao; Wang, Xiaozhu; Chen, Ailu; Jin, Yulin; Yuan, Zihao; Yang, Yujia; Tan, Suxu; Peatman, Eric; Lu, Jianguo; Qin, Zhenkui; Dunham, Rex; Li, Zhaoxia; Sonstegard, Tad; Feng, Jianbin; Danzmann, Roy G.; Schroeder, Steven; Scheffler, Brian; Duke, Mary V.; Ballard, Linda; Kucuktas, Huseyin; Kaltenboeck, Ludmilla; Liu, Haixia; Armbruster, Jonathan; Xie, Yangjie; Kirby, Mona L.; Tian, Yi; Flanagan, Mary Elizabeth; Mu, Weijie; Waldbieser, Geoffrey C.

    2016-01-01

    Catfish represent 12% of teleost or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provide crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance. PMID:27249958

  3. Genome reannotation of the lizard Anolis carolinensis based on 14 adult and embryonic deep transcriptomes

    PubMed Central

    2013-01-01

    Background The green anole lizard, Anolis carolinensis, is a key species for both laboratory and field-based studies of evolutionary genetics, development, neurobiology, physiology, behavior, and ecology. As the first non-avian reptilian genome sequenced, A. carolinensis is also a prime reptilian model for comparison with other vertebrate genomes. The public databases of Ensembl and NCBI have provided a first generation gene annotation of the anole genome that relies primarily on sequence conservation with related species. A second generation annotation based on tissue-specific transcriptomes would provide a valuable resource for molecular studies. Results Here we provide an annotation of the A. carolinensis genome based on de novo assembly of deep transcriptomes of 14 adult and embryonic tissues. This revised annotation describes 59,373 transcripts, compared to 16,533 and 18,939 currently for Ensembl and NCBI, and 22,962 predicted protein-coding genes. A key improvement in this revised annotation is coverage of untranslated region (UTR) sequences, with 79% and 59% of transcripts containing 5’ and 3’ UTRs, respectively. Gaps in genome sequence from the current A. carolinensis build (Anocar2.0) are highlighted by our identification of 16,542 unmapped transcripts, representing 6,695 orthologues, with less than 70% genomic coverage. Conclusions Incorporation of tissue-specific transcriptome sequence into the A. carolinensis genome annotation has markedly improved its utility for comparative and functional studies. Increased UTR coverage allows for more accurate predicted protein sequence and regulatory analysis. This revised annotation also provides an atlas of gene expression specific to adult and embryonic tissues. PMID:23343042

  4. Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome.

    PubMed

    Visser, Erik A; Wegrzyn, Jill L; Steenkamp, Emma T; Myburg, Alexander A; Naidoo, Sanushka

    2015-12-12

    Pines are the most important tree species to the international forestry industry, covering 42 % of the global industrial forest plantation area. One of the most pressing threats to cultivation of some pine species is the pitch canker fungus, Fusarium circinatum, which can have devastating effects in both the field and nursery. Investigation of the Pinus-F. circinatum host-pathogen interaction is crucial for development of effective disease management strategies. As with many non-model organisms, investigation of host-pathogen interactions in pine species is hampered by limited genomic resources. This was partially alleviated through release of the 22 Gbp Pinus taeda v1.01 genome sequence ( http://pinegenome.org/pinerefseq/ ) in 2014. Despite the fact that the fragmented state of the genome may hamper comprehensive transcriptome analysis, it is possible to leverage the inherent redundancy resulting from deep RNA sequencing with Illumina short reads to assemble transcripts in the absence of a completed reference sequence. These data can then be integrated with available genomic data to produce a comprehensive transcriptome resource. The aim of this study was to provide a foundation for gene expression analysis of disease response mechanisms in Pinus patula through transcriptome assembly. Eighteen de novo and two reference based assemblies were produced for P. patula shoot tissue. For this purpose three transcriptome assemblers, Trinity, Velvet/OASES and SOAPdenovo-Trans, were used to maximise diversity and completeness of assembled transcripts. Redundancy in the assembly was reduced using the EvidentialGene pipeline. The resulting 52 Mb P. patula v1.0 shoot transcriptome consists of 52 112 unigenes, 60 % of which could be functionally annotated. The assembled transcriptome will serve as a major genomic resource for future investigation of P. patula and represents the largest gene catalogue produced to date for this species. Furthermore, this assembly can help detect

  5. Genomic and transcriptomic characterization of skull base chordoma

    PubMed Central

    Sa, Jason K.; Lee, In-Hee; Hong, Sang Duk; Kong, Doo-Sik; Nam, Do-Hyun

    2017-01-01

    Skull base chordoma is a rare primary malignant tumor of bone origin that shows a relatively slow growth pattern and locally destructive lesions and can be characterized only by its histologic components. There is no available prognostic or therapeutic biomarker to predict clinical outcome or treatment response, and the molecular mechanisms underlying chordoma development remain unexplored. Therefore, we set out to identify novel somatic variations that are associated with chordoma progression and could potentially be employed as therapeutic targets. Thirteen skull base chordomas were subjected to whole-exome and/or whole-transcriptome sequencing. In the process, we identified chromosomal aberrations in 1p, 7, 10, 13 and 17q, a high frequency of the functional germline SNP of the T gene, rs2305089 (P = 0.0038), and several recurrent alterations, including MUC4, NBPF1 and NPIPB15 mutations and a novel SAMD5-SASH1 gene fusion, for the first time in skull base chordoma. PMID:27901492

  6. Oil Accumulation by the Oleaginous Diatom Fistulifera solaris as Revealed by the Genome and Transcriptome

    PubMed Central

    Veluchamy, Alaguraj; Tanaka, Michihiro; Abida, Heni; Maréchal, Eric; Bowler, Chris; Muto, Masaki; Sunaga, Yoshihiko; Tanaka, Masayoshi; Taniguchi, Takeaki; Fukuda, Yorikane; Nemoto, Michiko; Matsumoto, Mitsufumi; Wong, Pui Shan; Aburatani, Sachiyo; Fujibuchi, Wataru

    2015-01-01

    Oleaginous photosynthetic organisms such as microalgae are promising sources for biofuel production through the generation of carbon-neutral sustainable energy. However, the metabolic mechanisms driving high-rate lipid production in these oleaginous organisms remain unclear, thus impeding efforts to improve productivity through genetic modifications. We analyzed the genome and transcriptome of the oleaginous diatom Fistulifera solaris JPCC DA0580. Next-generation sequencing technology provided evidence of an allodiploid genome structure, suggesting unorthodox molecular evolutionary and genetic regulatory systems for reinforcing metabolic efficiencies. Although major metabolic pathways were shared with nonoleaginous diatoms, transcriptome analysis revealed unique expression patterns, such as concomitant upregulation of fatty acid/triacylglycerol biosynthesis and fatty acid degradation (β-oxidation) in concert with ATP production. This peculiar pattern of gene expression may account for the simultaneous growth and oil accumulation phenotype and may inspire novel biofuel production technology based on this oleaginous microalga. PMID:25634988

  7. Genome-Guided Transcriptome Assembly in the Age of Next-Generation Sequencing

    PubMed Central

    Florea, Liliana D.; Salzberg, Steven L.

    2014-01-01

    Next-generation sequencing technologies provide unprecedented power to explore the repertoire of genes and their alternative splice variants, collectively defining the transcriptome of a species in great detail. However, assembling the short reads into full-length gene and transcript models presents significant computational challenges. We review current algorithms for assembling transcripts and genes from next-generation sequencing reads aligned to a reference genome, and lay out areas for future improvements. PMID:24524156

  8. Finding genome-transcriptome-phenome association with structured association mapping and visualization in GenAMap.

    PubMed

    Curtis, Ross E; Yin, Junming; Kinnaird, Peter; Xing, Eric P

    2012-01-01

    Despite the success of genome-wide association studies in detecting novel disease variants, we are still far from a complete understanding of the mechanisms through which variants cause disease. Most previous studies have considered only genome-phenome associations. However, the integration of transcriptome data may help further elucidate the mechanisms through which genetic mutations lead to disease and uncover potential pathways to target for treatment. We present a novel structured association mapping strategy for finding genome-transcriptome-phenome associations when SNP, gene-expression, and phenotype data are available for the same cohort. We do so via a two-step procedure where genome-transcriptome associations are identified by GFlasso, a sparse regression technique presented previously. Transcriptome-phenome associations are then found by a novel proposed method called gGFlasso, which leverages structure inherent in the genes and phenotypic traits. Due to the complex nature of three-way association results, visualization tools can aid in the discovery of causal SNPs and regulatory mechanisms affecting diseases. Using well-grounded visualization techniques, we have designed new visualizations that filter through large three-way association results to detect interesting SNPs and associated genes and traits. The two-step GFlasso-gGFlasso algorithmic approach and new visualizations are integrated into GenAMap, a visual analytics system for structured association mapping. Results on simulated datasets show that our approach has the potential to increase the sensitivity and specificity of association studies, compared to existing procedures that do not exploit the full structural information of the data. We report results from an analysis on a publicly available mouse dataset, showing that identified SNP-gene-trait associations are compatible with known biology.

  9. The Draft Genome and Transcriptome of the Atlantic Horseshoe Crab, Limulus polyphemus

    PubMed Central

    Ramsdell, Jordan S.; Watson III, Winsor H.; Chabot, Christopher C.

    2017-01-01

    The horseshoe crab, Limulus polyphemus, exhibits robust circadian and circatidal rhythms, but little is known about the molecular mechanisms underlying those rhythms. In this study, horseshoe crabs were collected during the day and night as well as high and low tides, and their muscle and central nervous system tissues were processed for genome and transcriptome sequencing, respectively. The genome assembly resulted in 7.4 × 10(5) contigs with N50 of 4,736, while the transcriptome assembly resulted in 9.3 × 10(4) contigs and N50 of 3,497. Analysis of functional completeness by the identification of putative universal orthologs suggests that the transcriptome has three times more total expected orthologs than the genome. Interestingly, RNA-Seq analysis indicated no statistically significant changes in expression level for any circadian core or accessory gene, but there was significant cycling of several noncircadian transcripts. Overall, these assemblies provide a resource to investigate the Limulus clock systems and provide a large dataset for further exploration into the taxonomy and biology of the Atlantic horseshoe crab. PMID:28265565
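
The N50 values quoted in the abstract above summarize assembly contiguity. As a minimal sketch of the usual definition (the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the total assembly length), the following Python function computes N50 from a list of contig lengths. The function name `n50` and the toy lengths are hypothetical illustrations; they are not taken from the study, whose own pipeline is not described in this record.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (common definition)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contig first
    half_total = sum(ordered) / 2             # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # this contig crosses the halfway mark
            return length


# Hypothetical toy assembly (not data from the study): total length is 54,
# and the running sum 10 + 9 + 8 = 27 reaches half of it at the contig of length 8.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it says nothing about correctness or completeness, which is presumably why the abstract reports an ortholog-based completeness analysis alongside it.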

  10. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics

    PubMed Central

    Patel, Ravi K.; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Chickpea is an important grain legume used as a rich source of protein in the human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides a comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea are available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. Utilities such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that the integrated information content of this database will accelerate functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html. PMID:26322998

  11. The Draft Genome and Transcriptome of the Atlantic Horseshoe Crab, Limulus polyphemus.

    PubMed

    Simpson, Stephen D; Ramsdell, Jordan S; Watson III, Winsor H; Chabot, Christopher C

    2017-01-01

    The horseshoe crab, Limulus polyphemus, exhibits robust circadian and circatidal rhythms, but little is known about the molecular mechanisms underlying those rhythms. In this study, horseshoe crabs were collected during the day and night as well as high and low tides, and their muscle and central nervous system tissues were processed for genome and transcriptome sequencing, respectively. The genome assembly resulted in 7.4 × 10(5) contigs with N50 of 4,736, while the transcriptome assembly resulted in 9.3 × 10(4) contigs and N50 of 3,497. Analysis of functional completeness by the identification of putative universal orthologs suggests that the transcriptome has three times more total expected orthologs than the genome. Interestingly, RNA-Seq analysis indicated no statistically significant changes in expression level for any circadian core or accessory gene, but there was significant cycling of several noncircadian transcripts. Overall, these assemblies provide a resource to investigate the Limulus clock systems and provide a large dataset for further exploration into the taxonomy and biology of the Atlantic horseshoe crab.

  12. Comparative Transcriptome and Chloroplast Genome Analyses of Two Related Dipteronia Species

    PubMed Central

    Zhou, Tao; Chen, Chen; Wei, Yue; Chang, Yongxia; Bai, Guoqing; Li, Zhonghu; Kanwal, Nazish; Zhao, Guifang

    2016-01-01

    Dipteronia (order Sapindales) is an endangered genus endemic to China and has two living species, D. sinensis and D. dyeriana. The plants are closely related to the genus Acer, which is also classified in the order Sapindales. Evolutionary studies on Dipteronia have been hindered by the paucity of information on their genomes and plastids. Here, we used next generation sequencing to characterize the transcriptomes and complete chloroplast genomes of both Dipteronia species. A comparison of the transcriptomes of both species identified a total of 7814 orthologs. Estimation of selection pressures using Ka/Ks ratios showed that only 30 of 5435 orthologous pairs had a ratio significantly >1, i.e., showing positive selection. However, 4041 orthologs had a Ka/Ks < 0.5 (p < 0.05), suggesting that most genes had likely undergone purifying selection. Based on orthologous unigenes, 314 single copy nuclear genes (SCNGs) were identified. Through a combination of de novo and reference guided assembly, plastid genomes were obtained; that of D. sinensis was 157,080 bp and that of D. dyeriana was 157,071 bp. Both plastid genomes encoded 87 protein coding genes, 40 tRNAs, and 8 rRNAs; no significant differences were detected in the size, gene content, and organization of the two plastomes. We used the whole chloroplast genomes to determine the phylogeny of D. sinensis and D. dyeriana and confirmed that the two species were highly divergent. Overall, our study provides comprehensive transcriptomic and chloroplast genomic resources, which will be valuable for future evolutionary studies of Dipteronia. PMID:27790228

  13. Chromosome-level genome assembly and transcriptome of the green alga Chromochloris zofingiensis illuminates astaxanthin production

    DOE PAGES

    Roth, Melissa S.; Cokus, Shawn J.; Gallaher, Sean D.; ...

    2017-05-08

    Microalgae have potential to help meet energy and food demands without exacerbating environmental problems. There is interest in the unicellular green alga Chromochloris zofingiensis, because it produces lipids for biofuels and a highly valuable carotenoid nutraceutical, astaxanthin. Here, to advance understanding of its biology and facilitate commercial development, we present a C. zofingiensis chromosome-level nuclear genome, organelle genomes, and transcriptome from diverse growth conditions. The assembly, derived from a combination of short- and long-read sequencing in conjunction with optical mapping, revealed a compact genome of ~58 Mbp distributed over 19 chromosomes containing 15,274 predicted protein-coding genes. The genome has uniform gene density over chromosomes, low repetitive sequence content (~6%), and a high fraction of protein-coding sequence (~39%) with relatively long coding exons and few coding introns. Functional annotation of gene models identified orthologous families for the majority (~73%) of genes. Synteny analysis uncovered localized but scrambled blocks of genes in putative orthologous relationships with other green algae. Two genes encoding beta-ketolase (BKT), the key enzyme synthesizing astaxanthin, were found in the genome, and both were up-regulated by high light. Isolation and molecular analysis of astaxanthin-deficient mutants showed that BKT1 is required for the production of astaxanthin. Moreover, the transcriptome under high light exposure revealed candidate genes that could be involved in critical yet missing steps of astaxanthin biosynthesis, including ABC transporters, cytochrome P450 enzymes, and an acyltransferase. Finally, the high-quality genome and transcriptome provide insight into the green algal lineage and carotenoid production.

  14. Chromosome-level genome assembly and transcriptome of the green alga Chromochloris zofingiensis illuminates astaxanthin production

    PubMed Central

    Roth, Melissa S.; Cokus, Shawn J.; Gallaher, Sean D.; Walter, Andreas; Lopez, David; Erickson, Erika; Endelman, Benjamin; Westcott, Daniel; Larabell, Carolyn A.; Merchant, Sabeeha S.; Pellegrini, Matteo

    2017-01-01

    Microalgae have potential to help meet energy and food demands without exacerbating environmental problems. There is interest in the unicellular green alga Chromochloris zofingiensis, because it produces lipids for biofuels and a highly valuable carotenoid nutraceutical, astaxanthin. To advance understanding of its biology and facilitate commercial development, we present a C. zofingiensis chromosome-level nuclear genome, organelle genomes, and transcriptome from diverse growth conditions. The assembly, derived from a combination of short- and long-read sequencing in conjunction with optical mapping, revealed a compact genome of ∼58 Mbp distributed over 19 chromosomes containing 15,274 predicted protein-coding genes. The genome has uniform gene density over chromosomes, low repetitive sequence content (∼6%), and a high fraction of protein-coding sequence (∼39%) with relatively long coding exons and few coding introns. Functional annotation of gene models identified orthologous families for the majority (∼73%) of genes. Synteny analysis uncovered localized but scrambled blocks of genes in putative orthologous relationships with other green algae. Two genes encoding beta-ketolase (BKT), the key enzyme synthesizing astaxanthin, were found in the genome, and both were up-regulated by high light. Isolation and molecular analysis of astaxanthin-deficient mutants showed that BKT1 is required for the production of astaxanthin. Moreover, the transcriptome under high light exposure revealed candidate genes that could be involved in critical yet missing steps of astaxanthin biosynthesis, including ABC transporters, cytochrome P450 enzymes, and an acyltransferase. The high-quality genome and transcriptome provide insight into the green algal lineage and carotenoid production. PMID:28484037

  15. Transcriptome characterization and SSR discovery in large-scale loach Paramisgurnus dabryanus (Cobitidae, Cypriniformes).

    PubMed

    Li, Caijuan; Ling, Qufei; Ge, Chen; Ye, Zhuqing; Han, Xiaofei

    2015-02-25

    The large-scale loach (Paramisgurnus dabryanus, Cypriniformes) is a bottom-dwelling freshwater species of fish found mainly in eastern Asia. The natural germplasm resources of this important aquaculture species have recently been threatened by overfishing and artificial propagation. The objective of this study was to obtain the first functional genomic resource and candidate molecular markers for future conservation and breeding research. Illumina paired-end sequencing generated over one hundred million reads that resulted in 71,887 assembled transcripts, with an average length of 1,465 bp. A total of 42,093 (58.56%) protein-coding sequences were predicted, and 43,837 transcripts had significant matches in the NCBI nonredundant protein (Nr) database. 29,389 and 14,419 transcripts were assigned into gene ontology (GO) categories and Eukaryotic Orthologous Groups (KOG), respectively. 22,102 (31.14%) transcripts were mapped to 302 KEGG pathways. In addition, 15,106 candidate SSR markers were identified, with 11,037 pairs of PCR primers designed. Of 400 randomly selected SSR primer pairs that were validated, 364 (91%) were able to produce PCR products. Further tests with 41 loci and 20 large-scale loach specimens collected from the four largest lakes in China showed that 36 (87.8%) loci were polymorphic. The transcriptomic profile and SSR repertoire obtained in this study will facilitate population genetic studies and selective breeding of large-scale loach in the future. Copyright © 2015. Published by Elsevier B.V.
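
The abstract above reports candidate SSR (simple sequence repeat, or microsatellite) markers mined from assembled transcripts but does not say which tool or thresholds were used. As a rough sketch of the general idea only, the Python snippet below scans a sequence for tandem repeats with a regular expression. The function name `find_ssrs`, the motif-length range of 2 to 6 bp, and the minimum of five repeat units are all assumptions chosen for illustration, not the study's settings.

```python
import re


def find_ssrs(seq, min_repeats=5, motif_lengths=(2, 6)):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Assumed, illustrative thresholds: motifs of 2-6 bp repeated at least
    `min_repeats` times. Homopolymer runs can be reported as a
    dinucleotide motif such as "AA"; a real pipeline would filter these.
    """
    shortest, longest = motif_lengths
    # Lazy motif group so the shortest repeating unit is preferred,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (shortest, longest, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical toy transcript fragment: six copies of "AC" starting at index 3.
toy_seq = "TTG" + "AC" * 6 + "GGT"
print(find_ssrs(toy_seq))
```

Only perfect repeats are detected here; compound or interrupted microsatellites, and the primer design step mentioned in the abstract, are outside the scope of this sketch.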

  16. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    PubMed

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species.

  17. The genome and developmental transcriptome of the strongylid nematode Haemonchus contortus

    PubMed Central

    2013-01-01

    Background: The barber's pole worm, Haemonchus contortus, is one of the most economically important parasites of small ruminants worldwide. Although this parasite can be controlled using anthelmintic drugs, resistance against most drugs in common use has become a widespread problem. We provide a draft of the genome and the transcriptomes of all key developmental stages of H. contortus to support biological and biotechnological research areas of this and related parasites. Results: The draft genome of H. contortus is 320 Mb in size and encodes 23,610 protein-coding genes. On a fundamental level, we elucidate transcriptional alterations taking place throughout the life cycle, characterize the parasite's gene silencing machinery, and explore molecules involved in development, reproduction, host-parasite interactions, immunity, and disease. The secretome of H. contortus is particularly rich in peptidases linked to blood-feeding activity and interactions with host tissues, and a diverse array of molecules is involved in complex immune responses. On an applied level, we predict drug targets and identify vaccine molecules. Conclusions: The draft genome and developmental transcriptome of H. contortus provide a major resource to the scientific community for a wide range of genomic, genetic, proteomic, metabolomic, evolutionary, biological, ecological, and epidemiological investigations, and a solid foundation for biotechnological outcomes, including new anthelmintics, vaccines and diagnostic tests. This first draft genome of any strongylid nematode paves the way for a rapid acceleration in our understanding of a wide range of socioeconomically important parasites of one of the largest nematode orders. PMID:23985341

  18. Insights into the Maize Pan-Genome and Pan-Transcriptome

    PubMed Central

    Hirsch, Candice N.; Foerster, Jillian M.; Johnson, James M.; Sekhon, Rajandeep S.; Muttoni, German; Vaillancourt, Brieanne; Peñagaricano, Francisco; Lindquist, Erika; Pedraza, Mary Ann; Barry, Kerrie; de Leon, Natalia; Kaeppler, Shawn M.; Buell, C. Robin

    2014-01-01

    Genomes at the species level are dynamic, with genes present in every individual (core) and genes in a subset of individuals (dispensable) that collectively constitute the pan-genome. Using transcriptome sequencing of seedling RNA from 503 maize (Zea mays) inbred lines to characterize the maize pan-genome, we identified 8681 representative transcript assemblies (RTAs) with 16.4% expressed in all lines and 82.7% expressed in subsets of the lines. Interestingly, with linkage disequilibrium mapping, 76.7% of the RTAs with at least one single nucleotide polymorphism (SNP) could be mapped to a single genetic position, distributed primarily throughout the nonpericentromeric portion of the genome. Stepwise iterative clustering of RTAs suggests, within the context of the genotypes used in this study, that the maize genome is restricted and further sampling of seedling RNA within this germplasm base will result in minimal discovery. Genome-wide association studies based on SNPs and transcript abundance in the pan-genome revealed loci associated with the timing of the juvenile-to-adult vegetative and vegetative-to-reproductive developmental transitions, two traits important for fitness and adaptation. This study revealed the dynamic nature of the maize pan-genome and demonstrated that a substantial portion of variation may lie outside the single reference genome for a species. PMID:24488960
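
    The core versus dispensable distinction described in this abstract can be illustrated with a small classification sketch. It assumes expression calls are already available as a set of lines per transcript; the function name `classify_transcripts`, the toy line names, and the toy transcript identifiers are hypothetical and are not the authors' pipeline or data.

```python
# Minimal sketch of core/dispensable labelling, assuming presence/absence
# expression calls per transcript. All names and data here are hypothetical.
from typing import Dict, Set


def classify_transcripts(
    expression: Dict[str, Set[str]], all_lines: Set[str]
) -> Dict[str, str]:
    """Label each transcript by how many of the surveyed lines express it."""
    labels: Dict[str, str] = {}
    for transcript, lines in expression.items():
        observed = lines & all_lines          # ignore lines outside the panel
        if observed == all_lines:
            labels[transcript] = "core"           # expressed in every line
        elif observed:
            labels[transcript] = "dispensable"    # expressed in a subset only
        else:
            labels[transcript] = "not expressed"
    return labels


# Toy panel of three lines and three transcripts (made up for illustration).
panel = {"line_A", "line_B", "line_C"}
toy_calls = {
    "RTA_1": {"line_A", "line_B", "line_C"},
    "RTA_2": {"line_A"},
    "RTA_3": set(),
}

labels = classify_transcripts(toy_calls, panel)
print(labels)
```

    In a real analysis the presence/absence calls would themselves depend on an expression threshold, which this sketch does not attempt to model.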

  19. Swine transcriptome characterization by combined Iso-Seq and RNA-seq for annotating the emerging long read-based reference genome

    USDA-ARS's Scientific Manuscript database

    PacBio long-read sequencing technology is increasingly popular in genome sequence assembly and transcriptome cataloguing. Recently, a new-generation pig reference genome was assembled based on long reads from this technology. To finely annotate this genome assembly, transcriptomes of nine tissues fr...

  20. Dataset for distribution of SIDER2 elements in the Leishmania major genome and transcriptome.

    PubMed

    Requena, Jose M; Rastrojo, Alberto; Garde, Esther; López, Manuel C; Thomas, M Carmen; Aguado, Begoña

    2017-04-01

    This paper contains data related to the research article entitled "Genomic cartography and proposal of nomenclature for the repeated, interspersed elements of the Leishmania major SIDER2 family and identification of SIDER2-containing transcripts" [1]. SIDER2 elements are repeated sequences, derived from now-extinct retrotransposons, that populate the genomes of protists of the genus Leishmania. This dataset (Supplementary file 1), an inventory of 1100 SIDER2 elements, was generated by surveying the complete L. major genome using bioinformatics tools with further manual refinements. In addition to the genomic distribution of these elements (summarized in Fig. 1), this dataset contains information regarding their association with specific transcripts, based on the recently established transcriptome for L. major [2].

  1. Genome-scale functional profiling of the mammalian AP-1 signaling pathway

    PubMed Central

    Chanda, Sumit K.; White, Suhaila; Orth, Anthony P.; Reisdorph, Richard; Miraglia, Loren; Thomas, Russell S.; DeJesus, Paul; Mason, Daniel E.; Huang, Qihong; Vega, Raquel; Yu, De-Hua; Nelson, Christian G.; Smith, Brendan M.; Terry, Robert; Linford, Alicia S.; Yu, Yang; Chirn, Gung-wei; Song, Chuanzheng; Labow, Mark A.; Cohen, Dalia; King, Frederick J.; Peters, Eric C.; Schultz, Peter G.; Vogt, Peter K.; Hogenesch, John B.; Caldwell, Jeremy S.

    2003-01-01

    Large-scale functional genomics approaches are fundamental to the characterization of mammalian transcriptomes annotated by genome sequencing projects. Although current high-throughput strategies systematically survey either transcriptional or biochemical networks, analogous genome-scale investigations that analyze gene function in mammalian cells have yet to be fully realized. Through transient overexpression analysis, we describe the parallel interrogation of ≈20,000 sequence annotated genes in cancer-related signaling pathways. For experimental validation of these genome data, we apply an integrative strategy to characterize previously unreported effectors of activator protein-1 (AP-1) mediated growth and mitogenic response pathways. These studies identify the ADP-ribosylation factor GTPase-activating protein Centaurin α1 and a Tudor domain-containing hypothetical protein as putative AP-1 regulatory oncogenes. These results provide insight into the composition of the AP-1 signaling machinery and validate this approach as a tractable platform for genome-wide functional analysis. PMID:14514886

  2. Genome-wide transcriptome and expression profile analysis of Phalaenopsis during explant browning.

    PubMed

    Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei

    2015-01-01

    Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at the genome-wide level. We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect secondary metabolism, such as the phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity were found to change at the transcriptional level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning.

  3. Genome-Wide Transcriptome and Expression Profile Analysis of Phalaenopsis during Explant Browning

    PubMed Central

    Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei

    2015-01-01

    Background: Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at the genome-wide level. Methodology/Principal Findings: We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Conclusions/Significance: Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect secondary metabolism, such as the phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity were found to change at the transcriptional level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning.

  4. Transcriptome-scale homoeolog-specific transcript assemblies of bread wheat

    PubMed Central

    2012-01-01

    Background: Bread wheat is one of the world’s most important food crops and considerable efforts have been made to develop genomic resources for this species. This includes an on-going project by the International Wheat Genome Sequencing Consortium to assemble its large and complex genome, which is hexaploid and contains three closely related ‘homoeologous’ copies for each chromosome. This multi-national effort avoids the complications polyploidy entails for correct assembly of the genome by sequencing flow-sorted chromosome arms one at a time. Here we report on an alternate approach, a direct homoeolog-specific assembly of the expressed portion of the genome, the transcriptome. Results: After assessment of the ability of various assemblers to generate homoeolog-specific assemblies, we employed a two-stage assembly process to produce a high-quality assembly of the transcriptome of hexaploid wheat from Roche-454 and Illumina GAIIx paired-end sequence reads. The assembly process made use of a rapid partitioning of expressed sequences into homoeologous clusters, followed by a parallel high-fidelity assembly of each cluster on a 1150-processor compute cloud. We assessed assembly quality through comparison to known wheat gene sequences and found that in ca. 98.5% of cases the assembly was sufficiently accurate for homoeologous triplets to be cleanly separated into either two or three separate contigs. Comparison to publicly available transcript collections suggests that the assembly covers ~75-80% of the complete transcriptome. Conclusions: This work therefore describes the first homoeolog-specific sequence assembly of the wheat transcriptome and provides a reference transcriptome for future wheat research. Furthermore, our assembly methodology is transferable to other polyploid organisms. PMID:22989011

  5. The genomic and transcriptomic landscape of a HeLa cell line.

    PubMed

    Landry, Jonathan J M; Pyl, Paul Theodor; Rausch, Tobias; Zichner, Thomas; Tekkedil, Manu M; Stütz, Adrian M; Jauch, Anna; Aiyar, Raeka S; Pau, Gregoire; Delhomme, Nicolas; Gagneur, Julien; Korbel, Jan O; Huber, Wolfgang; Steinmetz, Lars M

    2013-08-07

    HeLa is the most widely used model cell line for studying human cellular and molecular biology. To date, no genomic reference for this cell line has been released, and experiments have relied on the human reference genome. Effective design and interpretation of molecular genetic studies performed using HeLa cells require accurate genomic information. Here we present a detailed genomic and transcriptomic characterization of a HeLa cell line. We performed DNA and RNA sequencing of a HeLa Kyoto cell line and analyzed its mutational portfolio and gene expression profile. Segmentation of the genome according to copy number revealed a remarkably high level of aneuploidy and numerous large structural variants at unprecedented resolution. Some of the extensive genomic rearrangements are indicative of catastrophic chromosome shattering, known as chromothripsis. Our analysis of the HeLa gene expression profile revealed that several pathways, including cell cycle and DNA repair, exhibit significantly different expression patterns from those in normal human tissues. Our results provide the first detailed account of genomic variants in the HeLa genome, yielding insight into their impact on gene expression and cellular function as well as their origins. This study underscores the importance of accounting for the strikingly aberrant characteristics of HeLa cells when designing and interpreting experiments, and has implications for the use of HeLa as a model of human biology.

  6. Phenotypic, genomic, transcriptomic and proteomic changes in Bacillus cereus after a short-term space flight

    NASA Astrophysics Data System (ADS)

    Su, Longxiang; Zhou, Lisha; Liu, Jinwen; Cen, Zhong; Wu, Chunyan; Wang, Tong; Zhou, Tao; Chang, De; Guo, Yinghua; Fang, Xiangqun; Wang, Junfeng; Li, Tianzhi; Yin, Sanjun; Dai, Wenkui; Zhou, Yuping; Zhao, Jiao; Fang, Chengxiang; Yang, Ruifu; Liu, Changting

    2014-01-01

    The environment in space could affect microorganisms by changing a variety of features, including proliferation rate, cell physiology, cell metabolism, biofilm production, virulence, and drug resistance. However, the relevant mechanisms remain unclear. To explore the effect of a space environment on Bacillus cereus, a strain of B. cereus was sent to space for 398 h by ShenZhou VIII from November 1, 2011 to November 17, 2011. A ground simulation with similar temperature conditions was simultaneously performed as a control. After the flight, the flight and control strains were further analyzed using phenotypic, genomic, transcriptomic and proteomic techniques to explore the divergence of B. cereus in a space environment. The flight strains exhibited a significantly slower growth rate, a significantly higher amikacin resistance level, and changes in metabolism relative to the ground control strain. After the space flight, three polymorphic loci were found in the flight strains LCT-BC25 and LCT-BC235. A combined transcriptome and proteome analysis was performed, and this analysis revealed that the flight strains had changes in genes/proteins relevant to metabolism. In addition, certain genes/proteins that are relevant to structural function, gene expression modification and translation, and virulence were also altered. Our study represents the first documented analysis of the phenotypic, genomic, transcriptomic, and proteomic changes that occur in B. cereus during space flight, and our results could be beneficial to the field of space microbiology.

  7. The duck genome and transcriptome provide insight into an avian influenza virus reservoir species

    PubMed Central

    Chen, Hualan; Zhang, Yong; Qian, Wubin; Kim, Heebal; Gan, Shangquan; Zhao, Yiqiang; Li, Jianwen; Yi, Kang; Feng, Huapeng; Zhu, Pengyang; Li, Bo; Liu, Qiuyue; Fairley, Suan; Magor, Katharine E; Du, Zhenlin; Hu, Xiaoxiang; Goodman, Laurie; Tafer, Hakim; Vignal, Alain; Lee, Taeheon; Kim, Kyu-Won; Sheng, Zheya; An, Yang; Searle, Steve; Herrero, Javier; Groenen, Martien A M; Crooijmans, Richard P M A; Faraut, Thomas; Cai, Qingle; Webster, Robert G; Aldridge, Jerry R; Warren, Wesley C; Bartschat, Sebastian; Kehr, Stephanie; Marz, Manja; Stadler, Peter F; Smith, Jacqueline; Kraus, Robert H S; Zhao, Yaofeng; Ren, Liming; Fei, Jing; Morisson, Mireille; Kaiser, Pete; Griffin, Darren K; Rao, Man; Pitel, Frederique; Wang, Jun; Li, Ning

    2014-01-01

    The duck (Anas platyrhynchos) is one of the principal natural hosts of influenza A viruses. We present the duck genome sequence and perform deep transcriptome analyses to investigate immune-related genes. Our data indicate that the duck possesses a contractive immune gene repertoire, as in chicken and zebra finch, and this repertoire has been shaped through lineage-specific duplications. We identify genes that are responsive to influenza A viruses using the lung transcriptomes of control ducks and ones that were infected with either a highly pathogenic (A/duck/Hubei/49/05) or a weakly pathogenic (A/goose/Hubei/65/05) H5N1 virus. Further, we show how the duck’s defense mechanisms against influenza infection have been optimized through the diversification of its β-defensin and butyrophilin-like repertoires. These analyses, in combination with the genomic and transcriptomic data, provide a resource for characterizing the interaction between host and influenza viruses. PMID:23749191

  8. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution

    PubMed Central

    Clarke, Thomas H.; Garb, Jessica E.; Hayashi, Cheryl Y.; Arensburger, Peter; Ayoub, Nadia A.

    2015-01-01

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). PMID:26058392

  9. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

    PubMed

    Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

    2015-06-08

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. Genome and transcriptome of the porcine whipworm Trichuris suis

    PubMed Central

    Jex, Aaron R.; Nejsum, Peter; Schwarz, Erich M.; Hu, Li; Young, Neil D.; Hall, Ross S.; Korhonen, Pasi K.; Liao, Shengguang; Thamsborg, Stig; Xia, Jinquan; Xu, Pengwei; Wang, Shaowei; Scheerlinck, Jean-Pierre Y.; Hofmann, Andreas; Sternberg, Paul W.; Wang, Jun; Gasser, Robin B.

    2014-01-01

    Trichuris (whipworm) infects 1 billion people worldwide, and causes a disease (trichuriasis) that results in major socioeconomic losses in both humans and pigs. Trichuriasis relates to an inflammation of the large intestine manifested in bloody diarrhoea, and chronic disease can cause malnourishment and stunting in children. Paradoxically, Trichuris of pigs has shown substantial promise as a treatment for human autoimmune disorders, including inflammatory bowel disease (IBD) and multiple sclerosis (MS). Here, we report ~80 megabase (Mb) draft assemblies of the genomes of adult male and female T. suis, and explore stage-, sex- and tissue-specific transcription of messenger and small non-coding RNAs. PMID:24929829

  11. Genome and transcriptome of the porcine whipworm Trichuris suis.

    PubMed

    Jex, Aaron R; Nejsum, Peter; Schwarz, Erich M; Hu, Li; Young, Neil D; Hall, Ross S; Korhonen, Pasi K; Liao, Shengguang; Thamsborg, Stig; Xia, Jinquan; Xu, Pengwei; Wang, Shaowei; Scheerlinck, Jean-Pierre Y; Hofmann, Andreas; Sternberg, Paul W; Wang, Jun; Gasser, Robin B

    2014-07-01

    Trichuris (whipworm) infects 1 billion people worldwide and causes a disease (trichuriasis) that results in major socioeconomic losses in both humans and pigs. Trichuriasis relates to an inflammation of the large intestine manifested in bloody diarrhea, and chronic disease can cause malnourishment and stunting in children. Paradoxically, Trichuris of pigs has shown substantial promise as a treatment for human autoimmune disorders, including inflammatory bowel disease (IBD) and multiple sclerosis. Here we report whole-genome sequencing at ∼140-fold coverage of adult male and female T. suis and ∼80-Mb draft assemblies. We explore stage-, sex- and tissue-specific transcription of mRNAs and small noncoding RNAs.

  12. Defining the transcriptome assembly and its use for genome dynamics and transcriptome profiling studies in pigeonpea (Cajanus cajan L.)

    USDA-ARS's Scientific Manuscript database

    This study reports the generation of large-scale genomic resources for pigeonpea, a so-called ‘orphan crop species’ of the semi-arid tropic regions. Roche FLX/454 sequencing of a normalized cDNA pool prepared from 31 tissues produced 494,353 short transcript reads (STRs). Cluster analysi...

  13. Microbial genomics, transcriptomics and proteomics: new discoveries in decomposition research using complementary methods.

    PubMed

    Baldrian, Petr; López-Mondéjar, Rubén

    2014-02-01

    Molecular methods for the analysis of biomolecules have undergone rapid technological development in the last decade. The advent of next-generation sequencing methods and improvements in instrumental resolution enabled the analysis of complex transcriptome, proteome and metabolome data, as well as a detailed annotation of microbial genomes. The mechanisms of decomposition by model fungi have been described in unprecedented detail by the combination of genome sequencing, transcriptomics and proteomics. The increasing number of available genomes for fungi and bacteria shows that the genetic potential for decomposition of organic matter is widespread among taxonomically diverse microbial taxa, while expression studies document the importance of the regulation of expression in decomposition efficiency. Importantly, high-throughput methods of nucleic acid analysis used for the analysis of metagenomes and metatranscriptomes indicate the high diversity of decomposer communities in natural habitats and their taxonomic composition. Today, the metaproteomics of natural habitats is of interest. In combination with advanced analytical techniques to explore the products of decomposition and the accumulation of information on the genomes of environmentally relevant microorganisms, advanced methods in microbial ecophysiology should increase our understanding of the complex processes of organic matter transformation.

  14. The genome and transcriptome of the enteric parasite Entamoeba invadens, a model for encystation

    PubMed Central

    2013-01-01

    Background: Several eukaryotic parasites form cysts that transmit infection. The process is found in diverse organisms such as Toxoplasma, Giardia, and nematodes. In Entamoeba histolytica this process cannot be induced in vitro, making it difficult to study. In Entamoeba invadens, stage conversion can be induced, but its utility as a model system to study developmental biology has been limited by a lack of genomic resources. We carried out genome and transcriptome sequencing of E. invadens to identify molecular processes involved in stage conversion. Results: We report the sequencing and assembly of the E. invadens genome and use whole transcriptome sequencing to characterize changes in gene expression during encystation and excystation. The E. invadens genome is larger than that of E. histolytica, apparently largely due to expansion of intergenic regions; overall gene number and the machinery for gene regulation are conserved between the species. Over half the genes are regulated during the switch between morphological forms and a key signaling molecule, phospholipase D, appears to regulate encystation. We provide evidence for the occurrence of meiosis during encystation, suggesting that stage conversion may play a key role in recombination between strains. Conclusions: Our analysis demonstrates that a number of core processes are common to encystation between distantly related parasites, including meiosis, lipid signaling and RNA modification. These data provide a foundation for understanding the developmental cascade in the important human pathogen E. histolytica and highlight conserved processes more widely relevant in enteric pathogens. PMID:23889909

  15. The genome and transcriptome of the enteric parasite Entamoeba invadens, a model for encystation.

    PubMed

    Ehrenkaufer, Gretchen M; Weedall, Gareth D; Williams, Daryl; Lorenzi, Hernan A; Caler, Elisabet; Hall, Neil; Singh, Upinder

    2013-07-26

    Several eukaryotic parasites form cysts that transmit infection. The process is found in diverse organisms such as Toxoplasma, Giardia, and nematodes. In Entamoeba histolytica this process cannot be induced in vitro, making it difficult to study. In Entamoeba invadens, stage conversion can be induced, but its utility as a model system to study developmental biology has been limited by a lack of genomic resources. We carried out genome and transcriptome sequencing of E. invadens to identify molecular processes involved in stage conversion. We report the sequencing and assembly of the E. invadens genome and use whole transcriptome sequencing to characterize changes in gene expression during encystation and excystation. The E. invadens genome is larger than that of E. histolytica, apparently largely due to expansion of intergenic regions; overall gene number and the machinery for gene regulation are conserved between the species. Over half the genes are regulated during the switch between morphological forms and a key signaling molecule, phospholipase D, appears to regulate encystation. We provide evidence for the occurrence of meiosis during encystation, suggesting that stage conversion may play a key role in recombination between strains. Our analysis demonstrates that a number of core processes are common to encystation between distantly related parasites, including meiosis, lipid signaling and RNA modification. These data provide a foundation for understanding the developmental cascade in the important human pathogen E. histolytica and highlight conserved processes more widely relevant in enteric pathogens.

  16. Role of genomics and transcriptomics in selection of reintroduction source populations.

    PubMed

    He, Xiaoping; Johansson, Mattias L; Heath, Daniel D

    2016-10-01

    The use and importance of reintroduction as a conservation tool to return a species to its historical range from which it has been extirpated will increase as climate change and human development accelerate habitat loss and population extinctions. Although the number of reintroduction attempts has increased rapidly over the past 2 decades, the success rate is generally low. As a result of population differences in fitness-related traits and divergent responses to environmental stresses, population performance upon reintroduction is highly variable, and it is generally agreed that selecting an appropriate source population is a critical component of a successful reintroduction. Conservation genomics is an emerging field that addresses long-standing challenges in conservation, and the potential for using novel molecular genetic approaches to inform and improve conservation efforts is high. Because the successful establishment and persistence of reintroduced populations is highly dependent on the functional genetic variation and environmental stress tolerance of the source population, we propose the application of conservation genomics and transcriptomics to guide reintroduction practices. Specifically, we propose using genome-wide functional loci to estimate genetic variation of source populations. This estimate can then be used to predict the potential for adaptation. We also propose using transcriptional profiling to measure the expression response of fitness-related genes to environmental stresses as a proxy for acclimation (tolerance) capacity. Appropriate application of conservation genomics and transcriptomics has the potential to dramatically enhance reintroduction success in a time of rapidly declining biodiversity and accelerating environmental change.

  17. Genome and transcriptome of the regeneration-competent flatworm, Macrostomum lignano

    PubMed Central

    Wasik, Kaja; Gurtowski, James; Zhou, Xin; Ramos, Olivia Mendivil; Delás, M. Joaquina; Battistoni, Giorgia; El Demerdash, Osama; Falciatori, Ilaria; Vizoso, Dita B.; Smith, Andrew D.; Ladurner, Peter; Schärer, Lukas; McCombie, W. Richard; Hannon, Gregory J.; Schatz, Michael

    2015-01-01

    The free-living flatworm Macrostomum lignano has an impressive regenerative capacity. Following injury, it can regenerate almost an entirely new organism because of the presence of an abundant somatic stem cell population, the neoblasts. This set of unique properties makes many flatworms attractive organisms for studying the evolution of pathways involved in tissue self-renewal, cell-fate specification, and regeneration. The use of these organisms as models, however, is hampered by the lack of well-assembled and annotated genome sequences, fundamental to modern genetic and molecular studies. Here we report the genomic sequence of M. lignano and an accompanying characterization of its transcriptome. The genome structure of M. lignano is remarkably complex, with ∼75% of its sequence consisting of simple repeats and transposon sequences. This has made high-quality assembly from Illumina reads alone impossible (N50 = 222 bp). We therefore generated 130× coverage by long sequencing reads from the Pacific Biosciences platform to create a substantially improved assembly with an N50 of 64 Kbp. We complemented the reference genome with an assembled and annotated transcriptome, and used both of these datasets in combination to probe gene-expression patterns during regeneration, examining pathways important to stem cell function. PMID:26392545

  18. RSIADB, a collective resource for genome and transcriptome analyses in Rhizoctonia solani AG1 IA.

    PubMed

    Chen, Lei; Ai, Peng; Zhang, Jinfeng; Deng, Qiming; Wang, Shiquan; Li, Shuangcheng; Zhu, Jun; Li, Ping; Zheng, Aiping

    2016-01-01

    Rice [Oryza sativa (L.)] feeds more than half of the world's population. Rhizoctonia solani is a major fungal pathogen of rice causing extreme crop losses in all rice-growing regions of the world. R. solani AG1 IA is a major cause of sheath blight in rice. In this study, we constructed a comprehensive and user-friendly web-based database, RSIADB, to analyse its draft genome and transcriptome. The database was built using the genome sequence (10,489 genes) and annotation information for R. solani AG1 IA. A total of six RNAseq samples of R. solani AG1 IA were also analysed, corresponding to 10, 18, 24, 32, 48 and 72 h after infection of rice leaves. The RSIADB database enables users to search, browse, and download gene sequences for R. solani AG1 IA, and mine the data using BLAST, Sequence Extractor, Browse and Construction Diagram tools that were integrated into the database. RSIADB is an important genomic resource for scientists working with R. solani AG1 IA and will assist researchers in analysing the annotated genome and transcriptome of this pathogen. This resource will facilitate studies on gene function, pathogenesis factors and secreted proteins, as well as provide an avenue for comparative analyses of genes expressed during different stages of infection. Database URL: http://genedenovoweb.ticp.net:81/rsia/index.php.

  19. The Genome and Development-Dependent Transcriptomes of Pyronema confluens: A Window into Fungal Evolution

    PubMed Central

    Traeger, Stefanie; Altegoer, Florian; Freitag, Michael; Gabaldon, Toni; Kempken, Frank; Kumar, Abhishek; Marcet-Houben, Marina; Pöggeler, Stefanie; Stajich, Jason E.; Nowrousian, Minou

    2013-01-01

    Fungi are a large group of eukaryotes found in nearly all ecosystems. More than 250 fungal genomes have already been sequenced, greatly improving our understanding of fungal evolution, physiology, and development. However, for the Pezizomycetes, an early-diverging lineage of filamentous ascomycetes, there is so far only one genome available, namely that of the black truffle, Tuber melanosporum, a mycorrhizal species with unusual subterranean fruiting bodies. To help close the sequence gap among basal filamentous ascomycetes, and to allow conclusions about the evolution of fungal development, we sequenced the genome and assayed transcriptomes during development of Pyronema confluens, a saprobic Pezizomycete with a typical apothecium as fruiting body. With a size of 50 Mb and ∼13,400 protein-coding genes, the genome is more characteristic of higher filamentous ascomycetes than the large, repeat-rich truffle genome; however, some typical features are different in the P. confluens lineage, e.g. the genomic environment of the mating type genes that is conserved in higher filamentous ascomycetes, but only partly conserved in P. confluens. On the other hand, P. confluens has a full complement of fungal photoreceptors, and expression studies indicate that light perception might be similar to distantly related ascomycetes and, thus, represent a basic feature of filamentous ascomycetes. Analysis of spliced RNA-seq sequence reads allowed the detection of natural antisense transcripts for 281 genes. The P. confluens genome contains an unusually high number of predicted orphan genes, many of which are upregulated during sexual development, consistent with the idea of rapid evolution of sex-associated genes. Comparative transcriptomics identified the transcription factor gene pro44 that is upregulated during development in P. confluens and the Sordariomycete Sordaria macrospora. The P. confluens pro44 gene (PCON_06721) was used to complement the S. macrospora pro44 deletion

  20. Deep analysis of wild Vitis flower transcriptome reveals unexplored genome regions associated with sex specification.

    PubMed

    Ramos, Miguel Jesus Nunes; Coito, João Lucas; Fino, Joana; Cunha, Jorge; Silva, Helena; de Almeida, Patrícia Gomes; Costa, Maria Manuela Ribeiro; Amâncio, Sara; Paulo, Octávio S; Rocheta, Margarida

    2017-01-01

    RNA-seq of Vitis during early stages of bud development, in male, female and hermaphrodite flowers, identified new loci outside of annotated gene models, suggesting their involvement in sex establishment. The molecular mechanisms responsible for flower sex specification remain unclear for most plant species. In the case of V. vinifera ssp. vinifera, it is not fully understood what determines hermaphroditism in the domesticated subspecies and male or female flowers in wild dioecious relatives (Vitis vinifera ssp. sylvestris). Here, we describe a de novo assembly of the transcriptome of three flower developmental stages from the three Vitis vinifera flower types. The validation of de novo assembly showed a correlation of 0.825. The main goals of this work were the identification of V. v. sylvestris exclusive transcripts and the characterization of differential gene expression during flower development. RNA from several flower developmental stages was used previously to generate Illumina sequence reads. Through a sequential de novo assembly strategy, one comprehensive transcriptome comprising 95,516 non-redundant transcripts was assembled. From this dataset, 81,064 transcripts were annotated to the V. v. vinifera reference transcriptome and 11,084 were annotated against the V. v. vinifera reference genome. Moreover, we found 3368 transcripts that could not be mapped to the Vitis reference genome. From all the non-redundant transcripts that were assembled, bioinformatics analysis identified 133 specific to V. v. sylvestris and 516 transcripts differentially expressed among the three flower types. The detection of transcription from areas of the genome not currently annotated suggests active transcription of previously unannotated genomic loci during early stages of bud development.

  1. Human genomics. Effect of predicted protein-truncating genetic variants on the human transcriptome.

    PubMed

    Rivas, Manuel A; Pirinen, Matti; Conrad, Donald F; Lek, Monkol; Tsang, Emily K; Karczewski, Konrad J; Maller, Julian B; Kukurba, Kimberly R; DeLuca, David S; Fromer, Menachem; Ferreira, Pedro G; Smith, Kevin S; Zhang, Rui; Zhao, Fengmei; Banks, Eric; Poplin, Ryan; Ruderfer, Douglas M; Purcell, Shaun M; Tukiainen, Taru; Minikel, Eric V; Stenson, Peter D; Cooper, David N; Huang, Katharine H; Sullivan, Timothy J; Nedzel, Jared; Bustamante, Carlos D; Li, Jin Billy; Daly, Mark J; Guigo, Roderic; Donnelly, Peter; Ardlie, Kristin; Sammeth, Michael; Dermitzakis, Emmanouil T; McCarthy, Mark I; Montgomery, Stephen B; Lappalainen, Tuuli; MacArthur, Daniel G

    2015-05-08

    Accurate prediction of the functional effect of genetic variation is critical for clinical genome interpretation. We systematically characterized the transcriptome effects of protein-truncating variants, a class of variants expected to have profound effects on gene function, using data from the Genotype-Tissue Expression (GTEx) and Geuvadis projects. We quantitated tissue-specific and positional effects on nonsense-mediated transcript decay and present an improved predictive model for this decay. We directly measured the effect of variants both proximal and distal to splice junctions. Furthermore, we found that robustness to heterozygous gene inactivation is not due to dosage compensation. Our results illustrate the value of transcriptome data in the functional interpretation of genetic variants.

  2. High density linkage mapping of genomic and transcriptomic SNPs for synteny analysis and anchoring the genome sequence of chickpea

    PubMed Central

    Gaur, Rashmi; Jeena, Ganga; Shah, Niraj; Gupta, Shefali; Pradhan, Seema; Tyagi, Akhilesh K; Jain, Mukesh; Chattopadhyay, Debasis; Bhatia, Sabhyata

    2015-01-01

    This study presents genome-wide discovery of SNPs through next generation sequencing of the genome of Cicer reticulatum. Mapping of the C. reticulatum sequenced reads onto the draft genome assembly of C. arietinum (desi chickpea) resulted in identification of 842,104 genomic SNPs which were utilized along with an additional 36,446 genic SNPs identified from transcriptome sequences of the aforementioned varieties. Two new chickpea Oligo Pool All (OPAs) each having 3,072 SNPs were designed and utilized for SNP genotyping of 129 Recombinant Inbred Lines (RILs). Using Illumina GoldenGate Technology, genotyping data for 5,041 SNPs were generated and combined with the 1,673 marker data from previously published studies to generate a high-resolution linkage map. The map comprised 6,698 markers distributed on eight linkage groups spanning 1083.93 cM with an average inter-marker distance of 0.16 cM. Utility of the present map was demonstrated for improving the anchoring of the earlier reported draft genome sequence of desi chickpea by ~30% and that of kabuli chickpea by 18%. The genetic map reported in this study represents the most dense linkage map of chickpea, with the potential to facilitate efficient anchoring of the draft genome sequences of desi as well as kabuli chickpea varieties. PMID:26303721

  3. High density linkage mapping of genomic and transcriptomic SNPs for synteny analysis and anchoring the genome sequence of chickpea.

    PubMed

    Gaur, Rashmi; Jeena, Ganga; Shah, Niraj; Gupta, Shefali; Pradhan, Seema; Tyagi, Akhilesh K; Jain, Mukesh; Chattopadhyay, Debasis; Bhatia, Sabhyata

    2015-08-25

    This study presents genome-wide discovery of SNPs through next generation sequencing of the genome of Cicer reticulatum. Mapping of the C. reticulatum sequenced reads onto the draft genome assembly of C. arietinum (desi chickpea) resulted in identification of 842,104 genomic SNPs which were utilized along with an additional 36,446 genic SNPs identified from transcriptome sequences of the aforementioned varieties. Two new chickpea Oligo Pool All (OPAs) each having 3,072 SNPs were designed and utilized for SNP genotyping of 129 Recombinant Inbred Lines (RILs). Using Illumina GoldenGate Technology, genotyping data for 5,041 SNPs were generated and combined with the 1,673 marker data from previously published studies to generate a high-resolution linkage map. The map comprised 6,698 markers distributed on eight linkage groups spanning 1083.93 cM with an average inter-marker distance of 0.16 cM. Utility of the present map was demonstrated for improving the anchoring of the earlier reported draft genome sequence of desi chickpea by ~30% and that of kabuli chickpea by 18%. The genetic map reported in this study represents the most dense linkage map of chickpea, with the potential to facilitate efficient anchoring of the draft genome sequences of desi as well as kabuli chickpea varieties.

  4. Clustered Xenopus keratin genes: A genomic, transcriptomic, and proteomic analysis.

    PubMed

    Suzuki, Ken-Ichi T; Suzuki, Miyuki; Shigeta, Mitsuki; Fortriede, Joshua D; Takahashi, Shuji; Mawaribuchi, Shuuji; Yamamoto, Takashi; Taira, Masanori; Fukui, Akimasa

    2017-06-15

    Keratin genes belong to the intermediate filament superfamily and their expression is altered following morphological and physiological changes in vertebrate epithelial cells. Keratin genes are divided into two groups, type I and II, and are clustered on vertebrate genomes, including those of Xenopus species. Various keratin genes have been identified and characterized by their unique expression patterns throughout ontogeny in Xenopus laevis; however, compilation of previously reported and newly identified keratin genes in two Xenopus species is required for our further understanding of keratin gene evolution, not only in amphibians but also in all terrestrial vertebrates. In this study, 120 putative type I and II keratin genes in total were identified based on the genome data from two Xenopus species. We revealed that most of these genes are highly clustered on two homeologous chromosomes, XLA9_10 and XLA2 in X. laevis, and XTR10 and XTR2 in X. tropicalis, which are orthologous to those of human, showing conserved synteny among tetrapods. RNA-Seq data from various embryonic stages and adult tissues highlighted the unique expression profiles of orthologous and homeologous keratin genes in developmental stage- and tissue-specific manners. Moreover, we identified dozens of epidermal keratin proteins from the whole embryo, larval skin, tail, and adult skin using shotgun proteomics. In light of our results, we discuss the radiation, diversification, and unique expression of the clustered keratin genes, which are closely related to epidermal development and terrestrial adaptation during amphibian evolution, including Xenopus speciation.

  5. Transcriptome and Functional Genomics Reveal the Participation of Adenine Phosphoribosyltransferase in Trypanosoma cruzi Resistance to Benznidazole.

    PubMed

    García-Huertas, Paola; Mejía-Jaramillo, Ana María; González, Laura; Triana Chávez, Omar

    2017-03-09

    Currently, the only available treatments for Trypanosoma cruzi are benznidazole (Bz) and nifurtimox (Nfx). The mechanisms of action and resistance to these drugs in this parasite are not completely known. In order to identify differentially expressed transcripts between sensitive and resistant parasites, a massive pyrosequencing of the T. cruzi transcriptome was carried out. Additionally, the 2D gel electrophoresis profile of sensitive and resistant parasites was analyzed and the data were supported with functional genomics. The results showed 133 differentially expressed genes in resistant parasites. The transcriptome analysis revealed the regulation of different genes with several functions and metabolic pathways, which could suggest that resistance in T. cruzi is a multigenic process. Additionally, using transcriptomics, one gene, adenine phosphoribosyltransferase (APRT), was found to be down-regulated in the resistant parasites and its expression profile was confirmed by 2D electrophoresis analysis. The role of this gene in the resistance to Bz was confirmed by overexpressing it in sensitive and resistant parasites. Interestingly, both parasites became more sensitive to Bz and H2O2. This is the first RNA-seq study to identify regulated genes in T. cruzi associated with Bz resistance and to show the role of APRT in T. cruzi resistance. Although T. cruzi regulation is mainly post-transcriptional, the transcriptome analysis, supported by 2D gel analysis and functional genomics, provides an overall idea of the expression profiles of genes under resistance conditions. These results contribute essential information to further the understanding of the mechanisms of action and resistance to Bz in T. cruzi.

  6. Transcriptome sequencing and microarray development for the Manila clam, Ruditapes philippinarum: genomic tools for environmental monitoring.

    PubMed

    Milan, Massimo; Coppe, Alessandro; Reinhardt, Richard; Cancela, Leonor M; Leite, Ricardo B; Saavedra, Carlos; Ciofi, Claudio; Chelazzi, Guido; Patarnello, Tomaso; Bortoluzzi, Stefania; Bargelloni, Luca

    2011-05-12

    The Manila clam, Ruditapes philippinarum, is one of the major aquaculture species in the world and a potential sentinel organism for monitoring the status of marine ecosystems. However, genomic resources for R. philippinarum are still extremely limited. Global analysis of gene expression profiles is increasingly used to evaluate the biological effects of various environmental stressors on aquatic animals under either artificial conditions or in the wild. Here, we report on the development of a transcriptomic platform for global gene expression profiling in the Manila clam. A normalized cDNA library representing a mixture of adult tissues was sequenced using an ultra-high-throughput sequencing technology (Roche 454). A database consisting of 32,606 unique transcripts was constructed, 9,747 (30%) of which could be annotated by similarity. An oligo-DNA microarray platform was designed and applied to profile gene expression of digestive gland and gills. Functional annotation of differentially expressed genes between different tissues was performed by enrichment analysis. Expression analysis of Natural Antisense Transcripts (NAT) was also performed, and bi-directional transcription appears to be a common phenomenon in the R. philippinarum transcriptome. A preliminary study on clam samples collected in a highly polluted area of the Venice Lagoon demonstrated the applicability of genomic tools to environmental monitoring. The transcriptomic platform developed for the Manila clam confirmed the high level of reproducibility of current microarray technology. Next-generation sequencing provided a good representation of the clam transcriptome. Despite the known limitations in transcript annotation and sequence coverage for non-model species, sufficient information was obtained to identify a large set of genes potentially involved in cellular response to environmental stress.

  7. Transcriptome sequencing and microarray development for the Manila clam, Ruditapes philippinarum: genomic tools for environmental monitoring

    PubMed Central

    2011-01-01

    Background The Manila clam, Ruditapes philippinarum, is one of the major aquaculture species in the world and a potential sentinel organism for monitoring the status of marine ecosystems. However, genomic resources for R. philippinarum are still extremely limited. Global analysis of gene expression profiles is increasingly used to evaluate the biological effects of various environmental stressors on aquatic animals under either artificial conditions or in the wild. Here, we report on the development of a transcriptomic platform for global gene expression profiling in the Manila clam. Results A normalized cDNA library representing a mixture of adult tissues was sequenced using an ultra-high-throughput sequencing technology (Roche 454). A database consisting of 32,606 unique transcripts was constructed, 9,747 (30%) of which could be annotated by similarity. An oligo-DNA microarray platform was designed and applied to profile gene expression of digestive gland and gills. Functional annotation of differentially expressed genes between different tissues was performed by enrichment analysis. Expression analysis of Natural Antisense Transcripts (NAT) was also performed, and bi-directional transcription appears to be a common phenomenon in the R. philippinarum transcriptome. A preliminary study on clam samples collected in a highly polluted area of the Venice Lagoon demonstrated the applicability of genomic tools to environmental monitoring. Conclusions The transcriptomic platform developed for the Manila clam confirmed the high level of reproducibility of current microarray technology. Next-generation sequencing provided a good representation of the clam transcriptome. Despite the known limitations in transcript annotation and sequence coverage for non-model species, sufficient information was obtained to identify a large set of genes potentially involved in cellular response to environmental stress. PMID:21569398

  8. Databases and information integration for the Medicago truncatula genome and transcriptome.

    PubMed

    Cannon, Steven B; Crow, John A; Heuer, Michael L; Wang, Xiaohong; Cannon, Ethalinda K S; Dwan, Christopher; Lamblin, Anne-Francoise; Vasdewani, Jayprakash; Mudge, Joann; Cook, Andrew; Gish, John; Cheung, Foo; Kenton, Steve; Kunau, Timothy M; Brown, Douglas; May, Gregory D; Kim, Dongjin; Cook, Douglas R; Roe, Bruce A; Town, Chris D; Young, Nevin D; Retzel, Ernest F

    2005-05-01

    An international consortium is sequencing the euchromatic genespace of Medicago truncatula. Extensive bioinformatic and database resources support the marker-anchored bacterial artificial chromosome (BAC) sequencing strategy. Existing physical and genetic maps and deep BAC-end sequencing help to guide the sequencing effort, while EST databases provide essential resources for genome annotation as well as transcriptome characterization and microarray design. Finished BAC sequences are joined into overlapping sequence assemblies and undergo an automated annotation process that integrates ab initio predictions with EST, protein, and other recognizable features. Because of the sequencing project's international and collaborative nature, data production, storage, and visualization tools are broadly distributed. This paper describes databases and Web resources for the project, which provide support for physical and genetic maps, genome sequence assembly, gene prediction, and integration of EST data. A central project Web site at medicago.org/genome provides access to genome viewers and other resources project-wide, including an Ensembl implementation at medicago.org, physical map and marker resources at mtgenome.ucdavis.edu, and genome viewers at the University of Oklahoma (www.genome.ou.edu), the Institute for Genomic Research (www.tigr.org), and Munich Information for Protein Sequences Center (mips.gsf.de).

  9. Breeding in peach, cherry and plum: from a tissue culture, genetic, transcriptomic and genomic perspective.

    PubMed

    Carrasco, Basilio; Meisel, Lee; Gebauer, Marlene; Garcia-Gonzales, Rolando; Silva, Herman

    2013-01-01

    This review is an overview of traditional and modern breeding methodologies being used to develop new Prunus cultivars (stone fruits) with major emphasis on peach, sweet cherry and Japanese plum. To this end, common breeding tools used to produce seedlings, including in vitro culture tools, are discussed. Additionally, the mechanisms of inheritance of many important agronomical traits are described. Recent advances in stone fruit transcriptomics and genomic resources are providing an understanding of the molecular basis of phenotypic variability as well as the identification of allelic variants and molecular markers. These have potential applications for understanding the genetic diversity of the Prunus species, molecular marker-assisted selection and transgenesis. Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) molecular markers are described as useful tools to describe genetic diversity in peach, sweet cherry and Japanese plum. Additionally, the recently sequenced peach genome and the public release of the sweet cherry genome are discussed in terms of their applicability to breeding programs.

  10. Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses.

    PubMed

    O'Connell, Richard J; Thon, Michael R; Hacquard, Stéphane; Amyotte, Stefan G; Kleemann, Jochen; Torres, Maria F; Damm, Ulrike; Buiate, Ester A; Epstein, Lynn; Alkan, Noam; Altmüller, Janine; Alvarado-Balderrama, Lucia; Bauser, Christopher A; Becker, Christian; Birren, Bruce W; Chen, Zehua; Choi, Jaeyoung; Crouch, Jo Anne; Duvick, Jonathan P; Farman, Mark A; Gan, Pamela; Heiman, David; Henrissat, Bernard; Howard, Richard J; Kabbage, Mehdi; Koch, Christian; Kracher, Barbara; Kubo, Yasuyuki; Law, Audrey D; Lebrun, Marc-Henri; Lee, Yong-Hwan; Miyara, Itay; Moore, Neil; Neumann, Ulla; Nordström, Karl; Panaccione, Daniel G; Panstruga, Ralph; Place, Michael; Proctor, Robert H; Prusky, Dov; Rech, Gabriel; Reinhardt, Richard; Rollins, Jeffrey A; Rounsley, Steve; Schardl, Christopher L; Schwartz, David C; Shenoy, Narmada; Shirasu, Ken; Sikhakolli, Usha R; Stüber, Kurt; Sukno, Serenella A; Sweigard, James A; Takano, Yoshitaka; Takahara, Hiroyuki; Trail, Frances; van der Does, H Charlotte; Voll, Lars M; Will, Isa; Young, Sarah; Zeng, Qiandong; Zhang, Jingze; Zhou, Shiguo; Dickman, Martin B; Schulze-Lefert, Paul; Ver Loren van Themaat, Emiel; Ma, Li-Jun; Vaillancourt, Lisa J

    2012-09-01

    Colletotrichum species are fungal pathogens that devastate crop plants worldwide. Host infection involves the differentiation of specialized cell types that are associated with penetration, growth inside living host cells (biotrophy) and tissue destruction (necrotrophy). We report here genome and transcriptome analyses of Colletotrichum higginsianum infecting Arabidopsis thaliana and Colletotrichum graminicola infecting maize. Comparative genomics showed that both fungi have large sets of pathogenicity-related genes, but families of genes encoding secreted effectors, pectin-degrading enzymes, secondary metabolism enzymes, transporters and peptidases are expanded in C. higginsianum. Genome-wide expression profiling revealed that these genes are transcribed in successive waves that are linked to pathogenic transitions: effectors and secondary metabolism enzymes are induced before penetration and during biotrophy, whereas most hydrolases and transporters are upregulated later, at the switch to necrotrophy. Our findings show that preinvasion perception of plant-derived signals substantially reprograms fungal gene expression and indicate previously unknown functions for particular fungal cell types.

  11. Novel mouse model recapitulates genome and transcriptome alterations in human colorectal carcinomas.

    PubMed

    McNeil, Nicole E; Padilla-Nash, Hesed M; Buishand, Floryne O; Hue, Yue; Ried, Thomas

    2017-03-01

    Human colorectal carcinomas are defined by a nonrandom distribution of genomic imbalances that are characteristic for this disease. Often, these imbalances affect entire chromosomes. Understanding the role of these aneuploidies for carcinogenesis is of utmost importance. Currently, established transgenic mice do not recapitulate the pathognomonic genome aberration profile of human colorectal carcinomas. We have developed a novel model based on the spontaneous transformation of murine colon epithelial cells. During this process, cells progress through stages of pre-immortalization, immortalization and, finally, transformation, and result in tumors when injected into immunocompromised mice. We analyzed our model for genome and transcriptome alterations using ArrayCGH, spectral karyotyping (SKY), and array based gene expression profiling. ArrayCGH revealed a recurrent pattern of genomic imbalances. These results were confirmed by SKY. Comparing these imbalances with orthologous maps of human chromosomes revealed a remarkable overlap. We observed focal deletions of the tumor suppressor genes Trp53 and Cdkn2a/p16. High-level focal genomic amplification included the locus harboring the oncogene Mdm2, which was confirmed by FISH in the form of double minute chromosomes. Array-based global gene expression revealed distinct differences between the sequential steps of spontaneous transformation. Gene expression changes showed significant similarities with human colorectal carcinomas. Pathways most prominently affected included genes involved in chromosomal instability and in epithelial to mesenchymal transition. Our novel mouse model therefore recapitulates the most prominent genome and transcriptome alterations in human colorectal cancer, and might serve as a valuable tool for understanding the dynamic process of tumorigenesis, and for preclinical drug testing. © 2016 Wiley Periodicals, Inc.

  12. Transcriptome sequencing reveals genome-wide variation in molecular evolutionary rate among ferns.

    PubMed

    Grusz, Amanda L; Rothfels, Carl J; Schuettpelz, Eric

    2016-08-30

    Transcriptomics in non-model plant systems has recently reached a point where the examination of nuclear genome-wide patterns in understudied groups is an achievable reality. This progress is especially notable in evolutionary studies of ferns, for which molecular resources to date have been derived primarily from the plastid genome. Here, we utilize transcriptome data in the first genome-wide comparative study of molecular evolutionary rate in ferns. We focus on the ecologically diverse family Pteridaceae, which comprises about 10% of fern diversity and includes the enigmatic vittarioid ferns, an epiphytic, tropical lineage known for dramatically reduced morphologies and radically elongated phylogenetic branch lengths. Using expressed sequence data for 2091 loci, we perform pairwise comparisons of molecular evolutionary rate among 12 species spanning the three largest clades in the family and ask whether previously documented heterogeneity in plastid substitution rates is reflected in their nuclear genomes. We then inquire whether variation in evolutionary rate is being shaped by genes belonging to specific functional categories and test for differential patterns of selection. We find significant, genome-wide differences in evolutionary rate for vittarioid ferns relative to all other lineages within the Pteridaceae, but we recover few significant correlations between faster/slower vittarioid loci and known functional gene categories. We demonstrate that the faster rates characteristic of the vittarioid ferns are likely not driven by positive selection, nor are they unique to any particular type of nucleotide substitution. Our results reinforce recently reviewed mechanisms hypothesized to shape molecular evolutionary rates in vittarioid ferns and provide novel insight into substitution rate variation both within and among fern nuclear genomes.

  13. Genome-Wide Transcriptome Analysis of Cadmium Stress in Rice

    PubMed Central

    Oono, Youko; Yazawa, Takayuki; Kanamori, Hiroyuki; Sasaki, Harumi; Mori, Satomi; Handa, Hirokazu; Matsumoto, Takashi

    2016-01-01

    Rice growth is severely affected by toxic concentrations of the nonessential heavy metal cadmium (Cd). To elucidate the molecular basis of the response to Cd stress, we performed mRNA sequencing of rice following our previous study on exposure to high concentrations of Cd (Oono et al., 2014). In this study, rice plants were hydroponically treated with low concentrations of Cd and approximately 211 million sequence reads were mapped onto the IRGSP-1.0 reference rice genome sequence. Many genes, including some identified under high Cd concentration exposure in our previous study, were found to be responsive to low Cd exposure, with an average of about 11,000 transcripts from each condition. However, genes expressed constitutively across the developmental course responded only slightly to low Cd concentrations, in contrast to their clear response to high Cd concentration, which causes fatal damage to rice seedlings according to phenotypic changes. The expression of metal ion transporter genes tended to correlate with Cd concentration, suggesting the potential of the RNA-Seq strategy to reveal novel Cd-responsive transporters by analyzing gene expression under different Cd concentrations. This study could help to develop novel strategies for improving tolerance to Cd exposure in rice and other cereal crops. PMID:27034955

  14. Transcriptome characterization for genome annotation and functional genomics in Theobroma cacao

    USDA-ARS's Scientific Manuscript database

    Evidence from leaf transcriptome sequencing using two technology platforms, in combination with protein homology and trained ab initio predictions, previously enabled us to build 35,000 gene models in T. cacao (www.cacaogenomedb.org). Here we review the contribution of each data type to cacao gene a...

  15. Genome scale engineering techniques for metabolic engineering.

    PubMed

    Liu, Rongming; Bassalo, Marcelo C; Zeitoun, Ramsey I; Gill, Ryan T

    2015-11-01

    Metabolic engineering has expanded from a focus on designs requiring a small number of genetic modifications to increasingly complex designs driven by advances in genome-scale engineering technologies. Metabolic engineering has been generally defined by the use of iterative cycles of rational genome modifications, strain analysis and characterization, and a synthesis step that fuels additional hypothesis generation. This cycle mirrors the Design-Build-Test-Learn cycle followed throughout various engineering fields that has recently become a defining aspect of synthetic biology. This review will attempt to summarize recent genome-scale design, build, test, and learn technologies and relate their use to a range of metabolic engineering applications. Copyright © 2015 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.

  16. Mapping the C. elegans noncoding transcriptome with a whole-genome tiling microarray.

    PubMed

    He, Housheng; Wang, Jie; Liu, Tao; Liu, X Shirley; Li, Tiantian; Wang, Yunfei; Qian, Zuwei; Zheng, Haixia; Zhu, Xiaopeng; Wu, Tao; Shi, Baochen; Deng, Wei; Zhou, Wei; Skogerbø, Geir; Chen, Runsheng

    2007-10-01

    The number of annotated protein coding genes in the genome of Caenorhabditis elegans is similar to that of other animals, but the extent of its non-protein-coding transcriptome remains unknown. Expression profiling on whole-genome tiling microarrays applied to a mixed-stage C. elegans population verified the expression of 71% of all annotated exons. Only a small fraction (11%) of the polyadenylated transcription is non-annotated and appears to consist of approximately 3200 missed or alternative exons and 7800 small transcripts of unknown function (TUFs). Almost half (44%) of the detected transcriptional output is non-polyadenylated and probably not protein coding, and of this, 70% overlaps the boundaries of protein-coding genes in a complex manner. Specific analysis of small non-polyadenylated transcripts verified 97% of all annotated small ncRNAs and suggested that the transcriptome contains approximately 1200 small (<500 nt) unannotated noncoding loci. After combining overlapping transcripts, we estimate that at least 70% of the total C. elegans genome is transcribed.

  17. Genome-wide investigation and transcriptome analysis of the WRKY gene family in Gossypium.

    PubMed

    Ding, Mingquan; Chen, Jiadong; Jiang, Yurong; Lin, Lifeng; Cao, YueFen; Wang, Minhua; Zhang, Yuting; Rong, Junkang; Ye, Wuwei

    2015-02-01

    WRKY transcription factors play important roles in various stress responses in diverse plant species. In cotton, this family has not been well studied, especially in relation to fiber development. Here, the genomes and transcriptomes of Gossypium raimondii and Gossypium arboreum were investigated to identify fiber development related WRKY genes. This represents the first comprehensive comparative study of WRKY transcription factors in both diploid A and D cotton species. In total, 112 G. raimondii and 109 G. arboreum WRKY genes were identified. No significant gene structure or domain alterations were detected between the two species, but many SNPs were distributed unequally in exon and intron regions. Physical mapping revealed that the WRKY genes in G. arboreum were not located in the corresponding chromosomes of G. raimondii, suggesting extensive chromosome rearrangement in the diploid cotton genomes. The cotton WRKY genes, especially subgroups I and II, have expanded through multiple whole genome duplications and tandem duplications compared with other plant species. Sequence comparison showed many functionally divergent sites between WRKY subgroups, while the genes within each group are under strong purifying selection. Transcriptome analysis suggested that many WRKY genes participate in specific fiber development processes such as fiber initiation, elongation and maturation with different expression patterns between species. Complex WRKY gene expression such as differential Dt and At allelic gene expression in G. hirsutum and alternative splicing events were also observed in both diploid and tetraploid cottons during the fiber development process. In conclusion, this study provides important information on the evolution and function of the WRKY gene family in cotton species.

  18. Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing

    PubMed Central

    2011-01-01

    Background The garden pea, Pisum sativum, is among the best-investigated legume plants and of significant agro-commercial relevance. Pisum sativum has a large and complex genome and accordingly few comprehensive genomic resources exist. Results We analyzed the pea transcriptome at the highest accuracy possible with current technology. We used next generation sequencing with the Roche/454 platform and evaluated and compared a variety of approaches, including diverse tissue libraries, normalization, alternative sequencing technologies, saturation estimation and diverse assembly strategies. We generated libraries from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light treated etiolated seedlings, comprising a total of 450 megabases. Libraries were assembled into 324,428 unigenes in a first pass assembly. A second pass assembly reduced the number to 81,449 unigenes but caused a significant number of chimeras. Analyses of the assemblies identified the assembly step as a major possibility for improvement. By recording frequencies of Arabidopsis orthologs hit by randomly drawn reads and fitting parameters of the saturation curve we concluded that sequencing was exhaustive. For leaf libraries we found normalization allows partial recovery of expression strength in addition to the desired effect of increased coverage. Based on theoretical and biological considerations we concluded that the sequence reads in the database tagged the vast majority of transcripts in the aerial tissues. A pathway representation analysis showed the merits of sampling multiple aerial tissues to increase the number of tagged genes. All results have been made available as a fully annotated database in fasta format. Conclusions We conclude that the approach taken resulted in a high-quality dataset that serves well as a first comprehensive reference set for the model legume pea. We suggest future deep sequencing transcriptome projects of species lacking a genomics backbone will

  19. Integrated clinical, whole-genome, and transcriptome analysis of multisampled lethal metastatic prostate cancer

    PubMed Central

    Bova, G. Steven; Kallio, Heini M.L.; Annala, Matti; Kivinummi, Kati; Högnäs, Gunilla; Häyrynen, Sergei; Rantapero, Tommi; Kivinen, Virpi; Isaacs, William B.; Tolonen, Teemu; Nykter, Matti; Visakorpi, Tapio

    2016-01-01

    We report the first combined analysis of whole-genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole-genome and transcriptome sequence was obtained from nine anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 yr before death. Transcriptome analysis revealed increased expression of androgen receptor (AR)-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only one of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG that was present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today, given this knowledge, the use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations may be critical for effective actionability and merit further study. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials. PMID:27148588

  20. Integrated clinical, whole-genome, and transcriptome analysis of multisampled lethal metastatic prostate cancer.

    PubMed

    Bova, G Steven; Kallio, Heini M L; Annala, Matti; Kivinummi, Kati; Högnäs, Gunilla; Häyrynen, Sergei; Rantapero, Tommi; Kivinen, Virpi; Isaacs, William B; Tolonen, Teemu; Nykter, Matti; Visakorpi, Tapio

    2016-05-01

    We report the first combined analysis of whole-genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole-genome and transcriptome sequence was obtained from nine anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 yr before death. Transcriptome analysis revealed increased expression of androgen receptor (AR)-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only one of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG that was present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today, given this knowledge, the use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations may be critical for effective actionability and merit further study. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials.

  1. Orthology Inference in Nonmodel Organisms Using Transcriptomes and Low-Coverage Genomes: Improving Accuracy and Matrix Occupancy for Phylogenomics

    PubMed Central

    Yang, Ya; Smith, Stephen A.

    2014-01-01

    Orthology inference is central to phylogenomic analyses. Phylogenomic data sets commonly include transcriptomes and low-coverage genomes that are incomplete and contain errors and isoforms. These properties can severely violate the underlying assumptions of orthology inference with existing heuristics. We present a procedure that uses phylogenies for both homology and orthology assignment. The procedure first uses similarity scores to infer putative homologs that are then aligned, constructed into phylogenies, and pruned of spurious branches caused by deep paralogs, misassembly, frameshifts, or recombination. These final homologs are then used to identify orthologs. We explore four alternative tree-based orthology inference approaches, of which two are new. These accommodate gene and genome duplications as well as gene tree discordance. We demonstrate these methods in three published data sets including the grape family, Hymenoptera, and millipedes with divergence times ranging from approximately 100 to over 400 Ma. The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. To explicitly evaluate sources of conflicting phylogenetic signals, we applied serial jackknife analyses of gene regions keeping each locus intact. The methods described here can scale to over 100 taxa. They have been implemented in python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines. All scripts are available from https://bitbucket.org/yangya/phylogenomic_dataset_construction. PMID:25158799

  2. Draft genome of spinach and transcriptome diversity of 120 Spinacia accessions

    PubMed Central

    Xu, Chenxi; Jiao, Chen; Sun, Honghe; Cai, Xiaofeng; Wang, Xiaoli; Ge, Chenhui; Zheng, Yi; Liu, Wenli; Sun, Xuepeng; Xu, Yimin; Deng, Jie; Zhang, Zhonghua; Huang, Sanwen; Dai, Shaojun; Mou, Beiquan; Wang, Quanxi; Fei, Zhangjun; Wang, Quanhua

    2017-01-01

    Spinach is an important leafy vegetable enriched with multiple necessary nutrients. Here we report the draft genome sequence of spinach (Spinacia oleracea, 2n=12), which contains 25,495 protein-coding genes. The spinach genome is highly repetitive with 74.4% of its content in the form of transposable elements. No recent whole genome duplication events are observed in spinach. Genome syntenic analysis between spinach and sugar beet suggests substantial inter- and intra-chromosome rearrangements during the Caryophyllales genome evolution. Transcriptome sequencing of 120 cultivated and wild spinach accessions reveals more than 420 K variants. Our data suggests that S. turkestanica is likely the direct progenitor of cultivated spinach and spinach domestication has a weak bottleneck. We identify 93 domestication sweeps in the spinach genome, some of which are associated with important agronomic traits including bolting, flowering and leaf numbers. This study offers insights into spinach evolution and domestication and provides resources for spinach research and improvement. PMID:28537264

  3. Integrative approaches for large-scale transcriptome-wide association studies

    PubMed Central

    Gusev, Alexander; Ko, Arthur; Shi, Huwenbo; Bhatia, Gaurav; Chung, Wonil; Penninx, Brenda W J H; Jansen, Rick; de Geus, Eco JC; Boomsma, Dorret I; Wright, Fred A; Sullivan, Patrick F; Nikkola, Elina; Alvarez, Marcus; Civelek, Mete; Lusis, Aldons J.; Lehtimäki, Terho; Raitoharju, Emma; Kähönen, Mika; Seppälä, Ilkka; Raitakari, Olli T.; Kuusisto, Johanna; Laakso, Markku; Price, Alkes L.; Pajukanta, Päivi; Pasaniuc, Bogdan

    2016-01-01

    Many genetic variants influence complex traits by modulating gene expression, thus altering the abundance levels of one or multiple proteins. Here, we introduce a powerful strategy that integrates gene expression measurements with summary association statistics from large-scale genome-wide association studies (GWAS) to identify genes whose cis-regulated expression is associated with complex traits. We leverage expression imputation to perform a transcriptome-wide association scan (TWAS) to identify significant expression-trait associations. We applied our approaches to expression data from blood and adipose tissue measured in ~3,000 individuals overall. We imputed gene expression into GWAS data from over 900,000 phenotype measurements to identify 69 novel genes significantly associated with obesity-related traits (BMI, lipids, and height). Many of the novel genes are associated with relevant phenotypes in the Hybrid Mouse Diversity Panel. Our results showcase the power of integrating genotype, gene expression and phenotype to gain insights into the genetic basis of complex traits. PMID:26854917

  4. Integrative approaches for large-scale transcriptome-wide association studies.

    PubMed

    Gusev, Alexander; Ko, Arthur; Shi, Huwenbo; Bhatia, Gaurav; Chung, Wonil; Penninx, Brenda W J H; Jansen, Rick; de Geus, Eco J C; Boomsma, Dorret I; Wright, Fred A; Sullivan, Patrick F; Nikkola, Elina; Alvarez, Marcus; Civelek, Mete; Lusis, Aldons J; Lehtimäki, Terho; Raitoharju, Emma; Kähönen, Mika; Seppälä, Ilkka; Raitakari, Olli T; Kuusisto, Johanna; Laakso, Markku; Price, Alkes L; Pajukanta, Päivi; Pasaniuc, Bogdan

    2016-03-01

    Many genetic variants influence complex traits by modulating gene expression, thus altering the abundance of one or multiple proteins. Here we introduce a powerful strategy that integrates gene expression measurements with summary association statistics from large-scale genome-wide association studies (GWAS) to identify genes whose cis-regulated expression is associated with complex traits. We leverage expression imputation from genetic data to perform a transcriptome-wide association study (TWAS) to identify significant expression-trait associations. We applied our approaches to expression data from blood and adipose tissue measured in ~3,000 individuals overall. We imputed gene expression into GWAS data from over 900,000 phenotype measurements to identify 69 new genes significantly associated with obesity-related traits (BMI, lipids and height). Many of these genes are associated with relevant phenotypes in the Hybrid Mouse Diversity Panel. Our results showcase the power of integrating genotype, gene expression and phenotype to gain insights into the genetic basis of complex traits.
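
    The abstract above describes testing imputed, cis-regulated expression against GWAS summary statistics. One common way to write such a summary-statistic test is z = (w . Z) / sqrt(w' S w), where w holds per-SNP expression weights, Z the GWAS z-scores for the same SNPs, and S their LD (correlation) matrix. The sketch below is a minimal illustration under that assumption; the function name twas_z and all numbers are hypothetical and are not taken from the study.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Gene-level z-score from SNP weights, GWAS z-scores and an LD matrix.

    Assumed form: z = (w . Z) / sqrt(w' S w). All inputs are plain lists;
    `ld` is a square list of lists of SNP-SNP correlations.
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching sizes")
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("w' S w must be positive")
    return numerator / math.sqrt(variance)


# Toy inputs, invented for illustration only.
toy_weights = [1.0, 1.0]
toy_gwas_z = [2.0, 2.0]
independent_snps = [[1.0, 0.0], [0.0, 1.0]]
perfectly_linked_snps = [[1.0, 1.0], [1.0, 1.0]]

z_independent = twas_z(toy_weights, toy_gwas_z, independent_snps)
z_linked = twas_z(toy_weights, toy_gwas_z, perfectly_linked_snps)
```

    With independent SNPs the two signals add, whereas perfectly linked SNPs carry the same signal twice and the LD term in the denominator prevents it from being counted double.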

  5. Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus).

    PubMed

    Thanh, Nguyen Minh; Jung, Hyungtaek; Lyons, Russell E; Njaci, Isaac; Yoon, Byoung-Ha; Chand, Vincent; Tuan, Nguyen Viet; Thu, Vo Thi Minh; Mather, Peter

    2015-10-01

    Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry, however, faces a significant challenge from saltwater intrusion into many low-lying coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. We obtained 12,177,770 reads after trimming and processing, with an average length of 97 bp. De novo assemblies were generated using CLC Genomics Workbench, Trinity and Velvet/Oases, with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478 bp and an N50 length of 506 bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species. Copyright © 2015 Elsevier B.V. All rights reserved.
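
    The assembly statistics quoted above (contig count, N50, annotated fraction) follow standard definitions. As a minimal sketch, assuming the usual definition of N50 as the contig length at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembled bases, they could be computed as below. The helper names n50 and percent and the toy contig lengths are hypothetical; only the counts 37,969 and 66,451 come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole


# Toy contig lengths, invented for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)

# Counts reported in the abstract: annotated contigs out of all CLC contigs.
annotated_pct = percent(37_969, 66_451)
```

    For the toy list, the two longest contigs already cover more than half of the 290 total bases, so the N50 is the length of the second contig. The reported counts reproduce the 57% annotated fraction stated in the abstract.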

  6. Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea.

    PubMed

    Parkin, Isobel A P; Koh, Chushin; Tang, Haibao; Robinson, Stephen J; Kagale, Sateesh; Clarke, Wayne E; Town, Chris D; Nixon, John; Krishnakumar, Vivek; Bidwell, Shelby L; Denoeud, France; Belcram, Harry; Links, Matthew G; Just, Jérémy; Clarke, Carling; Bender, Tricia; Huebert, Terry; Mason, Annaliese S; Pires, J Chris; Barker, Guy; Moore, Jonathan; Walley, Peter G; Manoli, Sahana; Batley, Jacqueline; Edwards, David; Nelson, Matthew N; Wang, Xiyin; Paterson, Andrew H; King, Graham; Bancroft, Ian; Chalhoub, Boulos; Sharpe, Andrew G

    2014-06-10

    Brassica oleracea is a valuable vegetable species that has contributed to human health and nutrition for hundreds of years and comprises multiple distinct cultivar groups with diverse morphological and phytochemical attributes. In addition to this phenotypic wealth, B. oleracea offers unique insights into polyploid evolution, as it results from multiple ancestral polyploidy events and a final Brassiceae-specific triplication event. Further, B. oleracea represents one of the diploid genomes that formed the economically important allopolyploid oilseed, Brassica napus. A deeper understanding of B. oleracea genome architecture provides a foundation for crop improvement strategies throughout the Brassica genus. We generate an assembly representing 75% of the predicted B. oleracea genome using a hybrid Illumina/Roche 454 approach. Two dense genetic maps are generated to anchor almost 92% of the assembled scaffolds to nine pseudo-chromosomes. Over 50,000 genes are annotated and 40% of the genome predicted to be repetitive, thus contributing to the increased genome size of B. oleracea compared to its close relative B. rapa. A snapshot of both the leaf transcriptome and methylome allows comparisons to be made across the triplicated sub-genomes, which resulted from the most recent Brassiceae-specific polyploidy event. Differential expression of the triplicated syntelogs and cytosine methylation levels across the sub-genomes suggest residual marks of the genome dominance that led to the current genome architecture. Although cytosine methylation does not correlate with individual gene dominance, the independent methylation patterns of triplicated copies suggest epigenetic mechanisms play a role in the functional diversification of duplicate genes.

  7. Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea

    PubMed Central

    2014-01-01

    Background Brassica oleracea is a valuable vegetable species that has contributed to human health and nutrition for hundreds of years and comprises multiple distinct cultivar groups with diverse morphological and phytochemical attributes. In addition to this phenotypic wealth, B. oleracea offers unique insights into polyploid evolution, as it results from multiple ancestral polyploidy events and a final Brassiceae-specific triplication event. Further, B. oleracea represents one of the diploid genomes that formed the economically important allopolyploid oilseed, Brassica napus. A deeper understanding of B. oleracea genome architecture provides a foundation for crop improvement strategies throughout the Brassica genus. Results We generate an assembly representing 75% of the predicted B. oleracea genome using a hybrid Illumina/Roche 454 approach. Two dense genetic maps are generated to anchor almost 92% of the assembled scaffolds to nine pseudo-chromosomes. Over 50,000 genes are annotated and 40% of the genome predicted to be repetitive, thus contributing to the increased genome size of B. oleracea compared to its close relative B. rapa. A snapshot of both the leaf transcriptome and methylome allows comparisons to be made across the triplicated sub-genomes, which resulted from the most recent Brassiceae-specific polyploidy event. Conclusions Differential expression of the triplicated syntelogs and cytosine methylation levels across the sub-genomes suggest residual marks of the genome dominance that led to the current genome architecture. Although cytosine methylation does not correlate with individual gene dominance, the independent methylation patterns of triplicated copies suggest epigenetic mechanisms play a role in the functional diversification of duplicate genes. PMID:24916971

  8. Combined Analysis of the Chloroplast Genome and Transcriptome of the Antarctic Vascular Plant Deschampsia antarctica Desv

    PubMed Central

    Lee, Jungeun; Kang, Yoonjee; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok

    2014-01-01

    Background Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. Results The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5′- or 3′-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. Conclusions We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers understand the
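
    The region sizes reported above can be checked against the stated total: in a quadripartite plastome, the total length is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short sketch below does that arithmetic with the figures from the abstract; the function name quadripartite_length is a hypothetical helper introduced only for illustration.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total plastome length: LSC + SSC + two inverted-repeat copies (bp)."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported in the abstract.
total_bp = quadripartite_length(lsc=79_881, ssc=12_519, ir=21_481)

# Unique gene tally as reported: protein-coding + tRNA + rRNA.
unique_genes = 81 + 29 + 4
```

    Both results agree with the abstract: the regions sum to 135,362 bp and the gene categories sum to 114 unique genes.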

  9. Sequencing Overview of Ewing Sarcoma: A Journey across Genomic, Epigenomic and Transcriptomic Landscapes

    PubMed Central

    Sand, Laurens G. L.; Szuhai, Karoly; Hogendoorn, Pancras C. W.

    2015-01-01

    Ewing sarcoma is an aggressive neoplasm occurring predominantly in adolescent Caucasians. At the genome level, a pathognomonic EWSR1-ETS translocation is present. The resulting fusion protein acts as a molecular driver in tumor development and interferes with, among other processes, endogenous transcription and splicing. The Ewing sarcoma cell shows a poorly differentiated, stem cell-like phenotype. Consequently, the cellular origin of Ewing sarcoma is still a hotly debated topic. To further characterize Ewing sarcoma and elucidate the role of the EWSR1-ETS fusion protein, multiple genome-, epigenome- and transcriptome-level studies have been performed. In this review, the data from these studies are combined into a comprehensive overview. Presently, classical morphological predictive markers are used in the clinic, and therapy is predominantly based on systemic chemotherapy in combination with surgical interventions. Sequencing has identified novel predictive markers and candidates for immuno- and targeted therapy, which are summarized in this review. PMID:26193259

  10. Genomic and transcriptomic evidence for scavenging of diverse organic compounds by widespread deep-sea archaea.

    PubMed

    Li, Meng; Baker, Brett J; Anantharaman, Karthik; Jain, Sunit; Breier, John A; Dick, Gregory J

    2015-11-17

    Microbial activity is one of the most important processes to mediate the flux of organic carbon from the ocean surface to the seafloor. However, little is known about the microorganisms that underpin this key step of the global carbon cycle in the deep oceans. Here we present genomic and transcriptomic evidence that five ubiquitous archaeal groups actively use proteins, carbohydrates, fatty acids and lipids as sources of carbon and energy at depths ranging from 800 to 4,950 m in hydrothermal vent plumes and pelagic background seawater across three different ocean basins. Genome-enabled metabolic reconstructions and gene expression patterns show that these marine archaea are motile heterotrophs with extensive mechanisms for scavenging organic matter. Our results shed light on the ecological and physiological properties of ubiquitous marine archaea and highlight their versatile metabolic strategies in deep oceans that might play a critical role in global carbon cycling.

  11. Genomic and transcriptomic evidence for scavenging of diverse organic compounds by widespread deep-sea archaea

    PubMed Central

    Li, Meng; Baker, Brett J.; Anantharaman, Karthik; Jain, Sunit; Breier, John A.; Dick, Gregory J.

    2015-01-01

    Microbial activity is one of the most important processes to mediate the flux of organic carbon from the ocean surface to the seafloor. However, little is known about the microorganisms that underpin this key step of the global carbon cycle in the deep oceans. Here we present genomic and transcriptomic evidence that five ubiquitous archaeal groups actively use proteins, carbohydrates, fatty acids and lipids as sources of carbon and energy at depths ranging from 800 to 4,950 m in hydrothermal vent plumes and pelagic background seawater across three different ocean basins. Genome-enabled metabolic reconstructions and gene expression patterns show that these marine archaea are motile heterotrophs with extensive mechanisms for scavenging organic matter. Our results shed light on the ecological and physiological properties of ubiquitous marine archaea and highlight their versatile metabolic strategies in deep oceans that might play a critical role in global carbon cycling. PMID:26573375

  12. Amplified fragment length polymorphism: an invaluable fingerprinting technique for genomic, transcriptomic, and epigenetic studies.

    PubMed

    Paun, Ovidiu; Schönswetter, Peter

    2012-01-01

    Amplified fragment length polymorphism (AFLP) is a PCR-based technique that uses selective amplification of a subset of digested DNA fragments to generate and compare unique fingerprints for genomes of interest. The power of this method lies mainly in the fact that it does not require prior information regarding the targeted genome, as well as in its high reproducibility and sensitivity for detecting polymorphism at the level of DNA sequence. Widely used for plant and microbial studies, AFLP is employed for a variety of applications, such as to assess genetic diversity within species or among closely related species, to infer population-level phylogenies and biogeographic patterns, to generate genetic maps, and to determine relatedness among cultivars. Variations of standard AFLP methodology have also been developed for targeting additional levels of diversity, such as transcriptomic variation and DNA methylation polymorphism.

  13. The genome and transcriptome of the zoonotic hookworm Ancylostoma ceylanicum identify infection-specific gene families.

    PubMed

    Schwarz, Erich M; Hu, Yan; Antoshechkin, Igor; Miller, Melanie M; Sternberg, Paul W; Aroian, Raffi V

    2015-04-01

    Hookworms infect over 400 million people, stunting and impoverishing them. Sequencing hookworm genomes and finding which genes they express during infection should help in devising new drugs or vaccines against hookworms. Unlike other hookworms, Ancylostoma ceylanicum infects both humans and other mammals, providing a laboratory model for hookworm disease. We determined an A. ceylanicum genome sequence of 313 Mb, with transcriptomic data throughout infection showing expression of 30,738 genes. Approximately 900 genes were upregulated during early infection in vivo, including ASPRs, a cryptic subfamily of activation-associated secreted proteins (ASPs). Genes downregulated during early infection included ion channels and G protein-coupled receptors; this downregulation was observed in both parasitic and free-living nematodes. Later, at the onset of heavy blood feeding, C-lectin genes were upregulated along with genes for secreted clade V proteins (SCVPs), encoding a previously undescribed protein family. These findings provide new drug and vaccine targets and should help elucidate hookworm pathogenesis.

  14. De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis).

    PubMed

    Lok, Si; Paton, Tara A; Wang, Zhuozhi; Kaur, Gaganjot; Walker, Susan; Yuen, Ryan K C; Sung, Wilson W L; Whitney, Joseph; Buchanan, Janet A; Trost, Brett; Singh, Naina; Apresto, Beverly; Chen, Nan; Coole, Matthew; Dawson, Travis J; Ho, Karen; Hu, Zhizhou; Pullenayegum, Sanjeev; Samler, Kozue; Shipstone, Arun; Tsoi, Fiona; Wang, Ting; Pereira, Sergio L; Rostami, Pirooz; Ryan, Carol Ann; Tong, Amy Hin Yan; Ng, Karen; Sundaravadanam, Yogi; Simpson, Jared T; Lim, Burton K; Engstrom, Mark D; Dutton, Christopher J; Kerr, Kevin C R; Franke, Maria; Rapley, William; Wintle, Richard F; Scherer, Stephen W

    2017-02-09

    The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected and moderate coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size is 2.7 Gb estimated by k-mer analysis. We assembled the beaver genome using the new Canu assembler optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short read assembly. We scaffolded the assembly using the exon-gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with an N50 of 278,680 bp and an N50-scaffold of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly was demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well-represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well-endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology. Copyright © 2017 Lok et al.

  15. De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis)

    PubMed Central

    Lok, Si; Paton, Tara A.; Wang, Zhuozhi; Kaur, Gaganjot; Walker, Susan; Yuen, Ryan K. C.; Sung, Wilson W. L.; Whitney, Joseph; Buchanan, Janet A.; Trost, Brett; Singh, Naina; Apresto, Beverly; Chen, Nan; Coole, Matthew; Dawson, Travis J.; Ho, Karen; Hu, Zhizhou; Pullenayegum, Sanjeev; Samler, Kozue; Shipstone, Arun; Tsoi, Fiona; Wang, Ting; Pereira, Sergio L.; Rostami, Pirooz; Ryan, Carol Ann; Tong, Amy Hin Yan; Ng, Karen; Sundaravadanam, Yogi; Simpson, Jared T.; Lim, Burton K.; Engstrom, Mark D.; Dutton, Christopher J.; Kerr, Kevin C. R.; Franke, Maria; Rapley, William; Wintle, Richard F.; Scherer, Stephen W.

    2017-01-01

    The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected and moderate coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size is 2.7 Gb estimated by k-mer analysis. We assembled the beaver genome using the new Canu assembler optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short read assembly. We scaffolded the assembly using the exon–gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with an N50 of 278,680 bp and an N50-scaffold of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly was demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well-represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well-endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology. PMID:28087693
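    The contig N50 quoted in the beaver assembly abstract above is a standard summary statistic: the length L such that contigs of length L or longer together cover at least half of the total assembled length. The following is only a minimal sketch of that definition; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the paper or from any assembler's output.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy lengths (not real assembly data); the total is 54.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))  # 10 + 9 + 8 = 27 reaches half of 54, so this prints 8
```

    The same function applies unchanged to scaffold lengths, which is how a separate scaffold N50 would be obtained.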

  16. Dictyocaulus viviparus genome, variome and transcriptome elucidate lungworm biology and support future intervention

    PubMed Central

    McNulty, Samantha N.; Strübe, Christina; Rosa, Bruce A.; Martin, John C.; Tyagi, Rahul; Choi, Young-Jun; Wang, Qi; Hallsworth Pepin, Kymberlie; Zhang, Xu; Ozersky, Philip; Wilson, Richard K.; Sternberg, Paul W.; Gasser, Robin B.; Mitreva, Makedonka

    2016-01-01

    The bovine lungworm, Dictyocaulus viviparus (order Strongylida), is an important parasite of livestock that causes substantial economic and production losses worldwide. Here we report the draft genome, variome, and developmental transcriptome of D. viviparus. The genome (161 Mb) is smaller than those of related bursate nematodes and encodes fewer proteins (14,171 total). In the first genome-wide assessment of genomic variation in any parasitic nematode, we found a high degree of sequence variability in proteins predicted to be involved in host-parasite interactions. Next, we used extensive RNA sequence data to track gene transcription across the life cycle of D. viviparus, and identified genes that might be important in nematode development and parasitism. Finally, we predicted genes that could be vital in host-parasite interactions, genes that could serve as drug targets, and putative RNAi effectors with a view to developing functional genomic tools. This extensive, well-curated dataset should provide a basis for developing new anthelmintics, vaccines, and improved diagnostic tests and serve as a platform for future investigations of drug resistance and epidemiology of the bovine lungworm and related nematodes. PMID:26856411

  17. Search for sarcoidosis candidate genes by integration of data from genomic, transcriptomic and proteomic studies.

    PubMed

    Maver, Ales; Medica, Igor; Peterlin, Borut

    2009-12-01

    The search for gene candidates in multifactorial diseases such as sarcoidosis can be based on the integration of linkage association data, gene expression data, and protein profile data from genomic, transcriptomic and proteomic studies, respectively. In this study we performed a literature-based search for studies reporting such data, followed by integration of collected information. Different databases were examined: Medline, HuGE Navigator, ArrayExpress and Gene Expression Omnibus (GEO). Candidate genes were defined as genes which were reported in at least 2 different types of omics studies. Genes previously investigated in sarcoidosis were excluded from further analyses. We identified 177 genes associated with sarcoidosis as potential new candidate genes. Subsequently, 9 gene candidates identified to overlap in 2 different types of studies (genomic, transcriptomic and/or proteomic) were consistently reported in at least 3 studies: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214. These genes are involved in regulation of immune response, cellular proliferation, apoptosis, inhibition of protease activity, and lipid metabolism. Exact biological functions of HBEGF, LRIG1, PTPN23, DPM2 and NUP214 remain to be completely elucidated. We propose 9 candidate genes: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214, as genes with high potential for association with sarcoidosis.
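    The selection rule stated in the sarcoidosis abstract above (keep genes reported in at least 2 different types of omics studies, and drop genes already investigated in the disease) can be sketched roughly as follows. This is an assumption-laden illustration, not the authors' pipeline: the function `candidate_genes`, its parameters, and the placeholder gene labels are all hypothetical.

```python
from collections import defaultdict


def candidate_genes(reports, already_studied=frozenset(), min_types=2):
    """Select genes reported in at least `min_types` distinct study types.

    `reports` is an iterable of (gene, study_type) pairs, where study_type
    is a label such as "genomic", "transcriptomic" or "proteomic".
    Genes listed in `already_studied` are excluded from the result.
    """
    types_by_gene = defaultdict(set)
    for gene, study_type in reports:
        types_by_gene[gene].add(study_type)
    return sorted(
        gene
        for gene, study_types in types_by_gene.items()
        if len(study_types) >= min_types and gene not in already_studied
    )


# Hypothetical placeholder records (not data from the study).
toy_reports = [
    ("GENE_A", "genomic"),
    ("GENE_A", "transcriptomic"),
    ("GENE_B", "proteomic"),
    ("GENE_C", "genomic"),
    ("GENE_C", "genomic"),      # repeated type: still counts once
    ("GENE_C", "proteomic"),
]
print(candidate_genes(toy_reports, already_studied={"GENE_C"}))
```

    Using a set per gene means repeated reports of the same study type do not inflate the count, which matches the "different types" wording of the rule.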

  18. Development of Genomic Resources for Pacific Herring through Targeted Transcriptome Pyrosequencing

    PubMed Central

    Roberts, Steven B.; Hauser, Lorenz; Seeb, Lisa W.; Seeb, James E.

    2012-01-01

    Pacific herring (Clupea pallasii) support commercially and culturally important fisheries but have experienced significant additional pressure from a variety of anthropogenic and environmental sources. In order to provide genomic resources to facilitate organismal and population level research, high-throughput pyrosequencing (Roche 454) was carried out on transcriptome libraries from liver and testes samples taken in Prince William Sound, the Bering Sea, and the Gulf of Alaska. Over 40,000 contigs were identified with an average length of 728 bp. We describe an annotated transcriptome as well as a workflow for single nucleotide polymorphism (SNP) discovery and validation. A subset of 96 candidate SNPs, chosen from 10,933 potential SNPs, was tested using a combination of Sanger sequencing and high-resolution melt-curve analysis. Five SNPs supported between-ocean-basin differentiation, while one SNP associated with immune function provided high differentiation between Prince William Sound and Kodiak Island within the Gulf of Alaska. These genomic resources provide a basis for environmental physiology studies and opportunities for marker development and subsequent population structure analysis. PMID:22383979

  19. The capsicum transcriptome DB: a “hot” tool for genomic research

    PubMed Central

    Góngora-Castillo, Elsa; Fajardo-Jaime, Rubén; Fernández-Cortes, Araceli; Jofre-Garfias, Alba E; Lozoya-Gloria, Edmundo; Martínez, Octavio; Ochoa-Alejo, Neftalí; Rivera-Bustamante, Rafael

    2012-01-01

    Chili pepper (Capsicum annuum) is an economically important crop with no available public genome sequence. We describe a genomic resource to facilitate Capsicum annuum research. A collection of Expressed Sequence Tags (ESTs) derived from five C. annuum organs (root, stem, leaf, flower and fruit) were sequenced using the Sanger method and multiple leaf transcriptomes were deeply sampled using GS-pyrosequencing. A hybrid assembly of 1,324,516 raw reads yielded 32,314 high quality contigs as validated by coverage and identity analysis with existing pepper sequences. Overall, 75.5% of the contigs had significant sequence similarity to entries in nucleic acid and protein databases; 23% of the sequences have not been previously reported for C. annuum and expand sequence resources for this species. A MySQL database and a user-friendly Web interface were constructed with search-tools that permit queries of the ESTs including sequence, functional annotation, Gene Ontology classification, metabolic pathways, and assembly information. The Capsicum Transcriptome DB is freely available from http://www.bioingenios.ira.cinvestav.mx:81/Joomla/ PMID:22359434

  20. Transcriptome analysis in Concholepas concholepas (Gastropoda, Muricidae): mining and characterization of new genomic and molecular markers.

    PubMed

    Cárdenas, Leyla; Sánchez, Roland; Gomez, Daniela; Fuenzalida, Gonzalo; Gallardo-Escárate, Cristián; Tanguy, Arnaud

    2011-09-01

    The marine gastropod Concholepas concholepas, locally known as the "loco", is the main target species of the benthonic Chilean fisheries. Genetic and genomic tools are necessary to study the genome of this species in order to understand the molecular basis of its development, growth, and other key traits to improve the management strategies and to identify local adaptation to prevent loss of biodiversity. Here, we use pyrosequencing technologies to generate the first transcriptomic database from adult specimens of the loco. After trimming, a total of 140,756 Expressed Sequence Tag sequences were obtained. Clustering and assembly analysis identified 19,219 contigs and 105,435 singleton sequences. BlastN analysis showed a significant identity with Expressed Sequence Tags of different gastropod species available in public databases. Similarly, BlastX results showed that only 895 out of the total 124,654 had significant hits and may represent novel genes for marine gastropods. From this database, simple sequence repeat motifs were also identified and a total of 38 primer pairs were designed and tested to assess their potential as informative markers and to investigate their cross-species amplification in different related gastropod species. This dataset represents the first publicly available 454 data for a marine gastropod endemic to the southeastern Pacific coast, providing a valuable transcriptomic resource for future efforts of gene discovery and development of functional markers in other marine gastropods.

  1. Phenotypic, transcriptomic, and genomic features of clonal plasma cells in light-chain amyloidosis.

    PubMed

    Paiva, Bruno; Martinez-Lopez, Joaquin; Corchete, Luis A; Sanchez-Vega, Beatriz; Rapado, Inmaculada; Puig, Noemi; Barrio, Santiago; Sanchez, Maria-Luz; Alignani, Diego; Lasa, Marta; García de Coca, Alfonso; Pardal, Emilia; Oriol, Alberto; Garcia, Maria-Esther Gonzalez; Escalante, Fernando; González-López, Tomás J; Palomera, Luis; Alonso, José; Prosper, Felipe; Orfao, Alberto; Vidriales, Maria-Belen; Mateos, María-Victoria; Lahuerta, Juan-Jose; Gutierrez, Norma C; San Miguel, Jesús F

    2016-06-16

    Immunoglobulin light-chain amyloidosis (AL) and multiple myeloma (MM) are 2 distinct monoclonal gammopathies that involve the same cellular compartment: clonal plasma cells (PCs). Despite the fact that knowledge about MM PC biology has significantly increased in the last decade, the same does not apply for AL. Here, we used an integrative phenotypic, molecular, and genomic approach to study clonal PCs from 24 newly diagnosed patients with AL. Through principal-component-analysis, we demonstrated highly overlapping phenotypic profiles between AL and both monoclonal gammopathy of undetermined significance and MM PCs. However, in contrast to MM, highly purified fluorescence-activated cell-sorted clonal PCs from AL (n = 9) showed almost normal transcriptome, with only 38 deregulated genes vs normal PCs; these included a few tumor-suppressor (CDH1, RCAN) and proapoptotic (GLIPR1, FAS) genes. Notwithstanding, clonal PCs in AL (n = 11) were genomically unstable, with a median of 9 copy number alterations (CNAs) per case, many of such CNAs being similar to those found in MM. Whole-exome sequencing (WES) performed in 5 AL patients revealed a median of 15 nonrecurrent mutations per case. Altogether, our results show that in the absence of a unifying mutation by WES, clonal PCs in AL display phenotypic and CNA profiles similar to MM, but their transcriptome is remarkably similar to that of normal PCs. © 2016 by The American Society of Hematology.

  2. Genome-wide transcriptome analysis of Chinese pollination-constant nonastringent persimmon fruit treated with ethanol.

    PubMed

    Luo, Chun; Zhang, Qinglin; Luo, Zhengrong

    2014-02-08

    The persimmon Diospyros kaki Thunb. is an important commercial and deciduous fruit tree. The fruits have proanthocyanidin (PA) content of >25% of the dry weight and are astringent. PAs cause astringency that is often undesirable for human consumption; thus, the removal of astringency is an important practice in the persimmon industry. Soluble PAs can be converted to insoluble PAs by enclosing the fruit in a polyethylene bag containing diluted ethanol. The genomic resource development of the persimmon is delayed because of its large and complex genome. Second-generation sequencing is an efficient technique for generating huge sequences that can represent a large number of genes and their expression levels. We used 454 sequencing for the de novo transcriptome assembly of persimmon fruit treated with 5% ethanol (Tr library) and without treatment as the control (Co library) to investigate the genes and pathways that control PA biosynthesis and other secondary metabolites. We obtained 374.6 Mb in clean nucleotides comprising 624,690 and 626,203 clean sequencing reads from the Tr and Co libraries, respectively. We also identified 83,898 unigenes; 54,719 (~65.2%) unigenes were annotated based on similarity searches with known proteins. Up to 14,954 of the unigenes were assigned to the protein database Clusters of Orthologous Groups (COG), 24,337 were assigned to the term annotation database of Gene Ontology (GO), and 45,506 were assigned to 200 pathways in the database of Kyoto Encyclopedia of Genes and Genomes (KEGG). The two libraries were compared to identify the differentially expressed unigenes. The expression levels of genes involved in PA biosynthesis and tannin coagulation were analysed, and some of them were verified using quantitative real time PCR (qRT-PCR). This study provides abundant genomic data for persimmon and offers comprehensive sequence resources for persimmon research. The transcriptome dataset will improve our understanding of the molecular

  3. Genome and Transcriptome Sequences Reveal the Specific Parasitism of the Nematophagous Purpureocillium lilacinum 36-1

    PubMed Central

    Xie, Jialian; Li, Shaojun; Mo, Chenmi; Xiao, Xueqiong; Peng, Deliang; Wang, Gaofeng; Xiao, Yannong

    2016-01-01

    Purpureocillium lilacinum is a promising nematophagous ascomycete able to adapt to diverse environments and it is also an opportunistic fungus that infects humans. A microbial inoculant of P. lilacinum has been registered to control plant parasitic nematodes. However, the molecular mechanism of the toxicological processes is still unclear because of the relatively few reports on the subject. In this study, using Illumina paired-end sequencing, the draft genome sequence and the transcriptome of P. lilacinum strain 36-1 infecting nematode-eggs were determined. Whole genome alignment indicated that P. lilacinum 36-1 possessed a more dynamic genome in comparison with P. lilacinum India strain. Moreover, a phylogenetic analysis showed that the P. lilacinum 36-1 had a closer relation to entomophagous fungi. The protein-coding genes in P. lilacinum 36-1 occurred much more frequently than they did in other fungi, which was a result of the depletion of repeat-induced point mutations (RIP). Comparative genome and transcriptome analyses revealed the genes that were involved in pathogenicity, particularly in the recognition, adhesion of nematode-eggs, downstream signal transduction pathways and hydrolase genes. By contrast, certain numbers of cellulose and xylan degradation genes and a lack of polysaccharide lyase genes showed the potential of P. lilacinum 36-1 as an endophyte. Notably, the expression of appressorium-formation and antioxidants-related genes exhibited similar infection patterns in P. lilacinum strain 36-1 to those of the model entomophagous fungi Metarhizium spp. These results uncovered the specific parasitism of P. lilacinum and presented the genes responsible for the infection of nematode-eggs. PMID:27486440

  4. Genotyping-by-Sequencing SNP Identification for Crops without a Reference Genome: Using Transcriptome Based Mapping as an Alternative Strategy.

    PubMed

    Berthouly-Salazar, Cécile; Mariac, Cédric; Couderc, Marie; Pouzadoux, Juliette; Floc'h, Jean-Baptiste; Vigouroux, Yves

    2016-01-01

    Next-generation sequencing opens the way for genomic studies of diversity even for non-model crops and animals. Genome reduction techniques are becoming progressively more popular as they allow a fraction of the genome to be sequenced for multiple individuals and/or populations. These techniques are an efficient way to explore genome diversity in non-model crops and animals for which no reference genome is available. Genome reduction techniques emerged with the development of specific pipelines such as UNEAK (Universal Network Enabled Analysis Kit) and Stacks. However, even for non-model crops and animals, transcriptomes are easier to obtain, thereby making it possible to directly map reads. We investigate the direct use of transcriptome as an alternative strategy. Our specific objective was to compare SNPs obtained from the UNEAK pipeline as well as SNPs obtained by directly mapping genotyping-by-sequencing reads on a transcriptome. We assessed the feasibility of both SNP datasets, UNEAK and transcriptome mapping, to investigate the diversity of 91 samples of wild pearl millet sampled across its distribution area. Both approaches produced several tens of thousands of single nucleotide variants, but differed in the way the variants were identified, leading to differences in the frequency spectrum associated with marked differences in the assessment of diversity. Difference in the frequency spectrum significantly biased a large set of diversity analyses as well as detection of selection approaches. However, whatever the approach, we found very similar inference of genetic structure, with three major genetic groups from West, Central, and East Africa. For non-model crops, using transcriptome data as a reference is thus a particularly promising way to obtain a more thorough analysis of datasets generated using genome reduction techniques.

  5. Genotyping-by-Sequencing SNP Identification for Crops without a Reference Genome: Using Transcriptome Based Mapping as an Alternative Strategy

    PubMed Central

    Berthouly-Salazar, Cécile; Mariac, Cédric; Couderc, Marie; Pouzadoux, Juliette; Floc’h, Jean-Baptiste; Vigouroux, Yves

    2016-01-01

    Next-generation sequencing opens the way for genomic studies of diversity even for non-model crops and animals. Genome reduction techniques are becoming progressively more popular as they allow a fraction of the genome to be sequenced for multiple individuals and/or populations. These techniques are an efficient way to explore genome diversity in non-model crops and animals for which no reference genome is available. Genome reduction techniques emerged with the development of specific pipelines such as UNEAK (Universal Network Enabled Analysis Kit) and Stacks. However, even for non-model crops and animals, transcriptomes are easier to obtain, thereby making it possible to directly map reads. We investigate the direct use of transcriptome as an alternative strategy. Our specific objective was to compare SNPs obtained from the UNEAK pipeline as well as SNPs obtained by directly mapping genotyping-by-sequencing reads on a transcriptome. We assessed the feasibility of both SNP datasets, UNEAK and transcriptome mapping, to investigate the diversity of 91 samples of wild pearl millet sampled across its distribution area. Both approaches produced several tens of thousands of single nucleotide variants, but differed in the way the variants were identified, leading to differences in the frequency spectrum associated with marked differences in the assessment of diversity. Difference in the frequency spectrum significantly biased a large set of diversity analyses as well as detection of selection approaches. However, whatever the approach, we found very similar inference of genetic structure, with three major genetic groups from West, Central, and East Africa. For non-model crops, using transcriptome data as a reference is thus a particularly promising way to obtain a more thorough analysis of datasets generated using genome reduction techniques. PMID:27379109

  6. Multi-tissue transcriptome profiles for coho salmon (Oncorhynchus kisutch), a species undergoing rediploidization following whole-genome duplication.

    PubMed

    Kim, Jin-Hyoung; Leong, Jong S; Koop, Ben F; Devlin, Robert H

    2016-02-01

    Salmonids are an important family of fish both from economic and basic research perspectives, and have been subjected to extensive research at whole-animal and molecular levels. Most research to date has been conducted on Atlantic salmon (Salmo salar) and rainbow trout (Oncorhynchus mykiss), but more recently other salmonids have become a focus of study due to their interesting life histories and because of their potential for use in commercial aquaculture. However, molecular biology and genetic analyses for these emerging species are currently hampered due to the lack of extensive genomic resources. To overcome some of these limitations, we have constructed a 43,228 sequence transcriptome from 13 tissues from coho salmon, Oncorhynchus kisutch using de novo transcriptome assembly methods. The transcriptome profiling analysis has provided data distinguishing allelic variation from paralogues that arose during the recent whole-genome duplication event in this family, thus allowing simplified analysis of gene-specific expression. Additionally, 1599 novel coho sequences have been identified through comparison with transcriptomes from two other salmonid species (Atlantic salmon and rainbow trout), and with northern pike. The transcriptome presented here will be useful for genomic analysis of coho salmon and other closely related salmonid species. Crown Copyright © 2015. Published by Elsevier B.V. All rights reserved.

  7. Transcriptome profiling of the demosponge Amphimedon queenslandica reveals genome-wide events that accompany major life cycle transitions

    PubMed Central

    2012-01-01

    Background: The biphasic life cycle with pelagic larva and benthic adult stages is widely observed in the animal kingdom, including the Porifera (sponges), which are the earliest branching metazoans. The demosponge, Amphimedon queenslandica, undergoes metamorphosis from a free-swimming larva into a sessile adult that bears no morphological resemblance to other animals. While the genome of A. queenslandica contains an extensive repertoire of genes very similar to that of complex bilaterians, it is as yet unclear how this is drawn upon to coordinate changing morphological features and ecological demands throughout the sponge life cycle. Results: To identify genome-wide events that accompany the pelagobenthic transition in A. queenslandica, we compared global gene expression profiles at four key developmental stages by sequencing the poly(A) transcriptome using SOLiD technology. Large-scale changes in transcription were observed as sponge larvae settled on the benthos and began metamorphosis. Although previous systematics suggest that the only clear homology between Porifera and other animals is in the embryonic and larval stages, we observed extensive use of genes involved in metazoan-associated cellular processes throughout the sponge life cycle. Sponge-specific transcripts are not over-represented in the morphologically distinct adult; rather, many genes that encode typical metazoan features, such as cell adhesion and immunity, are upregulated. Our analysis further revealed gene families with candidate roles in competence, settlement, and metamorphosis in the sponge, including transcription factors, G-protein coupled receptors and other signaling molecules. Conclusions: This first genome-wide study of the developmental transcriptome in an early branching metazoan highlights major transcriptional events that accompany the pelagobenthic transition and point to a network of regulatory mechanisms that coordinate changes in morphology with shifting environmental demands

  8. Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology

    PubMed Central

    Botton, Alessandro; Galla, Giulio; Conesa, Ana; Bachem, Christian; Ramina, Angelo; Barcaccia, Gianni

    2008-01-01

    Background: After 10 years of use of AFLP (Amplified Fragment Length Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertories of genome- and transcriptome-derived sequences are available in public databases for model, crop and tree species. AFLP marker systems have been and are being extensively exploited for genome scanning and gene mapping, as well as cDNA-AFLP for transcriptome profiling and differentially expressed gene cloning. The evaluation, annotation and classification of genomic markers and expressed transcripts would be of great utility for both functional genomics and systems biology research in plants. This may be achieved by means of the Gene Ontology (GO), consisting of three structured vocabularies (i.e. ontologies) describing genes, transcripts and proteins of any organism in terms of their associated cellular component, biological process and molecular function in a species-independent manner. In this paper, the functional annotation of about 8,000 AFLP-derived ESTs retrieved in the NCBI databases was carried out by using GO terminology. Results: Descriptive statistics on the type, size and nature of gene sequences obtained by means of AFLP technology were calculated. The gene products associated with mRNA transcripts were then classified according to the three main GO vocabularies. A comparison of the functional content of cDNA-AFLP records was also performed by splitting the sequence dataset into monocots and dicots and by comparing them to all annotated ESTs of Arabidopsis and rice, respectively. On the whole, the statistical parameters adopted for the in silico AFLP-derived transcriptome-anchored sequence analysis proved to be critical for obtaining reliable GO results. Such an exhaustive annotation may offer a suitable platform for functional genomics, particularly useful in non-model species. Conclusion: Reliable GO annotations of AFLP-derived sequences can be gathered through the optimization of the experimental steps

  9. Integrative genomic and transcriptomic analysis for pinpointing recurrent alterations of plant homeodomain genes and their clinical significance in breast cancer.

    PubMed

    Yu, Huimei; Jiang, Yuanyuan; Liu, Lanxin; Shan, Wenqi; Chu, Xiaofang; Yang, Zhe; Yang, Zeng-Quan

    2017-02-21

    A wide range of the epigenetic effectors that regulate chromatin modification, gene expression, genomic stability, and DNA repair contain structurally conserved domains called plant homeodomain (PHD) fingers. Alterations of several PHD finger-containing proteins (PHFs) due to genomic amplification, mutations, deletions, and translocations have been linked directly to various types of cancer. However, little is known about the genomic landscape and the clinical significance of PHFs in breast cancer. Hence, we performed a large-scale genomic and transcriptomic analysis of 98 PHF genes in breast cancer using TCGA and METABRIC datasets and correlated the recurrent alterations with clinicopathological features and survival of patients. Different subtypes of breast cancer had different patterns of copy number and expression for each PHF. We identified a subset of PHF genes that was recurrently altered with high prevalence, including PYGO2 (pygopus family PHD finger 2), ZMYND8 (zinc finger, MYND-type containing 8), ASXL1 (additional sex combs like 1) and CHD3 (chromodomain helicase DNA binding protein 3). Copy number increase and overexpression of ZMYND8 were more prevalent in Luminal B subtypes and were significantly associated with shorter survival of breast cancer patients. ZMYND8 was also involved in a positive feedback circuit of the estrogen receptor (ER) pathway, and the expression of ZMYND8 was repressed by the bromodomain and extra terminal (BET) inhibitor in breast cancer. Our findings suggest a promising avenue for future research: to focus on a subset of PHFs to better understand the molecular mechanisms and to identify therapeutic targets in breast cancer.

  10. Integrative genomic and transcriptomic analysis for pinpointing recurrent alterations of plant homeodomain genes and their clinical significance in breast cancer

    PubMed Central

    Yu, Huimei; Jiang, Yuanyuan; Liu, Lanxin; Shan, Wenqi; Chu, Xiaofang; Yang, Zhe; Yang, Zeng-Quan

    2017-01-01

    A wide range of the epigenetic effectors that regulate chromatin modification, gene expression, genomic stability, and DNA repair contain structurally conserved domains called plant homeodomain (PHD) fingers. Alterations of several PHD finger-containing proteins (PHFs) due to genomic amplification, mutations, deletions, and translocations have been linked directly to various types of cancer. However, little is known about the genomic landscape and the clinical significance of PHFs in breast cancer. Hence, we performed a large-scale genomic and transcriptomic analysis of 98 PHF genes in breast cancer using TCGA and METABRIC datasets and correlated the recurrent alterations with clinicopathological features and survival of patients. Different subtypes of breast cancer had different patterns of copy number and expression for each PHF. We identified a subset of PHF genes that was recurrently altered with high prevalence, including PYGO2 (pygopus family PHD finger 2), ZMYND8 (zinc finger, MYND-type containing 8), ASXL1 (additional sex combs like 1) and CHD3 (chromodomain helicase DNA binding protein 3). Copy number increase and overexpression of ZMYND8 were more prevalent in Luminal B subtypes and were significantly associated with shorter survival of breast cancer patients. ZMYND8 was also involved in a positive feedback circuit of the estrogen receptor (ER) pathway, and the expression of ZMYND8 was repressed by the bromodomain and extra terminal (BET) inhibitor in breast cancer. Our findings suggest a promising avenue for future research: to focus on a subset of PHFs to better understand the molecular mechanisms and to identify therapeutic targets in breast cancer. PMID:28055972

  11. Trade-off between Transcriptome Plasticity and Genome Evolution in Cephalopods.

    PubMed

    Liscovitch-Brauer, Noa; Alon, Shahar; Porath, Hagit T; Elstein, Boaz; Unger, Ron; Ziv, Tamar; Admon, Arie; Levanon, Erez Y; Rosenthal, Joshua J C; Eisenberg, Eli

    2017-04-06

    RNA editing, a post-transcriptional process, allows the diversification of proteomes beyond the genomic blueprint; however it is infrequently used among animals for this purpose. Recent reports suggesting increased levels of RNA editing in squids thus raise the question of the nature and effects of these events. We here show that RNA editing is particularly common in behaviorally sophisticated coleoid cephalopods, with tens of thousands of evolutionarily conserved sites. Editing is enriched in the nervous system, affecting molecules pertinent for excitability and neuronal morphology. The genomic sequence flanking editing sites is highly conserved, suggesting that the process confers a selective advantage. Due to the large number of sites, the surrounding conservation greatly reduces the number of mutations and genomic polymorphisms in protein-coding regions. This trade-off between genome evolution and transcriptome plasticity highlights the importance of RNA recoding as a strategy for diversifying proteins, particularly those associated with neural function. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Genomic and transcriptomic insights into the efficient entomopathogenicity of Bacillus thuringiensis.

    PubMed

    Zhu, Lei; Peng, Donghai; Wang, Yueying; Ye, Weixing; Zheng, Jinshui; Zhao, Changming; Han, Dongmei; Geng, Ce; Ruan, Lifang; He, Jin; Yu, Ziniu; Sun, Ming

    2015-09-28

    Bacillus thuringiensis has been globally used as a microbial pesticide for over 70 years. However, information regarding its various adaptations and virulence factors and their roles in the entomopathogenic process remains limited. In this work, we present the complete genomes of two industrially patented Bacillus thuringiensis strains (HD-1 and YBT-1520). A comparative genomic analysis showed a larger and more complicated genome constitution that included novel insecticidal toxicity-related genes (ITRGs). All of the putative ITRGs were summarized according to the steps of infection. A comparative genomic analysis showed that highly toxic strains contained significantly more ITRGs, thereby providing additional strategies for infection, immune evasion, and cadaver utilization. Furthermore, a comparative transcriptomic analysis suggested that a high expression of these ITRGs was a key factor in efficient entomopathogenicity. We identified an active extra urease synthesis system in the highly toxic strains that may aid B. thuringiensis survival in insects (similar to previous results with well-known pathogens). Taken together, these results explain the efficient entomopathogenicity of B. thuringiensis. They provide novel insights into the strategies used by B. thuringiensis to resist and overcome host immune defenses and help identify novel toxicity factors.

  13. Discover hidden splicing variations by mapping personal transcriptomes to personal genomes

    PubMed Central

    Stein, Shayna; Lu, Zhi-xiang; Bahrami-Samani, Emad; Park, Juw Won; Xing, Yi

    2015-01-01

    RNA-seq has become a popular technology for studying genetic variation of pre-mRNA alternative splicing. Commonly used RNA-seq aligners rely on the consensus splice site dinucleotide motifs to map reads across splice junctions. Consequently, genomic variants that create novel splice site dinucleotides may produce splice junction RNA-seq reads that cannot be mapped to the reference genome. We developed and evaluated an approach to identify ‘hidden’ splicing variations in personal transcriptomes, by mapping personal RNA-seq data to personal genomes. Computational analysis and experimental validation indicate that this approach identifies personal specific splice junctions at a low false positive rate. Applying this approach to an RNA-seq data set of 75 individuals, we identified 506 personal specific splice junctions, among which 437 were novel splice junctions not documented in current human transcript annotations. 94 splice junctions had splice site SNPs associated with GWAS signals of human traits and diseases. These involve genes whose splicing variations have been implicated in diseases (such as OAS1), as well as novel associations between alternative splicing and diseases (such as ICA1). Collectively, our work demonstrates that the personal genome approach to RNA-seq read alignment enables the discovery of a large but previously unknown catalog of splicing variations in human populations. PMID:26578562

  14. Genomic and transcriptomic insights into the efficient entomopathogenicity of Bacillus thuringiensis

    PubMed Central

    Zhu, Lei; Peng, Donghai; Wang, Yueying; Ye, Weixing; Zheng, Jinshui; Zhao, Changming; Han, Dongmei; Geng, Ce; Ruan, Lifang; He, Jin; Yu, Ziniu; Sun, Ming

    2015-01-01

    Bacillus thuringiensis has been globally used as a microbial pesticide for over 70 years. However, information regarding its various adaptations and virulence factors and their roles in the entomopathogenic process remains limited. In this work, we present the complete genomes of two industrially patented Bacillus thuringiensis strains (HD-1 and YBT-1520). A comparative genomic analysis showed a larger and more complicated genome constitution that included novel insecticidal toxicity-related genes (ITRGs). All of the putative ITRGs were summarized according to the steps of infection. A comparative genomic analysis showed that highly toxic strains contained significantly more ITRGs, thereby providing additional strategies for infection, immune evasion, and cadaver utilization. Furthermore, a comparative transcriptomic analysis suggested that a high expression of these ITRGs was a key factor in efficient entomopathogenicity. We identified an active extra urease synthesis system in the highly toxic strains that may aid B. thuringiensis survival in insects (similar to previous results with well-known pathogens). Taken together, these results explain the efficient entomopathogenicity of B. thuringiensis. They provide novel insights into the strategies used by B. thuringiensis to resist and overcome host immune defenses and help identify novel toxicity factors. PMID:26411888

  15. Extensive Transcriptomic and Genomic Analysis Provides New Insights about Luminal Breast Cancers

    PubMed Central

    Tishchenko, Inna; Milioli, Heloisa Helena; Riveros, Carlos; Moscato, Pablo

    2016-01-01

    Despite constituting approximately two thirds of all breast cancers, the luminal A and B tumours are poorly classified at both clinical and molecular levels. There are contradictory reports on the nature of these subtypes: some define them as intrinsic entities, others as a continuum. With the aim of addressing these uncertainties and identifying molecular signatures of patients at risk, we conducted a comprehensive transcriptomic and genomic analysis of 2,425 luminal breast cancer samples. Our results indicate that the separation between the molecular luminal A and B subtypes—per definition—is not associated with intrinsic characteristics evident in the differentiation between other subtypes. Moreover, t-SNE and MST-kNN clustering approaches based on 10,000 probes, associated with luminal tumour initiation and/or development, revealed the close connections between luminal A and B tumours, with no evidence of a clear boundary between them. Thus, we considered all luminal tumours as a single heterogeneous group for analysis purposes. We first stratified luminal tumours into two distinct groups by their HER2 gene cluster co-expression: HER2-amplified luminal and ordinary-luminal. The former group is associated with distinct transcriptomic and genomic profiles, and poor prognosis; it comprises approximately 8% of all luminal cases. For the remaining ordinary-luminal tumours we further identified the molecular signature correlated with disease outcomes, exhibiting an approximately continuous gene expression range from low to high risk. Thus, we employed four virtual quantiles to segregate the groups of patients. The clinico-pathological characteristics and ratios of genomic aberrations are concordant with the variations in gene expression profiles, hinting at a progressive staging. The comparison with the current separation into luminal A and B subtypes revealed a substantially improved survival stratification. Concluding, we suggest a review of the definition of

  16. Genome and transcriptome sequencing of the halophilic fungus Wallemia ichthyophaga: haloadaptations present and absent

    PubMed Central

    2013-01-01

    Background The basidiomycete Wallemia ichthyophaga from the phylogenetically distinct class Wallemiomycetes is the most halophilic fungus known to date. It requires at least 10% NaCl and thrives in saturated salt solution. To investigate the genomic basis of this exceptional phenotype, we obtained a de-novo genome sequence of the species type-strain and analysed its transcriptomic response to conditions close to the limits of its lower and upper salinity range. Results The unusually compact genome is 9.6 Mb in size and contains 1.67% repetitive sequences. Only 4884 predicted protein coding genes cover almost three quarters of the sequence. Of 639 differentially expressed genes, two thirds are more expressed at lower salinity. Phylogenomic analysis based on the largest dataset used to date (whole proteomes) positions Wallemiomycetes as a 250-million-year-old sister group of Agaricomycotina. Contrary to the closely related species Wallemia sebi, W. ichthyophaga appears to have lost the ability for sexual reproduction. Several protein families are significantly expanded or contracted in the genome. Among these, there are the P-type ATPase cation transporters, but not the sodium/hydrogen exchanger family. Transcription of all but three cation transporters is not salt dependent. The analysis also reveals a significant enrichment in hydrophobins, which are cell-wall proteins with multiple cellular functions. Half of these are differentially expressed, and most contain an unusually large number of acidic amino acids. This discovery is of particular interest due to the numerous applications of hydrophobins from other fungi in industry, pharmaceutics and medicine. Conclusions W. ichthyophaga is an extremophilic specialist that shows only low levels of adaptability and genetic recombination. This is reflected in the characteristics of its genome and its transcriptomic response to salt.
No unusual traits were observed in common salt-tolerance mechanisms, such as transport of

  17. Draft assembly of elite inbred line PH207 provides insights into genomic and transcriptome diversity in maize

    USDA-ARS's Scientific Manuscript database

    Intense artificial selection over the last 100 years has produced elite maize (Zea mays) inbred lines that combine to produce high-yielding hybrids. To further our understanding of how genome and transcriptome variation contribute to the production of high-yielding hybrids, we generated a draft geno...

  18. Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network

    PubMed Central

    Martín-Jiménez, Cynthia A.; Salazar-Barreto, Diego; Barreto, George E.; González, Janneth

    2017-01-01

    Astrocytes are the most abundant cells of the central nervous system; they have a predominant role in maintaining brain metabolism. In this sense, abnormal metabolic states have been found in different neuropathological diseases. Determination of metabolic states of astrocytes is difficult to model using current experimental approaches given the high number of reactions and metabolites present. Thus, genome-scale metabolic networks derived from transcriptomic data can be used as a framework to elucidate how astrocytes modulate human brain metabolic states during normal conditions and in neurodegenerative diseases. We performed a Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network with the purpose of elucidating a significant portion of the metabolic map of the astrocyte. This is the first global high-quality, manually curated metabolic reconstruction network of a human astrocyte. It includes 5,007 metabolites and 5,659 reactions distributed among 8 cell compartments (extracellular, cytoplasm, mitochondria, endoplasmic reticulum, Golgi apparatus, lysosome, peroxisome and nucleus). Using the reconstructed network, the metabolic capabilities of human astrocytes were calculated and compared both in normal and ischemic conditions. We identified reactions activated in these two states, which can be useful for understanding the astrocytic pathways that are affected during brain disease. Additionally, we showed that the flux distributions obtained in the model are in accordance with literature-based findings. To date, this is the most complete representation of the human astrocyte in terms of inclusion of genes, proteins, reactions and metabolic pathways, being a useful guide for in-silico analysis of several metabolic behaviors of the astrocyte during normal and pathologic states. PMID:28243200

  19. Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network.

    PubMed

    Martín-Jiménez, Cynthia A; Salazar-Barreto, Diego; Barreto, George E; González, Janneth

    2017-01-01

    Astrocytes are the most abundant cells of the central nervous system; they have a predominant role in maintaining brain metabolism. In this sense, abnormal metabolic states have been found in different neuropathological diseases. Determination of metabolic states of astrocytes is difficult to model using current experimental approaches given the high number of reactions and metabolites present. Thus, genome-scale metabolic networks derived from transcriptomic data can be used as a framework to elucidate how astrocytes modulate human brain metabolic states during normal conditions and in neurodegenerative diseases. We performed a Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network with the purpose of elucidating a significant portion of the metabolic map of the astrocyte. This is the first global high-quality, manually curated metabolic reconstruction network of a human astrocyte. It includes 5,007 metabolites and 5,659 reactions distributed among 8 cell compartments (extracellular, cytoplasm, mitochondria, endoplasmic reticulum, Golgi apparatus, lysosome, peroxisome and nucleus). Using the reconstructed network, the metabolic capabilities of human astrocytes were calculated and compared both in normal and ischemic conditions. We identified reactions activated in these two states, which can be useful for understanding the astrocytic pathways that are affected during brain disease. Additionally, we showed that the flux distributions obtained in the model are in accordance with literature-based findings. To date, this is the most complete representation of the human astrocyte in terms of inclusion of genes, proteins, reactions and metabolic pathways, being a useful guide for in-silico analysis of several metabolic behaviors of the astrocyte during normal and pathologic states.

  20. Analysis of the Legionella longbeachae genome and transcriptome uncovers unique strategies to cause Legionnaires' disease.

    PubMed

    Cazalet, Christel; Gomez-Valero, Laura; Rusniok, Christophe; Lomma, Mariella; Dervins-Ravault, Delphine; Newton, Hayley J; Sansom, Fiona M; Jarraud, Sophie; Zidane, Nora; Ma, Laurence; Bouchier, Christiane; Etienne, Jerôme; Hartland, Elizabeth L; Buchrieser, Carmen

    2010-02-19

    Legionella pneumophila and L. longbeachae are two species of a large genus of bacteria that are ubiquitous in nature. L. pneumophila is mainly found in natural and artificial water circuits while L. longbeachae is mainly present in soil. Under the appropriate conditions both species are human pathogens, capable of causing a severe form of pneumonia termed Legionnaires' disease. Here we report the sequencing and analysis of four L. longbeachae genomes: one complete genome sequence of L. longbeachae strain NSW150 serogroup (Sg) 1, and three draft genome sequences, another belonging to Sg1 and two to Sg2. The genome organization and gene content of the four L. longbeachae genomes are highly conserved, indicating strong pressure for niche adaptation. Analysis and comparison of L. longbeachae strain NSW150 with L. pneumophila revealed common but also unexpected features specific to this pathogen. The interaction with host cells shows distinct features from L. pneumophila, as L. longbeachae possesses a unique repertoire of putative Dot/Icm type IV secretion system substrates, eukaryotic-like and eukaryotic domain proteins, and encodes additional secretion systems. However, analysis of the ability of a dotA mutant of L. longbeachae NSW150 to replicate in Acanthamoeba castellanii and in a mouse lung infection model showed that the Dot/Icm type IV secretion system is also essential for the virulence of L. longbeachae. In contrast to L. pneumophila, L. longbeachae does not encode flagella, thereby providing a possible explanation for differences in mouse susceptibility to infection between the two pathogens. Furthermore, transcriptome analysis revealed that L. longbeachae has a less pronounced biphasic life cycle as compared to L. pneumophila, and genome analysis and electron microscopy suggested that L. longbeachae is encapsulated. These species-specific differences may account for the different environmental niches and disease epidemiology of these two Legionella

  1. Identification of G protein coupled receptors for opsines and neurohormones in Rhodnius prolixus. Genomic and transcriptomic analysis.

    PubMed

    Ons, Sheila; Lavore, Andrés; Sterkel, Marcos; Wulff, Juan Pedro; Sierra, Ivana; Martínez-Barnetche, Jesús; Rodriguez, Mario Henry; Rivera-Pomar, Rolando

    2016-02-01

    The importance of Chagas disease motivated the scientific effort to obtain the complete genomic sequence of the vector species Rhodnius prolixus; this information is also relevant to the understanding of triatomine biology in general. The central nervous system is the key regulator of insect physiology and behavior. Neurohormones (neuropeptides and biogenic amines) are the chemical messengers involved in the regulation and integration of neuroendocrine signals. In insects, this signaling is mainly mediated by the interaction of neurohormone ligands with G protein coupled receptors (GPCRs). The recently sequenced R. prolixus genome provides us with the opportunity to analyze this important family of genes in triatomines, supplying relevant information for further functional studies. Next-generation sequencing methods offer an excellent opportunity for transcriptomic exploration in key organs and tissues in the presence of a reference genome as well as when a reference genome is not available. We undertook a genomic analysis to obtain a genome-wide inventory of opsins and the GPCRs for neurohormones in R. prolixus. Furthermore, we performed a transcriptomic analysis of the R. prolixus central nervous system, focusing on neuropeptide precursor genes and on neurohormone and opsin GPCRs. In addition, we mined the whole transcriptomes of Triatoma dimidiata, Triatoma infestans and Triatoma pallidipennis, three triatomine species of sanitary relevance, to identify neuropeptide precursor and GPCR genes. Our study reveals a high degree of sequence conservation in the molecular components of the neuroendocrine system of triatomines.

  2. Genome-wide methylation and transcriptome analysis in penile carcinoma: uncovering new molecular markers.

    PubMed

    Kuasne, Hellen; Cólus, Ilce Mara de Syllos; Busso, Ariane Fidelis; Hernandez-Vargas, Hector; Barros-Filho, Mateus Camargo; Marchi, Fabio Albuquerque; Scapulatempo-Neto, Cristovam; Faria, Eliney Ferreira; Lopes, Ademar; Guimarães, Gustavo Cardoso; Herceg, Zdenko; Rogatto, Silvia Regina

    2015-01-01

    Despite penile carcinoma (PeCa) being a relatively rare neoplasm, it remains an important public health issue for poor and developing countries. Contrary to most tumors, limited data are available for markers that are capable of assisting in diagnosis, prognosis, and treatment of PeCa. We aimed to identify molecular markers for PeCa by evaluating their epigenomic and transcriptome profiles and comparing them with surrounding non-malignant tissue (SNT) and normal glans (NG). Genome-wide methylation analysis revealed 171 hypermethylated probes in PeCa. Transcriptome profiling presented 2,883 underexpressed and 1,378 overexpressed genes. Integrative analysis revealed a panel of 54 genes with an inverse correlation between methylation and gene expression levels. Distinct methylome and transcriptome patterns were found for human papillomavirus (HPV)-positive (38.6%) and negative tumors. Interestingly, grade 3 tumors showed a distinct methylation profile when compared to grade 1. In addition, univariate analysis revealed that low BDNF methylation was associated with lymph node metastasis and shorter disease-free survival. CpG hypermethylation and gene underexpression were confirmed for a panel of genes, including TWIST1, RSPO2, SOX3, SOX17, PROM1, OTX2, HOXA3, and MEIS1. A unique methylome signature was found for PeCa compared to SNT, with aberrant DNA methylation appearing to modulate the expression of specific genes. This study describes new pathways with the potential to regulate penile carcinogenesis, including stem cell regulatory pathways and markers associated with a worse prognosis. These findings may be instrumental in the discovery and application of new genetic and epigenetic biomarkers in PeCa.

  3. Effects of Space Environment on Genome, Transcriptome, and Proteome of Klebsiella pneumoniae.

    PubMed

    Guo, Yinghua; Li, Jia; Liu, Jinwen; Wang, Tong; Li, Yinhu; Yuan, Yanting; Zhao, Jiao; Chang, De; Fang, Xiangqun; Li, Tianzhi; Wang, Junfeng; Dai, Wenkui; Fang, Chengxiang; Liu, Changting

    2015-11-01

    The aim of this study was to explore the effects of space flight on Klebsiella pneumoniae. A strain of K. pneumoniae was sent to space for 398 h aboard the ShenZhou VIII spacecraft from November 1 to November 17, 2011. At the same time, a ground simulation under temperature conditions similar to those of the space flight was performed as a control. After the space mission, the flight and control strains were analyzed using phenotypic, genomic, transcriptomic and proteomic techniques. The flight strain LCT-KP289 exhibited a higher cotrimoxazole resistance level and changes in metabolism relative to the ground control strain LCT-KP214. After the space flight, 73 SNPs and a plasmid copy number variation were identified in the flight strain. Based on the transcriptomic analysis, there were 232 upregulated and 1879 downregulated genes, of which almost all were for metabolism. Proteomic analysis revealed that there were 57 upregulated and 125 downregulated proteins. These differentially expressed proteins had several functions that included energy production and conversion, carbohydrate transport and metabolism, translation, ribosomal structure and biogenesis, posttranslational modification, protein turnover, and chaperone functions. At a systems biology level, the ytfG gene had a synonymous mutation that resulted in significantly downregulated expression at both transcriptomic and proteomic levels. The mutation of the ytfG gene may influence fructose and mannose metabolic processes of K. pneumoniae during space flight, which may be beneficial to the field of space microbiology, providing potential therapeutic strategies to combat or prevent infection in astronauts. Copyright © 2015 IMSS. Published by Elsevier Inc. All rights reserved.

  4. Genome-wide functional genomic and transcriptomic analyses for genes regulating sensitivity to vorinostat

    PubMed Central

    Falkenberg, Katrina J; Gould, Cathryn M; Johnstone, Ricky W; Simpson, Kaylene J

    2014-01-01

    Identification of mechanisms of resistance to histone deacetylase inhibitors, such as vorinostat, is important in order to utilise these anticancer compounds more efficiently in the clinic. Here, we present a dataset containing multiple tiers of stringent siRNA screening for genes that when knocked down conferred sensitivity to vorinostat-induced cell death. We also present data from a miRNA overexpression screen for miRNAs contributing to vorinostat sensitivity. Furthermore, we provide transcriptomic analysis using massively parallel sequencing upon knockdown of 14 validated vorinostat-resistance genes. These datasets are suitable for analysis of genes and miRNAs involved in cell death in the presence and absence of vorinostat as well as computational biology approaches to identify gene regulatory networks. PMID:25977774

  5. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups.

    PubMed

    Curtis, Christina; Shah, Sohrab P; Chin, Suet-Feung; Turashvili, Gulisa; Rueda, Oscar M; Dunning, Mark J; Speed, Doug; Lynch, Andy G; Samarajiwa, Shamith; Yuan, Yinyin; Gräf, Stefan; Ha, Gavin; Haffari, Gholamreza; Bashashati, Ali; Russell, Roslin; McKinney, Steven; Langerød, Anita; Green, Andrew; Provenzano, Elena; Wishart, Gordon; Pinder, Sarah; Watson, Peter; Markowetz, Florian; Murphy, Leigh; Ellis, Ian; Purushotham, Arnie; Børresen-Dale, Anne-Lise; Brenton, James D; Tavaré, Simon; Caldas, Carlos; Aparicio, Samuel

    2012-04-18

    The elucidation of breast cancer subgroups and their molecular drivers requires integrated views of the genome and transcriptome from representative numbers of patients. We present an integrated analysis of copy number and gene expression in a discovery and validation set of 997 and 995 primary breast tumours, respectively, with long-term clinical follow-up. Inherited variants (copy number variants and single nucleotide polymorphisms) and acquired somatic copy number aberrations (CNAs) were associated with expression in ~40% of genes, with the landscape dominated by cis- and trans-acting CNAs. By delineating expression outlier genes driven in cis by CNAs, we identified putative cancer genes, including deletions in PPP2R2A, MTAP and MAP2K4. Unsupervised analysis of paired DNA–RNA profiles revealed novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort. These include a high-risk, oestrogen-receptor-positive 11q13/14 cis-acting subgroup and a favourable prognosis subgroup devoid of CNAs. Trans-acting aberration hotspots were found to modulate subgroup-specific gene networks, including a TCR deletion-mediated adaptive immune response in the ‘CNA-devoid’ subgroup and a basal-specific chromosome 5 deletion-associated mitotic network. Our results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome.

  6. Genome-wide RNA-seq analysis of human and mouse platelet transcriptomes.

    PubMed

    Rowley, Jesse W; Oler, Andrew J; Tolley, Neal D; Hunter, Benjamin N; Low, Elizabeth N; Nix, David A; Yost, Christian C; Zimmerman, Guy A; Weyrich, Andrew S

    2011-10-06

    Inbred mice are a useful tool for studying the in vivo functions of platelets. Nonetheless, the mRNA signature of mouse platelets is not known. Here, we use paired-end next-generation RNA sequencing (RNA-seq) to characterize the polyadenylated transcriptomes of human and mouse platelets. We report that RNA-seq provides unprecedented resolution of mRNAs that are expressed across the entire human and mouse genomes. Transcript expression and abundance are often conserved between the 2 species. Several mRNAs, however, are differentially expressed in human and mouse platelets. Moreover, previously described functional disparities between mouse and human platelets are reflected in differences at the transcript level, including protease activated receptor-1, protease activated receptor-3, platelet activating factor receptor, and factor V. This suggests that RNA-seq is a useful tool for predicting differences in platelet function between mice and humans. Our next-generation sequencing analysis provides new insights into the human and murine platelet transcriptomes. The sequencing dataset will be useful in the design of mouse models of hemostasis and a catalyst for discovery of new functions of platelets. Access to the dataset is found in the "Introduction."

  7. Integrated Transcriptomic-Proteomic Analysis Using a Proteogenomic Workflow Refines Rat Genome Annotation.

    PubMed

    Kumar, Dhirendra; Yadav, Amit Kumar; Jia, Xinying; Mulvenna, Jason; Dash, Debasis

    2016-01-01

    Proteogenomic re-annotation and mRNA splicing information can lead to the discovery of various protein forms for eukaryotic model organisms like rat. However, detection of novel proteoforms using mass spectrometry proteomics data remains a formidable challenge. We developed EuGenoSuite, an open source multiple algorithmic proteomic search tool and utilized it in our in-house integrated transcriptomic-proteomic pipeline to facilitate automated proteogenomic analysis. Using four proteogenomic pipelines (integrated transcriptomic-proteomic, Peppy, Enosi, and ProteoAnnotator) on publicly available RNA-sequence and MS proteomics data, we discovered 363 novel peptides in rat brain microglia representing novel proteoforms for 249 gene loci in the rat genome. These novel peptides aided in the discovery of novel exons, translation of annotated untranslated regions, pseudogenes, and splice variants for various loci; many of which have known disease associations, including neurological disorders like schizophrenia, amyotrophic lateral sclerosis, etc. Novel isoforms were also discovered for genes implicated in cardiovascular diseases and breast cancer for which rats are considered model organisms. Our integrative multi-omics data analysis not only enables the discovery of new proteoforms but also generates an improved reference for human disease studies in the rat model.

  8. Comparative transcriptome analysis reveals insights into the streamlined genomes of haplosclerid demosponges

    PubMed Central

    Guzman, Christine; Conaco, Cecilia

    2016-01-01

    Sponges (Porifera) are one of the most ancestral metazoan groups. They are characterized by a simple body plan lacking the true tissues and organ systems found in other animals. Members of this phylum display a remarkable diversity of form and function and yet little is known about the composition and complexity of their genomes. In this study, we sequenced the transcriptomes of two marine haplosclerid sponges belonging to Demospongiae, the largest and most diverse class within phylum Porifera, and compared their gene content with members of other sponge classes. We recovered 44,693 and 50,067 transcripts expressed in adult tissues of Haliclona amboinensis and Haliclona tubifera, respectively. These transcripts translate into 20,280 peptides in H. amboinensis and 18,000 peptides in H. tubifera. Genes associated with important signaling and metabolic pathways, regulatory networks, as well as genes that may be important in the organismal stress response, were identified in the transcriptomes. Furthermore, lineage-specific innovations were identified that may be correlated with observed sponge characters and ecological adaptations. The core gene complement expressed within the tissues of adult haplosclerid demosponges may represent a streamlined and flexible genetic toolkit that underlies the ecological success and resilience of sponges to environmental stress. PMID:26738846

  9. Comparative transcriptome analysis reveals insights into the streamlined genomes of haplosclerid demosponges

    NASA Astrophysics Data System (ADS)

    Guzman, Christine; Conaco, Cecilia

    2016-01-01

    Sponges (Porifera) are one of the most ancestral metazoan groups. They are characterized by a simple body plan lacking the true tissues and organ systems found in other animals. Members of this phylum display a remarkable diversity of form and function and yet little is known about the composition and complexity of their genomes. In this study, we sequenced the transcriptomes of two marine haplosclerid sponges belonging to Demospongiae, the largest and most diverse class within phylum Porifera, and compared their gene content with members of other sponge classes. We recovered 44,693 and 50,067 transcripts expressed in adult tissues of Haliclona amboinensis and Haliclona tubifera, respectively. These transcripts translate into 20,280 peptides in H. amboinensis and 18,000 peptides in H. tubifera. Genes associated with important signaling and metabolic pathways, regulatory networks, as well as genes that may be important in the organismal stress response, were identified in the transcriptomes. Furthermore, lineage-specific innovations were identified that may be correlated with observed sponge characters and ecological adaptations. The core gene complement expressed within the tissues of adult haplosclerid demosponges may represent a streamlined and flexible genetic toolkit that underlies the ecological success and resilience of sponges to environmental stress.

  10. Genome-based analysis of the transcriptome from mature chickpea root nodules

    PubMed Central

    Afonso-Grunz, Fabian; Molina, Carlos; Hoffmeier, Klaus; Rycak, Lukas; Kudapa, Himabindu; Varshney, Rajeev K.; Drevon, Jean-Jacques; Winter, Peter; Kahl, Günter

    2014-01-01

    Symbiotic nitrogen fixation (SNF) in root nodules of grain legumes such as chickpea is a highly complex process that drastically affects the gene expression patterns of both the prokaryotic and the eukaryotic interacting cells. A successfully established symbiotic relationship requires mutual signaling mechanisms and a continuous adaptation of the metabolism of the involved cells to varying environmental conditions. Although some of these processes are well understood today, many of the molecular mechanisms underlying SNF, especially in chickpea, remain unclear. Here, we reannotated our previously published transcriptome data generated by deepSuperSAGE (Serial Analysis of Gene Expression) to the recently published draft genome of chickpea to assess the root- and nodule-specific transcriptomes of the eukaryotic host cells. The identified gene expression patterns comprise up to 71 significantly differentially expressed genes, and the expression of twenty of these was validated by quantitative real-time PCR with the tissues from five independent biological replicates. Many of the differentially expressed transcripts were found to encode proteins implicated in sugar metabolism, antioxidant defense as well as biotic and abiotic stress responses of the host cells, and some of them were already known to contribute to SNF in other legumes. The differentially expressed genes identified in this study represent candidates that can be used for further characterization of the complex molecular mechanisms underlying SNF in chickpea. PMID:25071808

  11. Genomics of Compositae crops: reference transcriptome assemblies and evidence of hybridization with wild relatives.

    PubMed

    Hodgins, Kathryn A; Lai, Zhao; Oliveira, Luiz O; Still, David W; Scascitelli, Moira; Barker, Michael S; Kane, Nolan C; Dempewolf, Hannes; Kozik, Alex; Kesseli, Richard V; Burke, John M; Michelmore, Richard W; Rieseberg, Loren H

    2014-01-01

    Although the Compositae harbours only two major food crops, sunflower and lettuce, many other species in this family are utilized by humans and have experienced various levels of domestication. Here, we have used next-generation sequencing technology to develop 15 reference transcriptome assemblies for Compositae crops or their wild relatives. These data allow us to gain insight into the evolutionary and genomic consequences of plant domestication. Specifically, we performed Illumina sequencing of Cichorium endivia, Cichorium intybus, Echinacea angustifolia, Iva annua, Helianthus tuberosus, Dahlia hybrida, Leontodon taraxacoides and Glebionis segetum, as well as 454 sequencing of Guizotia scabra, Stevia rebaudiana, Parthenium argentatum and Smallanthus sonchifolius. Illumina reads were assembled using Trinity, and 454 reads were assembled using MIRA and CAP3. We evaluated the coverage of the transcriptomes using BLASTX analysis of a set of ultra-conserved orthologs (UCOs) and recovered most of these genes (88-98%). We found a correlation between contig length and read length for the 454 assemblies, and greater contig lengths for the 454 compared with the Illumina assemblies. This suggests that longer reads can aid in the assembly of more complete transcripts. Finally, we compared the divergence of orthologs at synonymous sites (Ks) between Compositae crops and their wild relatives and found greater divergence when the progenitors were self-incompatible. We also found greater divergence between pairs of taxa that had some evidence of postzygotic isolation. For several more distantly related congeners, such as chicory and endive, we identified a signature of introgression in the distribution of Ks values.

  12. The expanding transcriptome: the genome as the ‘Book of Sand’

    PubMed Central

    Mendes Soares, Luis M; Valcárcel, Juan

    2006-01-01

    The central dogma of molecular biology inspired by classical work in prokaryotic organisms accounts for only part of the genetic agenda of complex eukaryotes. First, post-transcriptional events lead to the generation of multiple mRNAs, proteins and functions from a single primary transcript, revealing regulatory networks distinct in mechanism and biological function from those controlling RNA transcription. Second, a variety of populous families of small RNAs (small nuclear RNAs, small nucleolar RNAs, microRNAs, siRNAs and shRNAs) assemble on ribonucleoprotein complexes and regulate virtually all aspects of the gene expression pathway, with profound biological consequences. Third, high-throughput methods of genomic analysis reveal that non-protein-coding RNAs (ncRNAs) represent a major component of the transcriptome that may perform novel functions in gene regulation and beyond. Post-transcriptional regulation, small RNAs and ncRNAs provide an expanding picture of the transcriptome that enriches our views of what genes are, how they operate, evolve and are regulated. PMID:16511566

  13. Genomic and Transcriptomic Features of Response to Anti-PD-1 Therapy in Metastatic Melanoma.

    PubMed

    Hugo, Willy; Zaretsky, Jesse M; Sun, Lu; Song, Chunying; Moreno, Blanca Homet; Hu-Lieskovan, Siwen; Berent-Maoz, Beata; Pang, Jia; Chmielowski, Bartosz; Cherry, Grace; Seja, Elizabeth; Lomeli, Shirley; Kong, Xiangju; Kelley, Mark C; Sosman, Jeffrey A; Johnson, Douglas B; Ribas, Antoni; Lo, Roger S

    2016-03-24

    PD-1 immune checkpoint blockade provides significant clinical benefits for melanoma patients. We analyzed the somatic mutanomes and transcriptomes of pretreatment melanoma biopsies to identify factors that may influence innate sensitivity or resistance to anti-PD-1 therapy. We find that overall high mutational loads associate with improved survival, and tumors from responding patients are enriched for mutations in the DNA repair gene BRCA2. Innately resistant tumors display a transcriptional signature (referred to as the IPRES, or innate anti-PD-1 resistance), indicating concurrent up-expression of genes involved in the regulation of mesenchymal transition, cell adhesion, extracellular matrix remodeling, angiogenesis, and wound healing. Notably, mitogen-activated protein kinase (MAPK)-targeted therapy (MAPK inhibitor) induces similar signatures in melanoma, suggesting that a non-genomic form of MAPK inhibitor resistance mediates cross-resistance to anti-PD-1 therapy. Validation of the IPRES in other independent tumor cohorts defines a transcriptomic subset across distinct types of advanced cancer. These findings suggest that attenuating the biological processes that underlie IPRES may improve anti-PD-1 response in melanoma and other cancer types.

  14. Genome-wide characterization of transcriptional start sites in humans by integrative transcriptome analysis

    PubMed Central

    Yamashita, Riu; Sathira, Nuankanya P.; Kanai, Akinori; Tanimoto, Kousuke; Arauchi, Takako; Tanaka, Yoshiaki; Hashimoto, Shin-ichi; Sugano, Sumio; Nakai, Kenta; Suzuki, Yutaka

    2011-01-01

    We performed a genome-wide analysis of transcriptional start sites (TSSs) in human genes by multifaceted use of a massively parallel sequencer. By analyzing 800 million sequences that were obtained from various types of transcriptome analyses, we characterized 140 million TSS tags in 12 human cell types. Despite the large number of TSS clusters (TSCs), the number of TSCs was observed to decrease sharply with increasing expression levels. Highly expressed TSCs exhibited several characteristic features: Nucleosome-seq analysis revealed highly ordered nucleosome structures, ChIP-seq analysis detected clear RNA polymerase II binding signals in their surrounding regions, evaluations of previously sequenced and newly shotgun-sequenced complete cDNA sequences showed that they encode preferable transcripts for protein translation, and RNA-seq analysis of polysome-incorporated RNAs yielded direct evidence that those transcripts are actually translated into proteins. We also demonstrate that integrative interpretation of transcriptome data is essential for the selection of putative alternative promoter TSCs, two of which also have protein consequences. Furthermore, discriminative chromatin features that separate TSCs at different expression levels were found for both genic TSCs and intergenic TSCs. The collected integrative information should provide a useful basis for future biological characterization of TSCs. PMID:21372179

  15. Genome-wide transcriptome analysis revealed organelle specific responses to temperature variations in algae

    PubMed Central

    Shin, HyeonSeok; Hong, Seong-Joo; Yoo, Chan; Han, Mi-Ae; Lee, Hookeun; Choi, Hyung-Kyoon; Cho, Suhyung; Lee, Choul-Gyun; Cho, Byung-Kwan

    2016-01-01

    Temperature is a critical environmental factor that affects microalgal growth. However, microalgal coping mechanisms for temperature variations are unclear. Here, we determined changes in transcriptome, total carbohydrate, total fatty acid methyl ester, and fatty acid composition of Tetraselmis sp. KCTC12432BP, a strain with a broad temperature tolerance range, to elucidate the tolerance mechanisms in response to large temperature variations. Owing to unavailability of genome sequence information, de novo transcriptome assembly coupled with BLAST analysis was performed using strand specific RNA-seq data. This resulted in 26,245 protein-coding transcripts, of which 83.7% could be annotated to putative functions. We identified more than 681 genes differentially expressed, suggesting an organelle-specific response to temperature variation. Among these, the genes related to the photosynthetic electron transfer chain, which are localized in the plastid thylakoid membrane, were upregulated at low temperature. However, the transcripts related to the electron transport chain and biosynthesis of phosphatidylethanolamine localized in mitochondria were upregulated at high temperature. These results show that the low energy uptake by repressed photosynthesis under low and high temperature conditions is compensated by different mechanisms, including photosystem I and mitochondrial oxidative phosphorylation, respectively. This study illustrates that microalgae tolerate different temperature conditions through organelle specific mechanisms. PMID:27883062

  16. An evaluation of transcriptome-based exon capture for frog phylogenomics across multiple scales of divergence (Class: Amphibia, Order: Anura).

    PubMed

    Portik, Daniel M; Smith, Lydia L; Bi, Ke

    2016-09-01

    Custom sequence capture experiments are becoming an efficient approach for gathering large sets of orthologous markers in nonmodel organisms. Transcriptome-based exon capture utilizes transcript sequences to design capture probes, typically using a reference genome to identify intron-exon boundaries to exclude shorter exons (<200 bp). Here, we test directly using transcript sequences for probe design, which are often composed of multiple exons of varying lengths. Using 1260 orthologous transcripts, we conducted sequence captures across multiple phylogenetic scales for frogs, including outgroups ~100 Myr divergent from the ingroup. We recovered a large phylogenomic data set consisting of sequence alignments for 1047 of the 1260 transcriptome-based loci (~561 000 bp) and a large quantity of highly variable regions flanking the exons in transcripts (~70 000 bp), the latter improving substantially by only including ingroup species (~797 000 bp). We recovered both shorter (<100 bp) and longer exons (>200 bp), with no major reduction in coverage towards the ends of exons. We observed significant differences in the performance of blocking oligos for target enrichment and nontarget depletion during captures, and differences in PCR duplication rates resulting from the number of individuals pooled for capture reactions. We explicitly tested the effects of phylogenetic distance on capture sensitivity, specificity, and missing data, and provide a baseline estimate of expectations for these metrics based on a priori knowledge of nuclear pairwise differences among samples. We provide recommendations for transcriptome-based exon capture design based on our results, cost estimates and offer multiple pipelines for data assembly and analysis.

  17. Identification of metastasis-associated genes in colorectal cancer through an integrated genomic and transcriptomic analysis

    PubMed Central

    Peng, Sihua

    2013-01-01

    Objective: Identification of colorectal cancer (CRC) metastasis genes is one of the most important issues in CRC research. To mine CRC metastasis-associated genes, an integrated analysis of microarray data is presented, combined with evidence acquired from comparative genomic hybridization (CGH) data. Methods: Gene expression profile data of CRC samples were obtained from the Gene Expression Omnibus (GEO) website. The 15 important chromosomal aberration sites detected using CGH technology were used for integrated genomic and transcriptomic analysis. Significance Analysis of Microarrays (SAM) was used to detect significantly differentially expressed genes across the whole genome. The overlapping genes were selected in their corresponding chromosomal aberration regions and analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). Finally, the SVM-T-RFE gene selection algorithm was applied to identify metastasis-associated genes in CRC. Results: A minimum gene set was obtained with the minimum number (14) of genes and the highest classification accuracy (100%) in both the PRI and META datasets. A fraction of the selected genes are associated with CRC or its metastasis. Conclusions: Our results demonstrate that integrated analysis is an effective strategy for mining cancer-associated genes. PMID:24385689

  18. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels

    DOE PAGES

    Petit, Elsa; Coppi, Maddalena V.; Hayes, James C.; ...

    2015-06-02

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of our present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. Lastly, these characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.

  19. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels

    PubMed Central

    Petit, Elsa; Coppi, Maddalena V.; Hayes, James C.; Tolonen, Andrew C.; Warnick, Thomas; Latouf, William G.; Amisano, Danielle; Biddle, Amy; Mukherjee, Supratim; Ivanova, Natalia; Lykidis, Athanassios; Land, Miriam; Hauser, Loren; Kyrpides, Nikos; Henrissat, Bernard; Lau, Joanne; Schnell, Danny J.; Church, George M.; Leschine, Susan B.; Blanchard, Jeffrey L.

    2015-01-01

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of the present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. These characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels. PMID:26035711

  20. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels.

    PubMed

    Petit, Elsa; Coppi, Maddalena V; Hayes, James C; Tolonen, Andrew C; Warnick, Thomas; Latouf, William G; Amisano, Danielle; Biddle, Amy; Mukherjee, Supratim; Ivanova, Natalia; Lykidis, Athanassios; Land, Miriam; Hauser, Loren; Kyrpides, Nikos; Henrissat, Bernard; Lau, Joanne; Schnell, Danny J; Church, George M; Leschine, Susan B; Blanchard, Jeffrey L

    2015-01-01

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of the present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. These characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.

  1. Genome and transcriptome sequencing of lung cancers reveal diverse mutational and splicing events

    PubMed Central

    Liu, Jinfeng; Lee, William; Jiang, Zhaoshi; Chen, Zhongqiang; Jhunjhunwala, Suchit; Haverty, Peter M.; Gnad, Florian; Guan, Yinghui; Gilbert, Houston N.; Stinson, Jeremy; Klijn, Christiaan; Guillory, Joseph; Bhatt, Deepali; Vartanian, Steffan; Walter, Kimberly; Chan, Jocelyn; Holcomb, Thomas; Dijkgraaf, Peter; Johnson, Stephanie; Koeman, Julie; Minna, John D.; Gazdar, Adi F.; Stern, Howard M.; Hoeflich, Klaus P.; Wu, Thomas D.; Settleman, Jeff; de Sauvage, Frederic J.; Gentleman, Robert C.; Neve, Richard M.; Stokoe, David; Modrusan, Zora; Seshagiri, Somasekar; Shames, David S.; Zhang, Zemin

    2012-01-01

    Lung cancer is a highly heterogeneous disease in terms of both underlying genetic lesions and response to therapeutic treatments. We performed deep whole-genome sequencing and transcriptome sequencing on 19 lung cancer cell lines and three lung tumor/normal pairs. Overall, our data show that cell line models exhibit similar mutation spectra to human tumor samples. Smoker and never-smoker cancer samples exhibit distinguishable patterns of mutations. A number of epigenetic regulators, including KDM6A, ASH1L, SMARCA4, and ATAD2, are frequently altered by mutations or copy number changes. A systematic survey of splice-site mutations identified 106 splice site mutations associated with cancer specific aberrant splicing, including mutations in several known cancer-related genes. RAC1b, an isoform of the RAC1 GTPase that includes one additional exon, was found to be preferentially up-regulated in lung cancer. We further show that its expression is significantly associated with sensitivity to a MAP2K (MEK) inhibitor PD-0325901. Taken together, these data present a comprehensive genomic landscape of a large number of lung cancer samples and further demonstrate that cancer-specific alternative splicing is a widespread phenomenon that has potential utility as therapeutic biomarkers. The detailed characterizations of the lung cancer cell lines also provide genomic context to the vast amount of experimental data gathered for these lines over the decades, and represent highly valuable resources for cancer biology. PMID:23033341

  2. Comprehensive Comparative Genomic and Transcriptomic Analyses of the Legume Genes Controlling the Nodulation Process

    PubMed Central

    Qiao, Zhenzhen; Pingault, Lise; Nourbakhsh-Rey, Mehrnoush; Libault, Marc

    2016-01-01

    Nitrogen is one of the most essential plant nutrients and one of the major factors limiting crop productivity. With the goal of more sustainable agriculture, there is a need to maximize biological nitrogen fixation, a feature of legumes. To enhance our understanding of the molecular mechanisms controlling the interaction between legumes and rhizobia, the symbiotic partner fixing and assimilating the atmospheric nitrogen for the plant, researchers have taken advantage of genetic and genomic resources developed across different legume models (e.g., Medicago truncatula, Lotus japonicus, Glycine max, and Phaseolus vulgaris) to identify key regulatory protein-coding genes of the nodulation process. In this study, we present the results of a comprehensive comparative genomic analysis to highlight orthologous and paralogous relationships between the legume genes controlling nodulation. Mining large transcriptomic datasets, we also identified several orthologous and paralogous genes characterized by the induction of their expression during nodulation across legume plant species. This comprehensive study provides new insights into the evolution of the nodulation process in legume plants and will benefit the scientific community interested in the transfer of functional genomic information between species. PMID:26858743

  3. Comprehensive Comparative Genomic and Transcriptomic Analyses of the Legume Genes Controlling the Nodulation Process.

    PubMed

    Qiao, Zhenzhen; Pingault, Lise; Nourbakhsh-Rey, Mehrnoush; Libault, Marc

    2016-01-01

    Nitrogen is one of the most essential plant nutrients and one of the major factors limiting crop productivity. With the goal of more sustainable agriculture, there is a need to maximize biological nitrogen fixation, a feature of legumes. To enhance our understanding of the molecular mechanisms controlling the interaction between legumes and rhizobia, the symbiotic partner fixing and assimilating the atmospheric nitrogen for the plant, researchers have taken advantage of genetic and genomic resources developed across different legume models (e.g., Medicago truncatula, Lotus japonicus, Glycine max, and Phaseolus vulgaris) to identify key regulatory protein-coding genes of the nodulation process. In this study, we present the results of a comprehensive comparative genomic analysis to highlight orthologous and paralogous relationships between the legume genes controlling nodulation. Mining large transcriptomic datasets, we also identified several orthologous and paralogous genes characterized by the induction of their expression during nodulation across legume plant species. This comprehensive study provides new insights into the evolution of the nodulation process in legume plants and will benefit the scientific community interested in the transfer of functional genomic information between species.

  4. Genome, Transcriptome, and Functional Analyses of Penicillium expansum Provide New Insights Into Secondary Metabolism and Pathogenicity.

    PubMed

    Ballester, Ana-Rosa; Marcet-Houben, Marina; Levin, Elena; Sela, Noa; Selma-Lázaro, Cristina; Carmona, Lourdes; Wisniewski, Michael; Droby, Samir; González-Candelas, Luis; Gabaldón, Toni

    2015-03-01

    The relationship between secondary metabolism and infection in pathogenic fungi has remained largely elusive. The genus Penicillium comprises a group of plant pathogens with varying host specificities and with the ability to produce a wide array of secondary metabolites. The genomes of three Penicillium expansum strains, the main postharvest pathogen of pome fruit, and one Penicillium italicum strain, a postharvest pathogen of citrus fruit, were sequenced and compared with 24 other fungal species. A genomic analysis of gene clusters responsible for the production of secondary metabolites was performed. Putative virulence factors in P. expansum were identified by means of a transcriptomic analysis of apple fruits during the course of infection. Despite a major genome contraction, P. expansum is the Penicillium species with the largest potential for the production of secondary metabolites. Results using knockout mutants clearly demonstrated that neither patulin nor citrinin is required by P. expansum to successfully infect apples. Li et al. ( MPMI-12-14-0398-FI ) reported similar results and conclusions in their recently accepted paper.

  5. Genome and Transcriptome of Clostridium phytofermentans, Catalyst for the Direct Conversion of Plant Feedstocks to Fuels

    SciTech Connect

    Petit, Elsa; Coppi, Maddalena V.; Hayes, James C.; Tolonen, Andrew C.; Warnick, Thomas; Latouf, William G.; Amisano, Danielle; Biddle, Amy; Mukherjee, Supratim; Ivanova, Natalia; Lykidis, Athanassios; Land, Miriam; Hauser, Loren; Kyrpides, Nikos; Henrissat, Bernard; Lau, Joanne; Schnell, Danny J.; Church, George M.; Leschine, Susan B.; Blanchard, Jeffrey L.

    2015-06-02

    Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of our present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. Lastly, these characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.

  6. An integrated genomic and transcriptomic survey of mucormycosis-causing fungi

    PubMed Central

    Chibucos, Marcus C.; Soliman, Sameh; Gebremariam, Teclegiorgis; Lee, Hongkyu; Daugherty, Sean; Orvis, Joshua; Shetty, Amol C.; Crabtree, Jonathan; Hazen, Tracy H.; Etienne, Kizee A.; Kumari, Priti; O'Connor, Timothy D.; Rasko, David A.; Filler, Scott G.; Fraser, Claire M.; Lockhart, Shawn R.; Skory, Christopher D.; Ibrahim, Ashraf S.; Bruno, Vincent M.

    2016-01-01

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. Here we sequence 30 fungal genomes, and perform transcriptomics with three representative Rhizopus and Mucor strains and with human airway epithelial cells during fungal invasion, to reveal key host and fungal determinants contributing to pathogenesis. Analysis of the host transcriptional response to Mucorales reveals platelet-derived growth factor receptor B (PDGFRB) signaling as part of a core response to divergent pathogenic fungi; inhibition of PDGFRB reduces Mucorales-induced damage to host cells. The unique presence of CotH invasins in all invasive Mucorales, and the correlation between CotH gene copy number and clinical prevalence, are consistent with an important role for these proteins in mucormycosis pathogenesis. Our work provides insight into the evolution of this medically and economically important group of fungi, and identifies several molecular pathways that might be exploited as potential therapeutic targets. PMID:27447865

  7. Molecular Subtyping of Pancreatic Cancer: Translating Genomics and Transcriptomics into the Clinic

    PubMed Central

    Du, Yongxing; Zhao, Bangbo; Liu, Ziwen; Ren, Xiaoxia; Zhao, Wenjing; Li, Zongze; You, Lei; Zhao, Yupei

    2017-01-01

    Pancreatic cancer remains one of the most lethal malignancies, and insights into both personalized diagnosis and intervention of this disease are urgently needed. The rapid development of sequencing technologies has enabled the successive completion of a series of genetic and epigenetic sequencing studies of pancreatic cancer. The mutational landscape of pancreatic cancer is generally portrayed in terms of somatic mutations, structural variations, epigenetic alterations and the core signaling pathways. In recent years, four significant molecular subtype classifications of pancreatic cancer have been proposed based on the expression of transcription factors and downstream targets or the distribution of structural rearrangements. An increasing research focus on the identification of somatic mutations and other genetic aberrations that drive pancreatic cancer has led to a new era of precision medicine based on molecular subtyping. However, few known molecular classifications are used to guide clinical strategies. Specific scientific, regulatory and ethical challenges must be overcome before genomic and transcriptomic discoveries can be translated into the clinic. PMID:28367231

  8. Comprehensive transcriptome and improved genome annotation of Bacillus licheniformis WX-02.

    PubMed

    Guo, Jing; Cheng, Gang; Gou, Xiang-Yong; Xing, Feng; Li, Sen; Han, Yi-Chao; Wang, Long; Song, Jia-Ming; Shu, Cheng-Cheng; Chen, Shou-Wen; Chen, Ling-Ling

    2015-08-19

    The updated genome of Bacillus licheniformis WX-02 comprises a circular chromosome of 4,286,821 base pairs containing 4512 protein-coding genes. We applied strand-specific RNA-sequencing to explore the transcriptome profiles of B. licheniformis WX-02 under normal and high-salt conditions (6% NaCl). We identified 2381 co-expressed gene pairs constituting 871 operon structures. In addition, 1169 antisense transcripts and 90 small RNAs were detected. Systematic comparison of differentially expressed genes under different conditions revealed that genes involved in multiple functions were significantly repressed during the long-term high-salt adaptation process. Genes related to promotion of glutamic acid synthesis were activated by 6% NaCl, potentially explaining the high yield of γ-PGA under high-salt conditions. This study will be useful for the optimization of crucial metabolic activities in this bacterium.

  9. The genome and transcriptome of Japanese flounder provide insights into flatfish asymmetry.

    PubMed

    Shao, Changwei; Bao, Baolong; Xie, Zhiyuan; Chen, Xinye; Li, Bo; Jia, Xiaodong; Yao, Qiulin; Ortí, Guillermo; Li, Wenhui; Li, Xihong; Hamre, Kristin; Xu, Juan; Wang, Lei; Chen, Fangyuan; Tian, Yongsheng; Schreiber, Alex M; Wang, Na; Wei, Fen; Zhang, Jilin; Dong, Zhongdian; Gao, Lei; Gai, Junwei; Sakamoto, Takashi; Mo, Sudong; Chen, Wenjun; Shi, Qiong; Li, Hui; Xiu, Yunji; Li, Yangzhen; Xu, Wenteng; Shi, Zhiyi; Zhang, Guojie; Power, Deborah M; Wang, Qingyin; Schartl, Manfred; Chen, Songlin

    2017-01-01

    Flatfish have the most extreme asymmetric body morphology of vertebrates. During metamorphosis, one eye migrates to the contralateral side of the skull, and this migration is accompanied by extensive craniofacial transformations and simultaneous development of lopsided body pigmentation. The evolution of this developmental and physiological innovation remains enigmatic. Comparative genomics of two flatfish and transcriptomic analyses during metamorphosis point to a role for thyroid hormone and retinoic acid signaling, as well as phototransduction pathways. We demonstrate that retinoic acid is critical in establishing asymmetric pigmentation and, via cross-talk with thyroid hormones, in modulating eye migration. The unexpected expression of the visual opsins from the phototransduction pathway in the skin translates illumination differences and generates retinoic acid gradients that underlie the generation of asymmetry. Identifying the genetic underpinning of this unique developmental process answers long-standing questions about the evolutionary origin of asymmetry, but it also provides insight into the mechanisms that control body shape in vertebrates.

  10. TransComb: genome-guided transcriptome assembly via combing junctions in splicing graphs.

    PubMed

    Liu, Juntao; Yu, Ting; Jiang, Tao; Li, Guojun

    2016-10-19

    Transcriptome assemblers aim to reconstruct full-length transcripts from RNA-seq data. We present TransComb, a genome-guided assembler developed based on a junction graph, weighted by a bin-packing strategy and paired-end information. A newly designed extension method based on weighted junction graphs can accurately extract paths representing expressed transcripts, whether they have low or high expression levels. Tested on both simulated and real datasets, TransComb demonstrates significant improvements in both recall and precision over leading assemblers, including StringTie, Cufflinks, Bayesembler, and Traph. In addition, it runs much faster and requires less memory on average. TransComb is available at http://sourceforge.net/projects/transcriptomeassembly/files/ .

  11. Transcriptome Sequencing of Two Phenotypic Mosaic Eucalyptus Trees Reveals Large Scale Transcriptome Re-Modelling

    PubMed Central

    Padovan, Amanda; Patel, Hardip R.; Chuah, Aaron; Huttley, Gavin A.; Krause, Sandra T.; Degenhardt, Jörg; Foley, William J.; Külheim, Carsten

    2015-01-01

    Phenotypic mosaic trees offer an ideal system for studying differential gene expression. We have investigated two mosaic eucalypt trees from two closely related species (Eucalyptus melliodora and E. sideroxylon), which each support two types of leaves: one part of the canopy is resistant to insect herbivory and the remaining leaves are susceptible. Driving this ecological distinction are differences in plant secondary metabolites. We used these phenotypic mosaics to investigate genome-wide patterns of foliar gene expression with the aim of identifying patterns of differential gene expression and the somatic mutation(s) that lead to this phenotypic mosaicism. We sequenced the mRNA pool from leaves of the resistant and susceptible ecotypes from both mosaic eucalypts using the Illumina HiSeq 2000 platform. We found large differences in pathway regulation and gene expression between the ecotypes of each mosaic. The expression of the genes in the MVA and MEP pathways is reflected by variation in leaf chemistry; however, this is not the case for the terpene synthases. Apart from the terpene biosynthetic pathway, there are several other metabolic pathways that are differentially regulated between the two ecotypes, suggesting there is much more phenotypic diversity than has been described. Despite the close relationship between the two species, they show large differences in the global patterns of gene and pathway regulation. PMID:25978451

  12. Cryptococcus gattii comparative genomics and transcriptomics: a NIH/NIAID White Paper.

    PubMed

    Chaturvedi, V; Nierman, W C

    2012-06-01

    Cryptococcus gattii is an emerging global pathogen. Recent reports suggest that C. gattii cryptococcosis is more common in immunocompetent as well as HIV-infected AIDS patients than earlier estimated. An ongoing outbreak of C. gattii in Vancouver, Canada, and the US Pacific Northwest has heightened public health awareness in North America. We have few clues as to what causes emergence or re-emergence of highly pathogenic strains, why C. gattii split from its sibling pathogen C. neoformans, why it thrives in trees instead, and why immunocompetent individuals are vulnerable to this pathogen. C. gattii comprises four distinct lineages, but the information on the genome of C. gattii is inadequate and unrepresentative as it is limited to two strains, R265 and WM276, which are MATα, serotype B, genotype VGII/VGI from Canada and Australia, respectively. There is a wide gap in knowledge about the genomes of VGIII and VGIV strains, serotype C strains, and MATa strains. The geographical representation is inadequate in the absence of strains from California, South America, Asia, and Africa. Additional obstacles to work with this pathogen are the following: (a) complex molecular typing schemes and (b) lack of functional genomics analyses. We propose to complete genome sequencing of 12 reference strains by next-generation sequencing technology and to map their transcriptomes by RNA-Seq technology. This effort would lead to new resources for the scientific community including (1) insight from additional C. gattii genomes to anchor future research studies, (2) validation of single-nucleotide polymorphisms (SNPs) for molecular typing to improve epidemiology studies, and (3) transcript analyses from strains under relevant pathogenic and non-pathogenic conditions to accelerate the discovery of proteins for diagnostics, drug targets, and vaccines.

  13. Comparative genomics and transcriptome analysis of Aspergillus niger and metabolic engineering for citrate production

    PubMed Central

    Yin, Xian; Shin, Hyun-dong; Li, Jianghua; Du, Guocheng; Liu, Long; Chen, Jian

    2017-01-01

    Despite a long and successful history of citrate production in Aspergillus niger, the molecular mechanism of citrate accumulation is only partially understood. In this study, we used comparative genomics and transcriptome analysis of citrate-producing strains—namely, A. niger H915-1 (citrate titer: 157 g L−1), A1 (117 g L−1), and L2 (76 g L−1)—to gain a genome-wide view of the mechanism of citrate accumulation. Compared with A. niger A1 and L2, A. niger H915-1 contained 92 mutated genes, including a succinate-semialdehyde dehydrogenase in the γ-aminobutyric acid shunt pathway and an aconitase family protein involved in citrate synthesis. Furthermore, transcriptome analysis of A. niger H915-1 revealed that the transcription levels of 479 genes changed between the cell growth stage (6 h) and the citrate synthesis stage (12 h, 24 h, 36 h, and 48 h). In the glycolysis pathway, triosephosphate isomerase was up-regulated, whereas pyruvate kinase was down-regulated. Two cytosol ATP-citrate lyases, which take part in the cycle of citrate synthesis, were up-regulated, and may coordinate with the alternative oxidases in the alternative respiratory pathway for energy balance. Finally, deletion of the oxaloacetate acetylhydrolase gene in H915-1 eliminated oxalate formation, but neither an influence on pH decrease nor a difference in citrate production was observed. PMID:28106122

  14. Genome-Wide Transcriptome and Proteome Analysis on Different Developmental Stages of Cordyceps militaris

    PubMed Central

    Yin, Yalin; Yu, Guojun; Chen, Yijie; Jiang, Shuai; Wang, Man; Jin, Yanxia; Lan, Xianqing; Liang, Yi; Sun, Hui

    2012-01-01

    Background: Cordyceps militaris, an ascomycete caterpillar fungus, has been used as a traditional Chinese medicine for many years owing to its anticancer and immunomodulatory activities. Currently, artificial culturing of this beneficial fungus has been widely used and can meet market demand, but systematic molecular studies on the developmental stages of cultured C. militaris at transcriptional and translational levels have not been conducted. Methodology/Principal Findings: We utilized high-throughput Illumina sequencing to obtain the transcriptomes of C. militaris mycelium and fruiting body. All clean reads were mapped to the C. militaris genome and most of the reads showed perfect coverage. Alternative splicing and novel transcripts were predicted to enrich the database. Gene expression analysis revealed that 2,113 genes were up-regulated in mycelium and 599 in fruiting body. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed to analyze the genes with expression differences. Moreover, the putative cordycepin metabolism difference between different developmental stages was studied. In addition, the proteome data of mycelium and fruiting body were obtained by one-dimensional gel electrophoresis (1-DGE) coupled with nano-electrospray ionization liquid chromatography tandem mass spectrometry (nESI-LC-MS/MS). 359 and 214 proteins were detected from mycelium and fruiting body, respectively. GO, KEGG and Cluster of Orthologous Groups (COG) analyses were further conducted to better understand their difference. We analyzed the amounts of some noteworthy proteins in these two samples including lectin, superoxide dismutase, glycoside hydrolase and proteins involved in cordycepin metabolism, providing important information for further protein studies. Conclusions/Significance: The results reveal the difference in gene expression between the mycelium and fruiting body of artificially cultivated C. militaris by transcriptome and proteome

  15. Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

    DOE PAGES

    Seaver, Samuel M.D.; Bradbury, Louis M.T.; Frelin, Océane; ...

    2015-03-10

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  16. Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

    SciTech Connect

    Seaver, Samuel M.D.; Bradbury, Louis M.T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

    2015-03-10

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  17. Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

    PubMed Central

    Seaver, Samuel M. D.; Bradbury, Louis M. T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

    2015-01-01

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes. PMID:25806041

  18. Improved Evidence-Based Genome-scale Metabolic Models for Maize Leaf, Embryo, and Endosperm

    SciTech Connect

    Seaver, Samuel M.D.; Frelin, Oceane; Bradbury, Louis M.T.; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

    2015-03-10

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  19. Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm.

    PubMed

    Seaver, Samuel M D; Bradbury, Louis M T; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D; Henry, Christopher S

    2015-01-01

    There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  20. Genome-wide profiling of untranslated regions by paired-end ditag sequencing reveals unexpected transcriptome complexity in yeast.

    PubMed

    Kang, Ya-Ni; Lai, Deng-Pan; Ooi, Hong Sain; Shen, Ting-Ting; Kou, Yao; Tian, Jing; Czajkowsky, Daniel M; Shao, Zhifeng; Zhao, Xiaodong

    2015-02-01

    The identification of structural and functional elements encoded in a genome is a challenging task. Although the transcriptome of budding yeast has been extensively analyzed, the boundaries and untranslated regions of yeast genes remain elusive. To address this least-explored field of yeast genomics, we performed a transcript profiling analysis through a paired-end ditag (PET) approach coupled with deep sequencing. With 562,133 PET sequences we accurately defined the boundaries and untranslated regions of 3,409 ORFs, suggesting many yeast genes have multiple transcription start sites (TSSs). We also identified 85 previously uncharacterized transcripts either in intergenic regions or from the opposite strand of reported genomic features. Furthermore, our data revealed the extensive 3' end heterogeneity of yeast genes and identified a novel putative motif for polyadenylation. Our results indicate the yeast transcriptome is more complex than expected. This study would serve as an invaluable resource for elucidating the regulation and evolution of yeast genes.

  1. Genome-wide transcriptome analysis of Chinese pollination-constant nonastringent persimmon fruit treated with ethanol

    PubMed Central

    2014-01-01

    Background: The persimmon Diospyros kaki Thunb. is an important commercial and deciduous fruit tree. The fruits have proanthocyanidin (PA) content of >25% of the dry weight and are astringent. PAs cause astringency that is often undesirable for human consumption; thus, the removal of astringency is an important practice in the persimmon industry. Soluble PAs can be converted to insoluble PAs by enclosing the fruit in a polyethylene bag containing diluted ethanol. The genomic resource development of the persimmon is delayed because of its large and complex genome. Second-generation sequencing is an efficient technique for generating huge sequences that can represent a large number of genes and their expression levels. Results: We used 454 sequencing for the de novo transcriptome assembly of persimmon fruit treated with 5% ethanol (Tr library) and without treatment as the control (Co library) to investigate the genes and pathways that control PA biosynthesis and other secondary metabolites. We obtained 374.6 Mb in clean nucleotides comprising 624,690 and 626,203 clean sequencing reads from the Tr and Co libraries, respectively. We also identified 83,898 unigenes; 54,719 (~65.2%) unigenes were annotated based on similarity searches with known proteins. Up to 14,954 of the unigenes were assigned to the protein database Clusters of Orthologous Groups (COG), 24,337 were assigned to the term annotation database of Gene Ontology (GO), and 45,506 were assigned to 200 pathways in the database of Kyoto Encyclopedia of Genes and Genomes (KEGG). The two libraries were compared to identify the differentially expressed unigenes. The expression levels of genes involved in PA biosynthesis and tannin coagulation were analysed, and some of them were verified using quantitative real time PCR (qRT-PCR). Conclusions: This study provides abundant genomic data for persimmon and offers comprehensive sequence resources for persimmon research. The transcriptome dataset will improve our

  2. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    SciTech Connect

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  3. Genome and transcriptome analysis of the basidiomycetous yeast Pseudozyma antarctica producing extracellular glycolipids, mannosylerythritol lipids.

    PubMed

    Morita, Tomotake; Koike, Hideaki; Hagiwara, Hiroko; Ito, Emi; Machida, Masayuki; Sato, Shun; Habe, Hiroshi; Kitamoto, Dai

    2014-01-01

    Pseudozyma antarctica is a non-pathogenic phyllosphere yeast known as an excellent producer of mannosylerythritol lipids (MELs), multi-functional extracellular glycolipids, from vegetable oils. To clarify the genetic characteristics of P. antarctica, we analyzed the 18 Mb genome of P. antarctica T-34. On the basis of KOG analysis, the number of genes categorized into the lipid transport and metabolism classification in P. antarctica (219 genes) was about one and a half times that of the yeast Saccharomyces cerevisiae (140 genes). The gene encoding an ATP/citrate lyase (ACL) related to acetyl-CoA synthesis conserved in oleaginous strains was found in the P. antarctica genome: the single ACL gene possesses the four domains identical to those of the human gene, whereas the other oleaginous ascomycetous species have two genes covering the four domains. The P. antarctica genome exhibited a remarkable degree of synteny to the U. maydis genome; however, the comparison of the gene expression profiles during culture on the two carbon sources, glucose and soybean oil, by the DNA microarray method revealed that the transcriptomes of the two species were significantly different. In P. antarctica, expression of the gene sets relating to fatty acid metabolism was markedly up-regulated under the oily conditions compared with glucose. Additionally, the MEL biosynthesis cluster of P. antarctica was highly expressed regardless of the carbon source as compared to U. maydis. These results strongly indicate that P. antarctica has an oleaginous nature which is relevant to its non-pathogenic and MEL-overproducing characteristics. The analysis and dataset contribute to stimulating the development of improved strains with customized properties for high yield production of functional bio-based materials.

  4. Genome and transcriptome sequencing characterises the gene space of Macadamia integrifolia (Proteaceae).

    PubMed

    Nock, Catherine J; Baten, Abdul; Barkla, Bronwyn J; Furtado, Agnelo; Henry, Robert J; King, Graham J

    2016-11-17

    The large Gondwanan plant family Proteaceae is an early-diverging eudicot lineage renowned for its morphological, taxonomic and ecological diversity. Macadamia is the most economically important Proteaceae crop and represents an ancient rainforest-restricted lineage. The family is a focus for studies of adaptive radiation due to remarkable species diversification in Mediterranean-climate biodiversity hotspots, and numerous evolutionary transitions between biomes. Despite a long history of research, comparative analyses in the Proteaceae and macadamia breeding programs are restricted by a paucity of genetic information. To address this, we sequenced the genome and transcriptome of the widely grown Macadamia integrifolia cultivar 741. Over 95 gigabases of DNA and RNA-seq sequence data were de novo assembled and annotated. The draft assembly has a total length of 518 Mb and spans approximately 79% of the estimated genome size. Following annotation, 35,337 protein-coding genes were predicted of which over 90% were expressed in at least one of the leaf, shoot or flower tissues examined. Gene family comparisons with five other eudicot species revealed 13,689 clusters containing macadamia genes and 1005 macadamia-specific clusters, and provided evidence for lineage-specific expansion of gene families involved in pathogen recognition, plant defense and monoterpene synthesis. Cyanogenesis is an important defense strategy in the Proteaceae, and a detailed analysis of macadamia gene homologues potentially involved in cyanogenic glycoside biosynthesis revealed several highly expressed candidate genes. The gene space of macadamia provides a foundation for comparative genomics, gene discovery and the acceleration of molecular-assisted breeding. This study presents the first available genomic resources for the large basal eudicot family Proteaceae, access to most macadamia genes and opportunities to uncover the genetic basis of traits of importance for adaptation and crop

  5. The draft genome and transcriptome of Amaranthus hypochondriacus: a C4 dicot producing high-lysine edible pseudo-cereal.

    PubMed

    Sunil, Meeta; Hariharan, Arun K; Nayak, Soumya; Gupta, Saurabh; Nambisan, Suran R; Gupta, Ravi P; Panda, Binay; Choudhary, Bibha; Srinivasan, Subhashini

    2014-12-01

    Grain amaranths, edible C4 dicots, produce pseudo-cereals high in lysine. Lysine being one of the most limiting essential amino acids in cereals and C4 photosynthesis being one of the most sought-after phenotypes in protein-rich legume crops, the genome of one of the grain amaranths is likely to play a critical role in crop research. We have sequenced the genome and transcriptome of Amaranthus hypochondriacus, a diploid (2n = 32) belonging to the order Caryophyllales with an estimated genome size of 466 Mb. Of the 411 linkage single-nucleotide polymorphisms (SNPs) reported for grain amaranths, 355 SNPs (86%) are represented in the scaffolds and 74% of the 8.6 billion bases of the sequenced transcriptome map to the genomic scaffolds. The genome of A. hypochondriacus codes for at least 24,829 proteins, shares the paleohexaploidy event with species under the superorders Rosids and Asterids, harbours 1 SNP in 1,000 bases, and contains 13.76% repeat elements. Annotation of all the genes in the lysine biosynthetic pathway using comparative genomics and expression analysis offers insights into the high-lysine phenotype. As the first grain species under Caryophyllales and the first C4 dicot genome reported, the work presented here will be beneficial in improving crops and in expanding our understanding of angiosperm evolution.

  6. The Draft Genome and Transcriptome of Amaranthus hypochondriacus: A C4 Dicot Producing High-Lysine Edible Pseudo-Cereal

    PubMed Central

    Sunil, Meeta; Hariharan, Arun K.; Nayak, Soumya; Gupta, Saurabh; Nambisan, Suran R.; Gupta, Ravi P.; Panda, Binay; Choudhary, Bibha; Srinivasan, Subhashini

    2014-01-01

    Grain amaranths, edible C4 dicots, produce pseudo-cereals high in lysine. Lysine being one of the most limiting essential amino acids in cereals and C4 photosynthesis being one of the most sought-after phenotypes in protein-rich legume crops, the genome of one of the grain amaranths is likely to play a critical role in crop research. We have sequenced the genome and transcriptome of Amaranthus hypochondriacus, a diploid (2n = 32) belonging to the order Caryophyllales with an estimated genome size of 466 Mb. Of the 411 linkage single-nucleotide polymorphisms (SNPs) reported for grain amaranths, 355 SNPs (86%) are represented in the scaffolds and 74% of the 8.6 billion bases of the sequenced transcriptome map to the genomic scaffolds. The genome of A. hypochondriacus codes for at least 24,829 proteins, shares the paleohexaploidy event with species under the superorders Rosids and Asterids, harbours 1 SNP in 1,000 bases, and contains 13.76% repeat elements. Annotation of all the genes in the lysine biosynthetic pathway using comparative genomics and expression analysis offers insights into the high-lysine phenotype. As the first grain species under Caryophyllales and the first C4 dicot genome reported, the work presented here will be beneficial in improving crops and in expanding our understanding of angiosperm evolution. PMID:25071079

  7. Draft Assembly of Elite Inbred Line PH207 Provides Insights into Genomic and Transcriptome Diversity in Maize

    PubMed Central

    Soifer, Ilya; Barad, Omer; Shem-Tov, Doron; Baruch, Kobi; Lu, Fei; Hernandez, Alvaro G.; Wright, Chris L.; Koehler, Klaus; Buell, C. Robin; de Leon, Natalia

    2016-01-01

    Intense artificial selection over the last 100 years has produced elite maize (Zea mays) inbred lines that combine to produce high-yielding hybrids. To further our understanding of how genome and transcriptome variation contribute to the production of high-yielding hybrids, we generated a draft genome assembly of the inbred line PH207 to complement and compare with the existing B73 reference sequence. B73 is a founder of the Stiff Stalk germplasm pool, while PH207 is a founder of Iodent germplasm, both of which have contributed substantially to the production of temperate commercial maize and are combined to make heterotic hybrids. Comparison of these two assemblies revealed over 2500 genes present in only one of the two genotypes and 136 gene families that have undergone extensive expansion or contraction. Transcriptome profiling revealed extensive expression variation, with as many as 10,564 differentially expressed transcripts and 7128 transcripts expressed in only one of the two genotypes in a single tissue. Genotype-specific genes were more likely to have tissue/condition-specific expression and lower transcript abundance. The availability of a high-quality genome assembly for the elite maize inbred PH207 expands our knowledge of the breadth of natural genome and transcriptome variation in elite maize inbred lines across heterotic pools. PMID:27803309

  8. Draft Assembly of Elite Inbred Line PH207 Provides Insights into Genomic and Transcriptome Diversity in Maize.

    PubMed

    Hirsch, Candice N; Hirsch, Cory D; Brohammer, Alex B; Bowman, Megan J; Soifer, Ilya; Barad, Omer; Shem-Tov, Doron; Baruch, Kobi; Lu, Fei; Hernandez, Alvaro G; Fields, Christopher J; Wright, Chris L; Koehler, Klaus; Springer, Nathan M; Buckler, Edward; Buell, C Robin; de Leon, Natalia; Kaeppler, Shawn M; Childs, Kevin L; Mikel, Mark A

    2016-11-01

    Intense artificial selection over the last 100 years has produced elite maize (Zea mays) inbred lines that combine to produce high-yielding hybrids. To further our understanding of how genome and transcriptome variation contribute to the production of high-yielding hybrids, we generated a draft genome assembly of the inbred line PH207 to complement and compare with the existing B73 reference sequence. B73 is a founder of the Stiff Stalk germplasm pool, while PH207 is a founder of Iodent germplasm, both of which have contributed substantially to the production of temperate commercial maize and are combined to make heterotic hybrids. Comparison of these two assemblies revealed over 2500 genes present in only one of the two genotypes and 136 gene families that have undergone extensive expansion or contraction. Transcriptome profiling revealed extensive expression variation, with as many as 10,564 differentially expressed transcripts and 7128 transcripts expressed in only one of the two genotypes in a single tissue. Genotype-specific genes were more likely to have tissue/condition-specific expression and lower transcript abundance. The availability of a high-quality genome assembly for the elite maize inbred PH207 expands our knowledge of the breadth of natural genome and transcriptome variation in elite maize inbred lines across heterotic pools.

  9. Genomic and Transcriptomic Analyses of Indole-3-Acetic Acid Biosynthesis in Diatoms

    NASA Astrophysics Data System (ADS)

    Lim, R.; Armbrust, V.

    2016-02-01

    Indole-3-acetic acid (IAA) is a major plant growth hormone and a common mediator of plant-bacterial interactions. Recently, IAA has also been found to play a role in interactions between diatoms and bacteria, with IAA production by an associated Sulfitobacter leading to increased growth rates in the marine diatom Pseudo-nitzschia multiseries. It is unclear, however, if diatoms themselves are able to synthesize IAA and whether this capability is widespread throughout Bacillariophyta. Four major tryptophan-dependent IAA biosynthesis pathways have been identified in plants and bacteria, each denoted by the first intermediate downstream of tryptophan: the indole-3-pyruvate (IPyA), tryptamine (TAM), indole-3-acetaldoxime (IAOx) and indole-3-acetamide (IAM) pathways. To investigate the possibility of IAA biosynthesis in diatoms, we first analyzed publicly available genomes of raphid pennates P. multiseries, Phaeodactylum tricornutum, Fragilariopsis cylindrus and centric Thalassiosira pseudonana for potential homologs to plant and bacterial IAA biosynthesis genes. The P. multiseries, F. cylindrus and P. tricornutum genomes encode downstream enzymes for bacterial TAM and IAM and plant IPyA pathways. The more evolutionarily ancient T. pseudonana encodes one TAM enzyme in its genome. To investigate the potential distribution of these pathways more broadly, we surveyed the transcriptomes of 11 diatom species that include representatives from all four Bacillariophyta classes. Datasets used were sequenced as part of the Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP) and obtained from cultures maintained axenically. Transcripts associated with the TAM pathway were most frequently detected, with potential homologs to required enzymes identified in 10 of the 11 species examined. Transcripts homologous to rate-limiting IPyA enzymes were detected in six species. Only two centric and araphid pennate species expressed transcripts associated with enzymes in the

  10. Genomic and transcriptomic analysis of NDM-1 Klebsiella pneumoniae in spaceflight reveal mechanisms underlying environmental adaptability

    PubMed Central

    Li, Jia; Liu, Fei; Wang, Qi; Ge, Pupu; Woo, Patrick C. Y.; Yan, Jinghua; Zhao, Yanlin; Gao, George F.; Liu, Cui Hua; Liu, Changting

    2014-01-01

    The emergence and rapid spread of New Delhi Metallo-beta-lactamase-1 (NDM-1)-producing Klebsiella pneumoniae strains have caused great concern worldwide. To better understand the mechanisms underlying environmental adaptation of those highly drug-resistant K. pneumoniae strains, we took advantage of China's Shenzhou 10 spacecraft mission to conduct comparative genomic and transcriptomic analysis of an NDM-1 K. pneumoniae strain (ATCC BAA-2146) cultivated under different conditions. The samples were recovered from semisolid medium placed on the ground (D strain), in simulated space conditions (M strain), or in the Shenzhou 10 spacecraft (T strain) for analysis. Our data revealed multiple variations underlying pathogen adaptation to different environments in terms of changes in morphology, H2O2 tolerance and biofilm formation ability, genomic stability and regulation of metabolic pathways. Additionally, we found a few non-coding RNAs to be differentially regulated. The results are helpful for better understanding the adaptive mechanisms of drug-resistant bacterial pathogens. PMID:25163721

  11. Genomic and transcriptomic analysis of imatinib resistance in gastrointestinal stromal tumors

    PubMed Central

    Takahashi, Tsuyoshi; Elzawahry, Asmaa; Mimaki, Sachiyo; Furukawa, Eisaku; Nakatsuka, Rie; Nakamura, Hiromi; Nishigaki, Takahiko; Serada, Satoshi; Naka, Tetsuji; Hirota, Seiichi; Shibata, Tatsuhiro; Tsuchihara, Katsuya

    2017-01-01

    Gastrointestinal stromal tumors represent the most common mesenchymal tumor of the digestive tract, driven by gain‐of‐function mutations in KIT. Despite its proven benefits, half of the patients treated with imatinib show disease progression within 2 years due to secondary resistance mutations in KIT. It remains unclear how the genomic and transcriptomic features change during the acquisition of imatinib resistance. Here, we performed exome sequencing and microarray transcription analysis for four imatinib‐resistant cell lines and one cell line briefly exposed to imatinib. We also performed exome sequencing of clinical tumor samples. The cell line briefly exposed to imatinib exhibited few single‐nucleotide variants and copy‐number alterations, but showed marked upregulation of genes related to detoxification and downregulation of genes involved in cell cycle progression. Meanwhile, resistant cell lines harbored numerous genomic changes: amplified genes related to detoxification and deleted genes with cyclin‐dependent kinase activity. Some variants in the resistant samples were traced back to the drug‐sensitive samples, indicating the presence of ancestral subpopulations. The subpopulations carried variants associated with cell death. Pre‐existing cancer cells with genetic alterations promoting apoptosis resistance may serve as a basis whereby cancer cells with critical mutations, such as secondary KIT mutations, can establish full imatinib resistance. © 2017 The Authors Genes, Chromosomes and Cancer Published by Wiley Periodicals, Inc. PMID:27997714

  12. A Genomic, Transcriptomic and Proteomic Look at the GE2270 Producer Planobispora rosea, an Uncommon Actinomycete

    PubMed Central

    Gallo, Giuseppe; Petiti, Luca; Corti, Giorgio; Alt, Silke; Cruz, Joao C. S.; Salzano, Anna Maria; Scaloni, Andrea; Puglia, Anna Maria; De Bellis, Gianluca; Peano, Clelia; Donadio, Stefano; Sosio, Margherita

    2015-01-01

    We report the genome sequence of Planobispora rosea ATCC 53733, a mycelium-forming soil-dweller belonging to one of the lesser studied genera of Actinobacteria and producing the thiopeptide GE2270. The P. rosea genome presents considerable convergence in gene organization and function with other members in the family Streptosporangiaceae, with a significant number (44%) of shared orthologs. Patterns of gene expression in P. rosea cultures during exponential and stationary phase have been analyzed using whole transcriptome shotgun sequencing and by proteome analysis. Among the differentially abundant proteins, those involved in protein metabolism are particularly represented, including the GE2270-insensitive EF-Tu. Two proteins from the pbt cluster, directing GE2270 biosynthesis, slightly increase their abundance values over time. While GE2270 production starts during the exponential phase, most pbt genes, as analyzed by qRT-PCR, are down-regulated. The exception is represented by pbtA, encoding the precursor peptide of the ribosomally synthesized GE2270, whose expression reached the highest level at the entry into stationary phase. PMID:26207753

  13. Rapidly developing functional genomics in ecological model systems via 454 transcriptome sequencing.

    PubMed

    Wheat, Christopher W

    2010-04-01

    Next generation sequencing technology affords new opportunities in ecological genetics. This paper addresses how an ecological genetics research program focused on a phenotype of interest can quickly move from no genetic resources to having various functional genomic tools. 454 sequencing and its error rates are discussed, followed by a review of de novo transcriptome assemblies focused on the first successful de novo assembly which happens to be in an ecological model system (the Glanville fritillary butterfly). The potential future developments in 454 sequencing are also covered. Particular attention is paid to the difficulties ecological geneticists are likely to encounter through reviewing relevant studies in both model and non-model systems. Various post-sequencing issues and applications of 454 generated data are presented (e.g. database management, microarray construction, molecular marker and candidate gene development). How to use species with genomic resources to inform study of those without is also discussed. In closing, some of the drawbacks of 454 sequencing are presented along with future prospects of this technology.

  14. Genome-wide primary transcriptome analysis of H2-producing archaeon Thermococcus onnurineus NA1

    PubMed Central

    Cho, Suhyung; Kim, Min-Sik; Jeong, Yujin; Lee, Bo-Rahm; Lee, Jung-Hyun; Kang, Sung Gyun; Cho, Byung-Kwan

    2017-01-01

    In spite of their pivotal roles in transcriptional and post-transcriptional processes, the regulatory elements of archaeal genomes are not yet fully understood. Here, we determine the primary transcriptome of the H2-producing archaeon Thermococcus onnurineus NA1. We identified 1,082 purine-rich transcription initiation sites along with well-conserved TATA box, A-rich B recognition element (BRE), and promoter proximal element (PPE) motif in promoter regions, a high pyrimidine nucleotide content (T/C) at the −1 position, and Shine-Dalgarno (SD) motifs (GGDGRD) in 5′ untranslated regions (5′ UTRs). Along with differential transcript levels, 117 leaderless genes and 86 non-coding RNAs (ncRNAs) were identified, representing diverse cellular functions and potential regulatory functions under the different growth conditions. Interestingly, we observed low GC content in ncRNAs for RNA-based regulation via unstructured forms or interaction with other cellular components. Further comparative analysis of T. onnurineus upstream regulatory sequences with those of closely related archaeal genomes demonstrated that transcription of orthologous genes is initiated by highly conserved promoter sequences; however, their upstream sequences for transcriptional and translational regulation are largely diverse. These results provide the genetic information of T. onnurineus for its future application in metabolic engineering. PMID:28216628

  15. The genome and transcriptome of the zoonotic hookworm Ancylostoma ceylanicum identify infection-specific gene families

    PubMed Central

    Schwarz, Erich M; Hu, Yan; Antoshechkin, Igor; Miller, Melanie M; Sternberg, Paul W; Aroian, Raffi V

    2015-01-01

    Hookworms infect over 400 million people, stunting and impoverishing them. Sequencing hookworm genomes and finding which genes they express during infection should help in devising new drugs or vaccines against hookworms. Unlike other hookworms, Ancylostoma ceylanicum infects both humans and other mammals, providing a laboratory model for hookworm disease. We determined an A. ceylanicum genome sequence of 313 Mb, with transcriptomic data throughout infection showing expression of 30,738 genes. Approximately 900 genes were upregulated during early infection in vivo, including ASPRs, a cryptic subfamily of activation-associated secreted proteins (ASPs). Genes downregulated during early infection included ion channels and G protein–coupled receptors; this downregulation was observed in both parasitic and free-living nematodes. Later, at the onset of heavy blood feeding, C-lectin genes were upregulated along with genes for secreted clade V proteins (SCVPs), encoding a previously undescribed protein family. These findings provide new drug and vaccine targets and should help elucidate hookworm pathogenesis. PMID:25730766

  16. Heavy metals induce oxidative stress and genome-wide modulation in transcriptome of rice root.

    PubMed

    Dubey, Sonali; Shri, Manju; Misra, Prashant; Lakhwani, Deepika; Bag, Sumit Kumar; Asif, Mehar H; Trivedi, Prabodh Kumar; Tripathi, Rudro Deo; Chakrabarty, Debasis

    2014-06-01

    Industrial growth, ecological disturbances and agricultural practices have contaminated the soil and water with many harmful compounds, including heavy metals. These heavy metals affect growth and development of plants as well as cause severe human health hazards through food chain contamination. In the past, studies have been carried out to identify biochemical and molecular networks associated with heavy metal toxicity and uptake in plants. These studies suggested that most of the physiological and molecular processes affected by different heavy metals are similar to those affected by other abiotic stresses. To identify common and unique responses to different metals, we have studied biochemical and genome-wide modulation in the transcriptome of rice (IR-64 cultivar) root after exposure to cadmium (Cd), arsenate [As(V)], lead (Pb) and chromium [Cr(VI)] in hydroponic conditions. We observed that root tissue shows variable responses of the antioxidant enzyme system to different heavy metals. Genome-wide expression analysis suggests that a variable number of genes are differentially expressed in root in response to As(V), Cd, Pb and Cr(VI) stresses. In addition to unique genes, each heavy metal modulated expression of a large number of common genes. The study also identified cis-acting regions of the promoters which can be determinants for the modulated expression of the genes in response to different heavy metals. Our study advances understanding related to various processes and networks which might be responsible for heavy metal stresses, accumulation and detoxification.

  17. Genomic and transcriptomic analysis of NDM-1 Klebsiella pneumoniae in spaceflight reveal mechanisms underlying environmental adaptability.

    PubMed

    Li, Jia; Liu, Fei; Wang, Qi; Ge, Pupu; Woo, Patrick C Y; Yan, Jinghua; Zhao, Yanlin; Gao, George F; Liu, Cui Hua; Liu, Changting

    2014-08-28

    The emergence and rapid spread of New Delhi Metallo-beta-lactamase-1 (NDM-1)-producing Klebsiella pneumoniae strains have caused great concern worldwide. To better understand the mechanisms underlying environmental adaptation of those highly drug-resistant K. pneumoniae strains, we took advantage of China's Shenzhou 10 spacecraft mission to conduct comparative genomic and transcriptomic analysis of an NDM-1 K. pneumoniae strain (ATCC BAA-2146) cultivated under different conditions. The samples were recovered from semisolid medium placed on the ground (D strain), in simulated space conditions (M strain), or in the Shenzhou 10 spacecraft (T strain) for analysis. Our data revealed multiple variations underlying pathogen adaptation to different environments in terms of changes in morphology, H2O2 tolerance and biofilm formation ability, genomic stability and regulation of metabolic pathways. Additionally, we found a few non-coding RNAs to be differentially regulated. The results are helpful for better understanding the adaptive mechanisms of drug-resistant bacterial pathogens.

  18. Genome-wide transcriptome profiling of radish (Raphanus sativus L.) in response to vernalization

    PubMed Central

    Wang, Shufen; Xu, Wenling; Liu, Xianxian

    2017-01-01

    Vernalization is a key process for premature bolting. Although many studies on vernalization have been reported, the molecular mechanism of vernalization is still largely unknown in radish. In this study, we sequenced the transcriptomes of radish seedlings at three different time points during vernalization. More than 36 million clean reads were generated for each sample and the portions mapped to the reference genome were all above 67.0%. Our results show that the differentially expressed genes (DEGs) between room temperature and the early stage of vernalization (4,845) are the most numerous among all treatment pairs. A series of vernalization-related genes, including two FLOWERING LOCUS C (FLC) genes, were screened according to the annotations. A total of 775 genes were also filtered as vernalization-related candidates based on their expression profiles. Cold stress responsive genes were also analyzed to further confirm the sequencing result. Several key genes in vernalization or cold stress response were validated by quantitative RT-PCR (RT-qPCR). This study identified a number of genes that may be involved in vernalization, which are useful for other functional genomics research in radish. PMID:28498850

  19. Genome-wide transcriptome profiling of radish (Raphanus sativus L.) in response to vernalization.

    PubMed

    Liu, Chen; Wang, Shufen; Xu, Wenling; Liu, Xianxian

    2017-01-01

    Vernalization is a key process for premature bolting. Although many studies on vernalization have been reported, the molecular mechanism of vernalization is still largely unknown in radish. In this study, we sequenced the transcriptomes of radish seedlings at three different time points during vernalization. More than 36 million clean reads were generated for each sample and the portions mapped to the reference genome were all above 67.0%. Our results show that the differentially expressed genes (DEGs) between room temperature and the early stage of vernalization (4,845) are the most numerous among all treatment pairs. A series of vernalization-related genes, including two FLOWERING LOCUS C (FLC) genes, were screened according to the annotations. A total of 775 genes were also filtered as vernalization-related candidates based on their expression profiles. Cold stress responsive genes were also analyzed to further confirm the sequencing result. Several key genes in vernalization or cold stress response were validated by quantitative RT-PCR (RT-qPCR). This study identified a number of genes that may be involved in vernalization, which are useful for other functional genomics research in radish.

  20. Genome-wide primary transcriptome analysis of H2-producing archaeon Thermococcus onnurineus NA1.

    PubMed

    Cho, Suhyung; Kim, Min-Sik; Jeong, Yujin; Lee, Bo-Rahm; Lee, Jung-Hyun; Kang, Sung Gyun; Cho, Byung-Kwan

    2017-02-20

    In spite of their pivotal roles in transcriptional and post-transcriptional processes, the regulatory elements of archaeal genomes are not yet fully understood. Here, we determine the primary transcriptome of the H2-producing archaeon Thermococcus onnurineus NA1. We identified 1,082 purine-rich transcription initiation sites along with well-conserved TATA box, A-rich B recognition element (BRE), and promoter proximal element (PPE) motif in promoter regions, a high pyrimidine nucleotide content (T/C) at the -1 position, and Shine-Dalgarno (SD) motifs (GGDGRD) in 5' untranslated regions (5' UTRs). Along with differential transcript levels, 117 leaderless genes and 86 non-coding RNAs (ncRNAs) were identified, representing diverse cellular functions and potential regulatory functions under the different growth conditions. Interestingly, we observed low GC content in ncRNAs for RNA-based regulation via unstructured forms or interaction with other cellular components. Further comparative analysis of T. onnurineus upstream regulatory sequences with those of closely related archaeal genomes demonstrated that transcription of orthologous genes is initiated by highly conserved promoter sequences; however, their upstream sequences for transcriptional and translational regulation are largely diverse. These results provide the genetic information of T. onnurineus for its future application in metabolic engineering.

  1. Population genomic footprints of fine-scale differentiation between habitats in Mediterranean blue tits.

    PubMed

    Szulkin, M; Gagnaire, P-A; Bierne, N; Charmantier, A

    2016-01-01

    Linking population genetic variation to the spatial heterogeneity of the environment is of fundamental interest to evolutionary biology and ecology, in particular when phenotypic differences between populations are observed at biologically small spatial scales. Here, we applied restriction-site associated DNA sequencing (RAD-Seq) to test whether phenotypically differentiated populations of wild blue tits (Cyanistes caeruleus) breeding in a highly heterogeneous environment exhibit genetic structure related to habitat type. Using 12 106 SNPs in 197 individuals from deciduous and evergreen oak woodlands, we applied complementary population genomic analyses, which revealed that genetic variation is influenced by both geographical distance and habitat type. A fine-scale genetic differentiation supported by genome- and transcriptome-wide analyses was found within Corsica, between two adjacent habitats where blue tits exhibit marked differences in breeding time while nesting < 6 km apart. Using redundancy analysis (RDA), we show that genomic variation remains associated with habitat type when controlling for spatial and temporal effects. Finally, our results suggest that the observed patterns of genomic differentiation were not driven by a small proportion of highly differentiated loci, but rather emerged through a process such as habitat choice, which reduces gene flow between habitats across the entire genome. The pattern of genomic isolation-by-environment closely matches differentiation observed at the phenotypic level, thereby offering significant potential for future inference of phenotype-genotype associations in a heterogeneous environment.

  2. The Genomic HyperBrowser: an analysis web server for genome-scale data.

    PubMed

    Sandve, Geir K; Gundersen, Sveinung; Johansen, Morten; Glad, Ingrid K; Gunathasan, Krishanthi; Holden, Lars; Holden, Marit; Liestøl, Knut; Nygård, Ståle; Nygaard, Vegard; Paulsen, Jonas; Rydbeck, Halfdan; Trengereid, Kai; Clancy, Trevor; Drabløs, Finn; Ferkingstad, Egil; Kalas, Matús; Lien, Tonje; Rye, Morten B; Frigessi, Arnoldo; Hovig, Eivind

    2013-07-01

    The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens up a range of genomic investigations related to, e.g., gene regulation, disease association or epigenetic modifications of the genome.

  3. The first Chameleon transcriptome: comparative genomic analysis of the OXPHOS system reveals loss of COX8 in Iguanian lizards.

    PubMed

    Bar-Yaacov, Dan; Bouskila, Amos; Mishmar, Dan

    2013-01-01

    Recently, we found dramatic mitochondrial DNA divergence of Israeli Chamaeleo chamaeleon populations into two geographically distinct groups. We aimed to examine whether the same pattern of divergence could be found in nuclear genes. However, no genomic resource is available for any chameleon species. Here we present the first chameleon transcriptome, obtained using deep sequencing (SOLiD). Our analysis identified 164,000 sequence contigs of which 19,000 yielded unique BlastX hits. To test the efficacy of our sequencing effort, we examined whether the chameleon and other available reptilian transcriptomes harbored complete sets of genes comprising known biochemical pathways, focusing on the nDNA-encoded oxidative phosphorylation (OXPHOS) genes as a model. As a reference for the screen, we used the 86 known human structural nDNA-encoded OXPHOS subunits (including isoforms). Analysis of 34 publicly available vertebrate transcriptomes revealed orthologs for most human OXPHOS genes. However, OXPHOS subunit COX8 (Cytochrome C oxidase subunit 8), including all its known isoforms, was consistently absent in transcriptomes of iguanian lizards, implying loss of this subunit during the radiation of this suborder. The lack of COX8 in the suborder Iguania is intriguing, since it is important for cellular respiration and ATP production. Our sequencing effort added a new resource for comparative genomic studies, and shed new light on the evolutionary dynamics of the OXPHOS system.

  4. The First Chameleon Transcriptome: Comparative Genomic Analysis of the OXPHOS System Reveals Loss of COX8 in Iguanian Lizards

    PubMed Central

    Bar-Yaacov, Dan; Bouskila, Amos; Mishmar, Dan

    2013-01-01

    Recently, we found dramatic mitochondrial DNA divergence of Israeli Chamaeleo chamaeleon populations into two geographically distinct groups. We aimed to examine whether the same pattern of divergence could be found in nuclear genes. However, no genomic resource is available for any chameleon species. Here we present the first chameleon transcriptome, obtained using deep sequencing (SOLiD). Our analysis identified 164,000 sequence contigs of which 19,000 yielded unique BlastX hits. To test the efficacy of our sequencing effort, we examined whether the chameleon and other available reptilian transcriptomes harbored complete sets of genes comprising known biochemical pathways, focusing on the nDNA-encoded oxidative phosphorylation (OXPHOS) genes as a model. As a reference for the screen, we used the 86 known human structural nDNA-encoded OXPHOS subunits (including isoforms). Analysis of 34 publicly available vertebrate transcriptomes revealed orthologs for most human OXPHOS genes. However, OXPHOS subunit COX8 (cytochrome c oxidase subunit 8), including all its known isoforms, was consistently absent in transcriptomes of iguanian lizards, implying loss of this subunit during the radiation of this suborder. The lack of COX8 in the suborder Iguania is intriguing, since it is important for cellular respiration and ATP production. Our sequencing effort added a new resource for comparative genomic studies, and shed new light on the evolutionary dynamics of the OXPHOS system. PMID:24009133

  5. Transcriptomic signature to oxidative stress exposure at the time of embryonic genome activation in bovine blastocysts.

    PubMed

    Cagnone, Gael L M; Sirard, Marc-André

    2013-04-01

    In order to understand how in vitro culture affects embryonic quality, we analyzed survival and global gene expression in bovine blastocysts after exposure to increased oxidative stress conditions. Two pro-oxidant agents, one that acts extracellularly by promoting reactive oxygen species (ROS) production (0.01 mM 2,2'-azobis (2-amidinopropane) dihydrochloride [AAPH]) or another that acts intracellularly by inhibiting glutathione synthesis (0.4 mM buthionine sulfoximine [BSO]) were added separately to in vitro culture media from Day 3 (8-16-cell stage) onward. Transcriptomic analysis was then performed on resulting Day-7 blastocysts. In the literature, these two pro-oxidant conditions were shown to induce delayed degeneration in a proportion of Day-8 blastocysts. In our experiment, no morphological difference was visible, but AAPH tended to decrease the blastocyst rate while BSO significantly reduced it, indicating a differential impact on the surviving population. At the transcriptomic level, blastocysts that survived either pro-oxidant exposure showed oxidative stress and an inflammatory response (ARRB2), although AAPH induced higher disturbances in cellular homeostasis (SERPINE1). Functional genomics of the BSO profile, however, identified differential expression of genes related to glycine metabolism and energy metabolism (TPI1). These differential features might be indicative of pre-degenerative blastocysts (IGFBP7) in the AAPH population whereas BSO exposure would select the most viable individuals (TKDP1). Together, these results illustrate how oxidative disruption of pre-attachment development is associated with systematic up-regulation of several metabolic markers. Moreover, it indicates that a better capacity to survive anti-oxidant depletion may allow for the survival of blastocysts with a quieter metabolism after compaction. Copyright © 2013 Wiley Periodicals, Inc.

  6. Genome-wide transcriptome profiling reveals novel insights into Luffa cylindrica browning.

    PubMed

    Chen, Xia; Tan, Taiming; Xu, Changcheng; Huang, Shuping; Tan, Jie; Zhang, Min; Wang, Chunli; Xie, Conghua

    2015-08-07

    Luffa cylindrica (sponge gourd) is one of the most popular vegetables in China. Production and consumption of L. cylindrica are limited due to postharvest browning; however, little is known about the genetic regulation of the browning process. In the present study, transcriptome profiles of L. cylindrica cultivars, YLB05 (browning resistant) and XTR05 (browning sensitive), were analyzed using next-generation sequencing to clarify the genes and mechanisms associated with browning. A total of 9.1 Gb of valid data including 116,703 unigenes (>200 bp) were obtained and 39,473 sequences were annotated by alignment against five public databases. Of these, there were 27,407 genes assigned to 747 Gene Ontology functional categories; and 12,350 genes were annotated with 25 Eukaryotic Orthologous Groups (KOG) categories with 343 KOG functional terms. Additionally, by searching against the Kyoto Encyclopedia of Genes and Genomes database, 8689 unigenes were mapped to 189 pathways. Furthermore, there were 24,556 sequences found to be differentially regulated, including 4344 annotated unigenes. Several genes potentially associated with phenolic oxidation, carbohydrate and hormone metabolism were found differentially regulated between the cultivars of different browning sensitivities. Our results suggest that elements involved in enzymatic processes and other pathways might be responsible for L. cylindrica browning. The present study provides a comprehensive transcriptome sequence resource, which will facilitate further studies on gene discovery and exploiting the fruit browning mechanism of L. cylindrica. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Genome-wide transcriptome analysis of expression in rice seedling roots in response to supplemental nitrogen.

    PubMed

    Chandran, Anil Kumar Nalini; Priatama, Ryza A; Kumar, Vikranth; Xuan, Yuanhu; Je, Byoung Il; Kim, Chul Min; Jung, Ki-Hong; Han, Chang-Deok

    2016-08-01

    Nitrogen (N) is the most important macronutrient for plant growth and grain yields. For rice crops, nitrate and ammonium are the major N sources. To explore the genomic responses to ammonium supplements in rice roots, we used 17-day-old seedlings grown in the absence of external N that were then exposed to 0.5 mM (NH4)2SO4 for 3 h. Transcriptomic profiles were examined by microarray experiments. In all, 634 genes were up-regulated at least two-fold by the N-supplement when compared with expression in roots from untreated control plants. Gene Ontology (GO) enrichment analysis revealed that those upregulated genes are associated with 23 GO terms. Among them, metabolic processes for diverse amino acids (i.e., aspartate, threonine, tryptophan, glutamine, L-phenylalanine, and thiamin) as well as nitrogen compounds are highly over-represented, demonstrating that our selected genes are suitable for studying the N-response in roots. This enrichment analysis also indicated that nitrogen is closely linked to diverse transporter activities by primary metabolites, including proteins (amino acids), lipids, and carbohydrates, and is associated with carbohydrate catabolism and cell wall organization. Integration of results from omics analysis of metabolic pathways and transcriptome data using the MapMan tool suggested that the TCA cycle and pathway for mitochondrial electron transport are co-regulated when rice roots are exposed to ammonium. We also investigated the expression of N-responsive marker genes by performing a comparative analysis with root samples from plants grown under different NH4+ treatments. The diverse responses to such treatment provide useful insight into the global changes related to the shift from an N-deficiency to an enhanced N-supply in rice, a model crop plant. Copyright © 2016 Elsevier GmbH. All rights reserved.

  8. Wheat miRNA ancestors: evident by transcriptome analysis of A, B, and D genome donors.

    PubMed

    Alptekin, Burcu; Budak, Hikmet

    2017-05-01

    MicroRNAs are critical players in post-transcriptional gene regulation with profound effects on the fundamental processes of cellular life. Their identification and characterization, together with their targets, hold great significance in exploring and exploiting their roles in a functional context, providing valuable clues into the regulation of important biological processes, such as stress tolerance or environmental adaptation. Wheat is a hardy crop, extensively harvested in temperate regions, and is a major component of the human diet. With the advent of next generation sequencing technologies considerably decreasing sequencing costs per base-pair, genomic and transcriptomic data from several wheat species, including the progenitors and wild relatives, have become available. In this study, we performed in silico identification and comparative analysis of microRNA repertoires of bread wheat (Triticum aestivum L.) and its diploid progenitors and relatives, Aegilops sharonensis, Aegilops speltoides, Aegilops tauschii, Triticum monococcum, and Triticum urartu, through the utilization of publicly available transcriptomic data. Over 200 miRNA families were identified, the majority of which have not previously been reported. Ancestral relationships expanded our understanding of wheat miRNA evolution, while T. monococcum miRNAs delivered important clues on the effects of domestication on miRNA expression. Comparative analyses on wild Ae. sharonensis accessions highlighted candidate miRNAs that can be linked to stress tolerance. The miRNA repertoires of bread wheat and its diploid progenitors and relatives provide important insight into the diversification and distribution of miRNA genes, which should contribute to the elucidation of miRNA evolution in the Poaceae family. A thorough understanding of the convergent and divergent expression profiles of miRNAs in different genetic backgrounds can provide unique opportunities for modulating gene regulation for better crop

  9. Integrative Analysis of Genomics and Transcriptome Data to Identify Potential Functional Genes of BMDs in Females.

    PubMed

    Chen, Yuan-Cheng; Guo, Yan-Fang; He, Hao; Lin, Xu; Wang, Xia-Fang; Zhou, Rou; Li, Wen-Ting; Pan, Dao-Yan; Shen, Jie; Deng, Hong-Wen

    2016-05-01

    Osteoporosis is known to be highly heritable. However, to date, the findings from more than 20 genome-wide association studies (GWASs) have explained less than 6% of genetic risks. Studies suggest that the missing heritability data may be because of joint effects among genes. To identify novel heritability for osteoporosis, we performed a system-level study on bone mineral density (BMD) by weighted gene coexpression network analysis (WGCNA), using the largest GWAS data set for BMD in the field, Genetic Factors for Osteoporosis Consortium (GEFOS-2), and a transcriptomic gene expression data set generated from transiliac bone biopsies in women. A weighted gene coexpression network was generated for 1574 genes with GWAS nominal evidence of association (p ≤ 0.05) based on dissimilarity measurement on the expression data. Twelve distinct gene modules were identified, and four modules showed nominally significant associations with BMD (p ≤ 0.05), but only one module, the yellow module, demonstrated a good correlation between module membership (MM) and gene significance (GS), suggesting that the yellow module serves an important biological role in bone regulation. Interestingly, through characterization of module content and topology, the yellow module was found to be significantly enriched with contractile fiber part (GO:044449), which is widely recognized as having a close relationship between muscle and bone. Furthermore, detailed submodule analyses of important candidate genes (HOMER1, SPTBN1) by all edges within the yellow module implied significant enrichment of functional connections between bone and cytoskeletal protein binding. Our study yielded novel information from system genetics analyses of GWAS data jointly with transcriptomic data. The findings highlighted a module and several genes in the model as playing important roles in the regulation of bone mass in females, which may yield novel insights into the genetic basis of osteoporosis. © 2016

  10. Genome wide transcriptome analysis of dendritic cells identifies genes with altered expression in psoriasis.

    PubMed

    Filkor, Kata; Hegedűs, Zoltán; Szász, András; Tubak, Vilmos; Kemény, Lajos; Kondorosi, Éva; Nagy, István

    2013-01-01

    Activation of dendritic cells by different pathogens induces the secretion of proinflammatory mediators resulting in local inflammation. Importantly, innate immunity must be properly controlled, as its continuous activation leads to the development of chronic inflammatory diseases such as psoriasis. Lipopolysaccharide (LPS)- or peptidoglycan (PGN)-induced tolerance, a phenomenon of transient unresponsiveness of cells to repeated or prolonged stimulation, has proved a valuable model for the study of chronic inflammation. Thus, the aim of this study was the identification of the transcriptional diversity of primary human immature dendritic cells (iDCs) upon PGN-induced tolerance. Using the SAGE-Seq approach, a tag-based transcriptome sequencing method, we investigated gene expression changes of primary human iDCs upon stimulation or restimulation with Staphylococcus aureus-derived PGN, a widely used TLR2 ligand. Based on the expression pattern of the altered genes, we identified non-tolerizeable and tolerizeable genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed marked enrichment of immune-, cell cycle- and apoptosis-related genes. In parallel to the marked induction of proinflammatory mediators, negative feedback regulators of innate immunity, such as TNFAIP3, TNFAIP8, Tyro3 and Mer, are markedly downregulated in tolerant cells. We also demonstrate that the expression pattern of TNFAIP3 and TNFAIP8 is altered in both lesional and non-lesional skin of psoriatic patients. Finally, we show that pretreatment of immature dendritic cells with anti-TNF-α inhibits the expression of IL-6 and CCL1 in tolerant iDCs and partially releases the suppression of TNFAIP8. Our findings suggest that after PGN stimulation/restimulation the host cell utilizes different mechanisms in order to maintain a critical balance between inflammation and tolerance. Importantly, the transcriptome sequencing of stimulated/restimulated iDCs identified numerous genes with

  11. Genome-Scale Variation of Tubeworm Symbionts

    NASA Astrophysics Data System (ADS)

    Robidart, J.; Felbeck, H.

    2005-12-01

    Hydrothermal vent tubeworms are completely dependent on their bacterial symbionts for nutrition. Despite this dependency, many studies have concluded that bacterial symbionts are acquired anew from the environment every generation, rather than through the more reliable mode of symbiont transmission directly from parent to offspring. Ribosomal 16S sequences have shown little variation of symbiont phylogeny from worm to worm, but higher resolution genome-scale analyses have found that there is genomic heterogeneity between symbionts from worms in different environments. What genes can be "spared," while resulting in an intact symbiosis? Have symbionts from one environment gained physiological capabilities that make them more fit in that environment? In order to answer these questions, subtractive hybridization was used on symbionts of Riftia pachyptila tubeworms from different environments to gain insight into which genes are present in one symbiont and absent in the other. Many genes were found to be unique to each symbiont and these results will be presented. This technique will be applied to answer many fundamental questions regarding microbial symbiont evolution to a specific physico-chemical environment, to a different host species, and more.

  12. The draft genome, transcriptome, and microbiome of Dermatophagoides farinae reveal a broad spectrum of dust mite allergens.

    PubMed

    Chan, Ting-Fung; Ji, Kun-Mei; Yim, Aldrin Kay-Yuen; Liu, Xiao-Yu; Zhou, Jun-Wei; Li, Rui-Qi; Yang, Kevin Yi; Li, Jing; Li, Meng; Law, Patrick Tik-Wan; Wu, Yu-Lan; Cai, Ze-Lang; Qin, Hao; Bao, Ying; Leung, Ross Ka-Kit; Ng, Patrick Kwok-Shing; Zou, Ju; Zhong, Xiao-Jun; Ran, Pi-Xin; Zhong, Nan-Shan; Liu, Zhi-Gang; Tsui, Stephen Kwok-Wing

    2015-02-01

    A sequenced house dust mite (HDM) genome would advance our understanding of HDM allergens, a common cause of human allergies. We sought to produce an annotated Dermatophagoides farinae draft genome and develop a combined genomic-transcriptomic-proteomic approach for elucidation of HDM allergens. A D farinae draft genome and transcriptome were assembled with high-throughput sequencing, accommodating microbiome sequences. The allergen gene structures were validated by means of Sanger sequencing. The mite's microbiome composition was determined, and the predominant genus was validated immunohistochemically. The allergenicity of a ubiquinol-cytochrome c reductase binding protein homologue was evaluated with immunoblotting, immunosorbent assays, and skin prick tests. The full gene structures of 20 canonical allergens and 7 noncanonical allergen homologues were produced. A novel major allergen, ubiquinol-cytochrome c reductase binding protein-like protein, was found and designated Der f 24. All 40 sera samples from patients with mite allergy had IgE antibodies against rDer f 24. Of 10 patients tested, 5 had positive skin reactions. The predominant bacterial genus among 100 identified species was Enterobacter (63.4%). An intron was found in the 13.8-kDa D farinae bacteriolytic enzyme gene, indicating that it is of HDM origin. The Kyoto Encyclopedia of Genes and Genomes pathway analysis revealed a phototransduction pathway in D farinae, as well as thiamine and amino acid synthesis pathways, which is suggestive of an endosymbiotic relationship between D farinae and its microbiome. An HDM genome draft produced from genomic, transcriptomic, and proteomic experiments revealed allergen genes and a diverse endosymbiotic microbiome, providing a tool for further identification and characterization of HDM allergens and development of diagnostics and immunotherapeutic vaccines. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  13. Transcriptomics and molecular evolutionary rate analysis of the bladderwort (Utricularia), a carnivorous plant with a minimal genome

    PubMed Central

    2011-01-01

    Background: The carnivorous plant Utricularia gibba (bladderwort) is remarkable in having a minute genome, which at ca. 80 megabases is approximately half that of Arabidopsis. Bladderworts show an incredible diversity of forms surrounding a defined theme: tiny, bladder-like suction traps on terrestrial, epiphytic, or aquatic plants with a diversity of unusual vegetative forms. Utricularia plants, which are rootless, are also anomalous in physiological features (respiration and carbon distribution), and highly enhanced molecular evolutionary rates in chloroplast, mitochondrial and nuclear ribosomal sequences. Despite great interest in the genus, no genomic resources exist for Utricularia, and the substitution rate increase has received limited study. Results: Here we describe the sequencing and analysis of the Utricularia gibba transcriptome. Three different organs were surveyed, the traps, the vegetative shoot bodies, and the inflorescence stems. We also examined the bladderwort transcriptome under diverse stress conditions. We detail aspects of functional classification, tissue similarity, nitrogen and phosphorus metabolism, respiration, DNA repair, and detoxification of reactive oxygen species (ROS). Long contigs of plastid and mitochondrial genomes, as well as sequences for 100 individual nuclear genes, were compared with those of other plants to better establish information on molecular evolutionary rates. Conclusion: The Utricularia transcriptome provides a detailed genomic window into processes occurring in a carnivorous plant. It contains a deep representation of the complex metabolic pathways that characterize a putative minimal plant genome, permitting its use as a source of genomic information to explore the structural, functional, and evolutionary diversity of the genus. Vegetative shoots and traps are the most similar organs by functional classification of their transcriptome, the traps expressing hydrolytic enzymes for prey digestion that were previously

  14. Transcriptomics and molecular evolutionary rate analysis of the bladderwort (Utricularia), a carnivorous plant with a minimal genome.

    PubMed

    Ibarra-Laclette, Enrique; Albert, Victor A; Pérez-Torres, Claudia A; Zamudio-Hernández, Flor; Ortega-Estrada, María de J; Herrera-Estrella, Alfredo; Herrera-Estrella, Luis

    2011-06-03

    The carnivorous plant Utricularia gibba (bladderwort) is remarkable in having a minute genome, which at ca. 80 megabases is approximately half that of Arabidopsis. Bladderworts show an incredible diversity of forms surrounding a defined theme: tiny, bladder-like suction traps on terrestrial, epiphytic, or aquatic plants with a diversity of unusual vegetative forms. Utricularia plants, which are rootless, are also anomalous in physiological features (respiration and carbon distribution), and highly enhanced molecular evolutionary rates in chloroplast, mitochondrial and nuclear ribosomal sequences. Despite great interest in the genus, no genomic resources exist for Utricularia, and the substitution rate increase has received limited study. Here we describe the sequencing and analysis of the Utricularia gibba transcriptome. Three different organs were surveyed, the traps, the vegetative shoot bodies, and the inflorescence stems. We also examined the bladderwort transcriptome under diverse stress conditions. We detail aspects of functional classification, tissue similarity, nitrogen and phosphorus metabolism, respiration, DNA repair, and detoxification of reactive oxygen species (ROS). Long contigs of plastid and mitochondrial genomes, as well as sequences for 100 individual nuclear genes, were compared with those of other plants to better establish information on molecular evolutionary rates. The Utricularia transcriptome provides a detailed genomic window into processes occurring in a carnivorous plant. It contains a deep representation of the complex metabolic pathways that characterize a putative minimal plant genome, permitting its use as a source of genomic information to explore the structural, functional, and evolutionary diversity of the genus. Vegetative shoots and traps are the most similar organs by functional classification of their transcriptome, the traps expressing hydrolytic enzymes for prey digestion that were previously thought to be encoded by

  15. Genomic and transcriptomic insights into the thermo-regulated biosynthesis of validamycin in Streptomyces hygroscopicus 5008

    PubMed Central

    2012-01-01

    Background: Streptomyces hygroscopicus 5008 has been used for the production of the antifungal validamycin/jinggangmycin for more than 40 years. A high yield of validamycin is achieved by culturing the strain at 37°C, rather than at 30°C for normal growth and sporulation. The mechanism(s) of its thermo-regulated biosynthesis was largely unknown. Results: The 10,383,684-bp genome of strain 5008 was completely sequenced and is composed of a linear chromosome, a 164.57-kb linear plasmid, and a 73.28-kb circular plasmid. Compared with other Streptomyces genomes, the chromosome of strain 5008 has a smaller core region and shorter terminal inverted repeats, and encodes more α/β hydrolases, major facilitator superfamily transporters, and Mg2+/Mn2+-dependent regulatory phosphatases. Transcriptomic analysis revealed that the expression of 7.5% of coding sequences was increased at 37°C, including biosynthetic genes for validamycin and three other secondary metabolites. At 37°C, a glutamate dehydrogenase was transcriptionally up-regulated, and its involvement in validamycin production was further demonstrated by gene replacement. Moreover, efficient synthesis and utilization of intracellular glutamate were noticed in strain 5008 at 37°C, revealing glutamate as the nitrogen source for validamycin biosynthesis. Furthermore, a SARP-family regulatory gene with enhanced transcription at 37°C was identified and confirmed to be positively involved in the thermo-regulation of validamycin production by gene inactivation and transcriptional analysis. Conclusions: Strain 5008 seemed to have evolved with specific genomic components to facilitate the thermo-regulated validamycin biosynthesis. The data obtained here will facilitate future studies for validamycin yield improvement and industrial bioprocess optimization. PMID:22827618

  16. Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane.

    PubMed

    Taniguti, Lucas M; Schaker, Patricia D C; Benevenuto, Juliana; Peters, Leila P; Carvalho, Giselle; Palhares, Alessandra; Quecine, Maria C; Nunes, Filipe R S; Kmit, Maria C P; Wai, Alvan; Hausner, Georg; Aitken, Karen S; Berkman, Paul J; Fraser, James A; Moolhuijzen, Paula M; Coutinho, Luiz L; Creste, Silvana; Vieira, Maria L C; Kitajima, João P; Monteiro-Vitorello, Claudia B

    2015-01-01

    Sporisorium scitamineum is a biotrophic fungus responsible for sugarcane smut, a disease that has spread worldwide. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis with previously available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of an estimated 130 copies. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNAseq data were also used to support gene predictions.

  17. Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane

    PubMed Central

    Benevenuto, Juliana; Peters, Leila P.; Carvalho, Giselle; Palhares, Alessandra; Quecine, Maria C.; Nunes, Filipe R. S.; Kmit, Maria C. P.; Wai, Alvan; Hausner, Georg; Aitken, Karen S.; Berkman, Paul J.; Fraser, James A.; Moolhuijzen, Paula M.; Coutinho, Luiz L.; Creste, Silvana; Vieira, Maria L. C.; Kitajima, João P.; Monteiro-Vitorello, Claudia B.

    2015-01-01

    Sporisorium scitamineum is a biotrophic fungus responsible for sugarcane smut, a disease that has spread worldwide. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis with previously available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of an estimated 130 copies. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNAseq data were also used to support gene predictions. PMID:26065709

  18. From genes to milk: genomic organization and epigenetic regulation of the mammary transcriptome.

    PubMed

    Lemay, Danielle G; Pollard, Katherine S; Martin, William F; Freeman Zadrowski, Courtneay; Hernandez, Joseph; Korf, Ian; German, J Bruce; Rijnkels, Monique

    2013-01-01

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin state contributes to the co-regulation of gene neighborhoods. The mammary gland represents a unique evolutionary model, due to its recent appearance, in the context of vertebrate genomes. An understanding of how the mammary gland is regulated to produce milk is also of biomedical and agricultural importance for human lactation and dairying. Here, we integrate epigenomic and transcriptomic data to develop a comprehensive regulatory model. Neighborhoods of mammary-expressed genes were determined using expression data derived from pregnant and lactating mice and a neighborhood scoring tool, G-NEST. Regions of open and closed chromatin were identified by ChIP-Seq of histone modifications H3K36me3, H3K4me2, and H3K27me3 in the mouse mammary gland and liver tissue during lactation. We found that neighborhoods of genes in regions of uniquely active chromatin in the lactating mammary gland, compared with liver tissue, were extremely rare. Rather, genes in most neighborhoods were suppressed during lactation as reflected in their expression levels and their location in regions of silenced chromatin. Chromatin silencing was largely shared between the liver and mammary gland during lactation, and what distinguished the mammary gland was mainly a small tissue-specific repertoire of isolated, expressed genes. These findings suggest that an advantage of the neighborhood organization is in the collective repression of groups of genes via a shared mechanism of chromatin repression. Genes essential to the mammary gland's uniqueness are isolated from neighbors, and likely have less tolerance for variation in expression, properties they share with genes responsible for an organism's survival.

  19. From Genes to Milk: Genomic Organization and Epigenetic Regulation of the Mammary Transcriptome

    PubMed Central

    Lemay, Danielle G.; Pollard, Katherine S.; Martin, William F.; Freeman Zadrowski, Courtneay; Hernandez, Joseph; Korf, Ian; German, J. Bruce; Rijnkels, Monique

    2013-01-01

    Even in genomes lacking operons, a gene's position in the genome influences its potential for expression. The mechanisms by which adjacent genes are co-expressed are still not completely understood. Using lactation and the mammary gland as a model system, we explore the hypothesis that chromatin state contributes to the co-regulation of gene neighborhoods. The mammary gland represents a unique evolutionary model, due to its recent appearance, in the context of vertebrate genomes. An understanding of how the mammary gland is regulated to produce milk is also of biomedical and agricultural importance for human lactation and dairying. Here, we integrate epigenomic and transcriptomic data to develop a comprehensive regulatory model. Neighborhoods of mammary-expressed genes were determined using expression data derived from pregnant and lactating mice and a neighborhood scoring tool, G-NEST. Regions of open and closed chromatin were identified by ChIP-Seq of histone modifications H3K36me3, H3K4me2, and H3K27me3 in the mouse mammary gland and liver tissue during lactation. We found that neighborhoods of genes in regions of uniquely active chromatin in the lactating mammary gland, compared with liver tissue, were extremely rare. Rather, genes in most neighborhoods were suppressed during lactation as reflected in their expression levels and their location in regions of silenced chromatin. Chromatin silencing was largely shared between the liver and mammary gland during lactation, and what distinguished the mammary gland was mainly a small tissue-specific repertoire of isolated, expressed genes. These findings suggest that an advantage of the neighborhood organization is in the collective repression of groups of genes via a shared mechanism of chromatin repression. 
Genes essential to the mammary gland's uniqueness are isolated from neighbors, and likely have less tolerance for variation in expression, properties they share with genes responsible for an organism's survival.

  20. Genomics and transcriptomics characterization of genes expressed during postharvest at 4°C by the edible basidiomycete Pleurotus ostreatus.

    PubMed

    Ramírez, Lucía; Oguiza, José Antonio; Pérez, Gúmer; Lavín, José Luis; Omarini, Alejandra; Santoyo, Francisco; Alfaro, Manuel; Castanera, Raúl; Parenti, Alejandra; Muguerza, Elaia; Pisabarro, Antonio G

    2011-06-01

    Pleurotus ostreatus is an industrially cultivated basidiomycete with nutritional and environmental applications. Its genome, which was sequenced by the Joint Genome Institute, has become a model for lignin degradation and for fungal genomics and transcriptomics studies. The complete P. ostreatus genome contains 35 Mbp organized in 11 chromosomes, and two different haploid genomes have been individually sequenced. In this work, genomics and transcriptomics approaches were employed in the study of P. ostreatus under different physiological conditions. Specifically, we analyzed a collection of expressed sequence tags (EST) obtained from cut fruit bodies that had been stored at 4°C for 7 days (postharvest conditions). Studies of the 253 expressed clones that had been automatically and manually annotated provided a detailed picture of the life characteristics of the self-sustained fruit bodies. The results suggested a complex metabolism in which autophagy, RNA metabolism, and protein and carbohydrate turnover are increased. Genes involved in environment sensing and morphogenesis were expressed under these conditions. The data improve our understanding of the decay process in postharvest mushrooms and highlight the use of high-throughput techniques to construct models of living organisms subjected to different environmental conditions.

  1. Genome and transcriptome analysis of the Mesoamerican common bean and the role of gene duplications in establishing tissue and temporal specialization of genes.

    PubMed

    Vlasova, Anna; Capella-Gutiérrez, Salvador; Rendón-Anaya, Martha; Hernández-Oñate, Miguel; Minoche, André E; Erb, Ionas; Câmara, Francisco; Prieto-Barja, Pablo; Corvelo, André; Sanseverino, Walter; Westergaard, Gastón; Dohm, Juliane C; Pappas, Georgios J; Saburido-Alvarez, Soledad; Kedra, Darek; Gonzalez, Irene; Cozzuto, Luca; Gómez-Garrido, Jessica; Aguilar-Morón, María A; Andreu, Nuria; Aguilar, O Mario; Garcia-Mas, Jordi; Zehnsdorf, Maik; Vázquez, Martín P; Delgado-Salinas, Alfonso; Delaye, Luis; Lowy, Ernesto; Mentaberry, Alejandro; Vianello-Brondani, Rosana P; García, José Luís; Alioto, Tyler; Sánchez, Federico; Himmelbauer, Heinz; Santalla, Marta; Notredame, Cedric; Gabaldón, Toni; Herrera-Estrella, Alfredo; Guigó, Roderic

    2016-02-25

    Legumes are the third largest family of angiosperms and the second most important crop class. Legume genomes have been shaped by extensive large-scale gene duplications, including an approximately 58 million year old whole genome duplication shared by most crop legumes. We report the genome and the transcription atlas of coding and non-coding genes of a Mesoamerican genotype of common bean (Phaseolus vulgaris L., BAT93). Using a comprehensive phylogenomics analysis, we assessed the past and recent evolution of common bean, and traced the diversification of patterns of gene expression following duplication. We find that successive rounds of gene duplications in legumes have shaped tissue and developmental expression, leading to increased levels of specialization in larger gene families. We also find that many long non-coding RNAs are preferentially expressed in germ-line-related tissues (pods and seeds), suggesting that they play a significant role in fruit development. Our results also suggest that most bean-specific gene family expansions, including resistance gene clusters, predate the split of the Mesoamerican and Andean gene pools. The genome and transcriptome data herein generated for a Mesoamerican genotype represent a counterpart to the genomic resources already available for the Andean gene pool. Altogether, this information will allow the genetic dissection of the characters involved in the domestication and adaptation of the crop, and their further implementation in breeding strategies for this important crop.

  2. Integration of Ixodes ricinus genome sequencing with transcriptome and proteome annotation of the naïve midgut.

    PubMed

    Cramaro, Wibke J; Revets, Dominique; Hunewald, Oliver E; Sinner, Regina; Reye, Anna L; Muller, Claude P

    2015-10-28

    In Europe, Ixodes ricinus ticks are the most important vectors of diseases threatening humans, livestock, wildlife and companion animals. Nevertheless, genomic sequence information is missing and functional annotation of transcripts and proteins is limited. This lack of information is restricting studies of the vector and its interactions with pathogens and hosts. Here we present and integrate the first analysis of the I. ricinus genome with the transcriptome and proteome of the unfed I. ricinus midgut. Whole genome sequencing was performed on I. ricinus ticks and the sequences were de novo assembled. In parallel, I. ricinus ticks were dissected and the midgut transcriptome sequenced. Both datasets were integrated by transcript discovery analysis to identify putative genes, and genome contigs were screened for homology. An alignment-based and a motif-search-based approach were combined for the annotation of the midgut transcriptome. Additionally, midgut proteins were identified and annotated by mass spectrometry with public databases and the in-house built transcriptome database as references, and results were cross-validated. The de novo assembly of 1 billion DNA sequences to a reference genome of 393 Mb length provides an unprecedented insight into the I. ricinus genome. A homology search revealed sequences in the assembled genome contigs homologous to 89% of the I. scapularis genome scaffolds, indicating coverage of most genome regions. Moreover, we identified 6,415 putative genes. More than 10,000 transcripts from naïve midgut were annotated with respect to predicted function and/or cellular localization. By combining an alignment-based with a motif-search-based annotation approach, we doubled the number of annotations throughout all functional categories. In addition, 574 gel spots were significantly identified by mass spectrometry (p<0.05) and 285 distinct proteins expressed in the naïve midgut were annotated functionally and/or for cellular localization. Our

  3. Comparative genomic and transcriptomic analyses reveal habitat differentiation and different transcriptional responses during pectin metabolism in Alishewanella species.

    PubMed

    Jung, Jaejoon; Park, Woojun

    2013-10-01

    Alishewanella species are expected to have high adaptability to diverse environments because they are isolated from different natural habitats. To investigate how the evolutionary history of Alishewanella species is reflected in their genomes, we performed comparative genomic and transcriptomic analyses of A. jeotgali, A. aestuarii, and A. agri, which were isolated from fermented seafood, tidal flat sediment, and soil, respectively. Genomic islands with variable GC contents indicated that invasion of prophage and transposition events occurred in A. jeotgali and A. agri but not in A. aestuarii. Habitat differentiation of A. agri from a marine environment to a terrestrial environment was proposed because the species-specific genes of A. agri were similar to those of soil bacteria, whereas those of A. jeotgali and A. aestuarii were more closely related to marine bacteria. Comparative transcriptomic analysis with pectin as a sole carbon source revealed different transcriptional responses in Alishewanella species, especially in oxidative stress-, methylglyoxal detoxification-, membrane maintenance-, and protease/chaperone activity-related genes. Transcriptomic and experimental data demonstrated that A. agri had a higher pectin degradation rate and more resistance to oxidative stress under pectin-amended conditions than the other 2 Alishewanella species. However, expression patterns of genes in the pectin metabolic pathway and of glyoxylate bypass genes were similar among all 3 Alishewanella species. Our comparative genomic and transcriptomic data revealed that Alishewanella species have evolved through horizontal gene transfer and habitat differentiation and that pectin degradation pathways in Alishewanella species are highly conserved, although stress responses of each Alishewanella species differed under pectin culture conditions.

  4. Identification of Candidate Adherent-Invasive E. coli Signature Transcripts by Genomic/Transcriptomic Analysis.

    PubMed

    Zhang, Yuanhao; Rowehl, Leahana; Krumsiek, Julia M; Orner, Erika P; Shaikh, Nurmohammad; Tarr, Phillip I; Sodergren, Erica; Weinstock, George M; Boedeker, Edgar C; Xiong, Xuejian; Parkinson, John; Frank, Daniel N; Li, Ellen; Gathungu, Grace

    2015-01-01

    quantitative polymerase chain reaction assays for 6 genes were conducted on fecal and ileal RNA samples from 22 inflammatory bowel disease (IBD), and 32 patients without IBD (non-IBD). The expression of Cas loci was detected in a higher proportion of CD than non-IBD fecal and ileal RNA samples (p <0.05). These results support a comparative genomic/transcriptomic approach towards identifying candidate AIEC signature transcripts.

  5. Identification of Candidate Adherent-Invasive E. coli Signature Transcripts by Genomic/Transcriptomic Analysis

    PubMed Central

    Zhang, Yuanhao; Rowehl, Leahana; Krumsiek, Julia M.; Orner, Erika P.; Shaikh, Nurmohammad; Tarr, Phillip I.; Sodergren, Erica; Weinstock, George M.; Boedeker, Edgar C.; Xiong, Xuejian; Parkinson, John; Frank, Daniel N.; Li, Ellen; Gathungu, Grace

    2015-01-01

    quantitative polymerase chain reaction assays for 6 genes were conducted on fecal and ileal RNA samples from 22 inflammatory bowel disease (IBD), and 32 patients without IBD (non-IBD). The expression of Cas loci was detected in a higher proportion of CD than non-IBD fecal and ileal RNA samples (p <0.05). These results support a comparative genomic/transcriptomic approach towards identifying candidate AIEC signature transcripts. PMID:26125937

  6. Detection of driver protein complexes in breast cancer metastasis by large-scale transcriptome-interactome integration.

    PubMed

    Garcia, Maxime; Finetti, Pascal; Bertucci, Francois; Birnbaum, Daniel; Bidaut, Ghislain

    2014-01-01

    With the development of high-throughput gene expression profiling technologies came the opportunity to define genomic signatures predicting clinical condition or cancer patient outcome. However, such signatures show dependency on the training set, lack of generalization, and instability, partly due to microarray data topology. An additional issue in analyzing tumor gene expression is that subtle molecular perturbations in driver genes leading to cancer and metastasis (masked in typical differential expression analysis) may provoke expression changes of greater amplitude in downstream genes (easily detected). In this chapter, we describe an interactome-based algorithm, Interactome-Transcriptome Integration (ITI), that is used to find a generalizable signature for prediction of breast cancer relapse by superimposition of large-scale protein-protein interaction data (human interactome) over several gene expression datasets. ITI extracts regions in the interactome whose expression is discriminating for predicting relapse-free survival in cancer and allows detection of subnetworks that constitute a generalizable and stable genomic signature. We also describe the practical aspects of running the full ITI pipeline (subnetwork detection and classification) on six microarray datasets.

  7. Analyses of transcriptome sequences reveal multiple ancient large-scale duplication events in the ancestor of Sphagnopsida (Bryophyta).

    PubMed

    Devos, Nicolas; Szövényi, Péter; Weston, David J; Rothfels, Carl J; Johnson, Matthew G; Shaw, A Jonathan

    2016-07-01

    The goal of this research was to investigate whether there has been a whole-genome duplication (WGD) in the ancestry of Sphagnum (peatmoss) or the class Sphagnopsida, and to determine if the timing of any such duplication(s) and patterns of paralog retention could help explain the rapid radiation and current ecological dominance of peatmosses. RNA sequencing (RNA-seq) data were generated for nine taxa in Sphagnopsida (Bryophyta). Analyses of frequency plots for synonymous substitutions per synonymous site (Ks) between paralogous gene pairs and reconciliation of 578 gene trees were conducted to assess evidence of large-scale or genome-wide duplication events in each transcriptome. Both Ks frequency plots and gene tree-based analyses indicate multiple duplication events in the history of the Sphagnopsida. The most recent WGD event predates divergence of Sphagnum from the two other genera of Sphagnopsida. Duplicate retention is highly variable across species, which might be best explained by local adaptation. Our analyses indicate that the last WGD could have been an important factor underlying the diversification of peatmosses and facilitated their rise to ecological dominance in peatlands. The timing of the duplication events and their significance in the evolutionary history of peat mosses are discussed.

  8. Comparative description of ten transcriptomes of newly sequenced invertebrates and efficiency estimation of genomic sampling in non-model taxa

    PubMed Central

    2012-01-01

    Introduction: Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. Results: cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature. Conclusions: We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable

  9. Comparative description of ten transcriptomes of newly sequenced invertebrates and efficiency estimation of genomic sampling in non-model taxa.

    PubMed

    Riesgo, Ana; Andrade, Sónia C S; Sharma, Prashant P; Novo, Marta; Pérez-Porro, Alicia R; Vahtera, Varpu; González, Vanessa L; Kawauchi, Gisele Y; Giribet, Gonzalo

    2012-11-29

    Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature. 
We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable catalogue of annotated genes (or gene

  10. Analysis of Plant Pan-Genomes and Transcriptomes with GET_HOMOLOGUES-EST, a Clustering Solution for Sequences of the Same Species

    PubMed Central

    Contreras-Moreira, Bruno; Cantalapiedra, Carlos P.; García-Pereira, María J.; Gordon, Sean P.; Vogel, John P.; Igartua, Ernesto; Casas, Ana M.; Vinuesa, Pablo

    2017-01-01

    The pan-genome of a species is defined as the union of all the genes and non-coding sequences found in all its individuals. However, constructing a pan-genome for plants with large genomes is daunting both in sequencing cost and the scale of the required computational analysis. A more affordable alternative is to focus on the genic repertoire by using transcriptomic data. Here, the software GET_HOMOLOGUES-EST was benchmarked with genomic and RNA-seq data of 19 Arabidopsis thaliana ecotypes and then applied to the analysis of transcripts from 16 Hordeum vulgare genotypes. The goal was to sample their pan-genomes and classify sequences as core, if detected in all accessions, or accessory, when absent in some of them. The resulting sequence clusters were used to simulate pan-genome growth, and to compile Average Nucleotide Identity matrices that summarize intra-species variation. Although transcripts were found to under-estimate pan-genome size by at least 10%, we concluded that clusters of expressed sequences can recapitulate phylogeny and reproduce two properties observed in A. thaliana gene models: accessory loci show lower expression and higher non-synonymous substitution rates than core genes. Finally, accessory sequences were observed to preferentially encode transposon components in both species, plus disease resistance genes in cultivated barleys, and a variety of protein domains from other families that appear frequently associated with presence/absence variation in the literature. These results demonstrate that pan-genome analyses are useful to explore germplasm diversity. PMID:28261241

  11. Analysis of Plant Pan-Genomes and Transcriptomes with GET_HOMOLOGUES-EST, a Clustering Solution for Sequences of the Same Species.

    PubMed

    Contreras-Moreira, Bruno; Cantalapiedra, Carlos P; García-Pereira, María J; Gordon, Sean P; Vogel, John P; Igartua, Ernesto; Casas, Ana M; Vinuesa, Pablo

    2017-01-01

    The pan-genome of a species is defined as the union of all the genes and non-coding sequences found in all its individuals. However, constructing a pan-genome for plants with large genomes is daunting both in sequencing cost and the scale of the required computational analysis. A more affordable alternative is to focus on the genic repertoire by using transcriptomic data. Here, the software GET_HOMOLOGUES-EST was benchmarked with genomic and RNA-seq data of 19 Arabidopsis thaliana ecotypes and then applied to the analysis of transcripts from 16 Hordeum vulgare genotypes. The goal was to sample their pan-genomes and classify sequences as core, if detected in all accessions, or accessory, when absent in some of them. The resulting sequence clusters were used to simulate pan-genome growth, and to compile Average Nucleotide Identity matrices that summarize intra-species variation. Although transcripts were found to under-estimate pan-genome size by at least 10%, we concluded that clusters of expressed sequences can recapitulate phylogeny and reproduce two properties observed in A. thaliana gene models: accessory loci show lower expression and higher non-synonymous substitution rates than core genes. Finally, accessory sequences were observed to preferentially encode transposon components in both species, plus disease resistance genes in cultivated barleys, and a variety of protein domains from other families that appear frequently associated with presence/absence variation in the literature. These results demonstrate that pan-genome analyses are useful to explore germplasm diversity.
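
    The core/accessory distinction described above (a sequence cluster is core if detected in all accessions, accessory when absent in some of them) can be illustrated with a minimal sketch. This is not the GET_HOMOLOGUES-EST implementation; the function name, the cluster identifiers, and the accession labels below are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set


def classify_clusters(clusters: Dict[str, Set[str]],
                      accessions: Set[str]) -> Dict[str, str]:
    """Label each cluster 'core' if it is present in every accession,
    otherwise 'accessory'. `clusters` maps a cluster id to the set of
    accessions in which at least one member sequence was detected."""
    labels = {}
    for cluster_id, present_in in clusters.items():
        # Core means the full accession set is a subset of the detections.
        labels[cluster_id] = "core" if accessions <= present_in else "accessory"
    return labels


# Hypothetical toy data: three accessions, two sequence clusters.
accessions = {"acc1", "acc2", "acc3"}
clusters = {
    "cluster_a": {"acc1", "acc2", "acc3"},  # detected everywhere
    "cluster_b": {"acc1", "acc3"},          # missing from acc2
}
labels = classify_clusters(clusters, accessions)
print(labels)
```

    With transcript data, absence from an accession may reflect lack of expression rather than lack of the gene, which is consistent with the abstract's note that transcripts under-estimate pan-genome size.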

  12. Genomic analysis of host - Peste des petits ruminants vaccine viral transcriptome uncovers transcription factors modulating immune regulatory pathways.

    PubMed

    Manjunath, Siddappa; Kumar, Gandham Ravi; Mishra, Bishnu Prasad; Mishra, Bina; Sahoo, Aditya Prasad; Joshi, Chaitanya G; Tiwari, Ashok K; Rajak, Kaushal Kishore; Janga, Sarath Chandra

    2015-02-24

    Peste des petits ruminants (PPR) is an acute transboundary viral disease of economic importance, affecting goats and sheep. Mass vaccination programs around the world resulted in the decline of PPR outbreaks. Sungri 96 is a live attenuated vaccine, widely used in Northern India against PPR. This vaccine virus, isolated from goat, works efficiently in both sheep and goat. Global gene expression changes under PPR vaccine virus infection are not yet well defined. Therefore, in this study we investigated the host-vaccine virus interactions by infecting the peripheral blood mononuclear cells isolated from goat with PPRV (Sungri 96 vaccine virus), to quantify the global changes in the transcriptomic signature by RNA-sequencing. The viral genome of the Sungri 96 vaccine virus was assembled from the PPRV-infected transcriptome, confirming the infection and demonstrating the feasibility of building a complete non-host genome from the blood transcriptome. Comparison of the infected transcriptome with the control transcriptome revealed 985 differentially expressed genes. Functional analysis showed enrichment of immune regulatory pathways under PPRV infection. Key genes involved in immune system regulation, spliceosomal and apoptotic pathways were identified to be dysregulated. Network analysis revealed that the protein-protein interaction network among differentially expressed genes is significantly disrupted in the infected state. Several genes encoding transcription factors (TFs) that govern immune regulatory pathways were identified to co-regulate the differentially expressed genes. These data provide insights into the host-PPRV vaccine virus interactome for the first time. Our findings suggested dysregulation of immune regulatory pathways and genes encoding TFs that govern these pathways in response to viral infection.

  13. Large-Scale Transcriptome Analysis in Faba Bean (Vicia faba L.) under Ascochyta fabae Infection

    PubMed Central

    Ocaña, Sara; Seoane, Pedro; Bautista, Rocio; Palomino, Carmen; Claros, Gonzalo M.; Torres, Ana M.; Madrid, Eva

    2015-01-01

    Faba bean is an important food crop worldwide. However, progress in faba bean genomics lags far behind that of model systems due to limited availability of genetic and genomic information. Using the Illumina platform, the faba bean transcriptome from leaves of two lines (29H and Vf136) subjected to Ascochyta fabae infection has been characterized. De novo transcriptome assembly provided a total of 39,185 different transcripts that were functionally annotated, and among these, 13,266 were assigned to gene ontology against Arabidopsis. Quality of the assembly was validated by RT-qPCR amplification of selected differentially expressed transcripts. Comparison of faba bean transcripts with those of better-characterized plant genomes such as Arabidopsis thaliana, Medicago truncatula and Cicer arietinum revealed a sequence similarity of 68.3%, 72.8% and 81.27%, respectively. Moreover, 39,060 single nucleotide polymorphisms (SNPs) and 3,669 InDels were identified for genotyping applications. Mapping of the sequence reads generated onto the assembled transcripts showed that 393 and 457 transcripts were overexpressed in the resistant (29H) and susceptible genotype (Vf136), respectively. Transcripts involved in plant-pathogen interactions such as leucine rich proteins (LRR) or plant growth regulators involved in plant adaptation to abiotic and biotic stresses were found to be differentially expressed in the resistant line. The results reported here represent the most comprehensive transcript database developed so far in faba bean, providing valuable information that could be used to gain insight into the pathways involved in the resistance mechanism against A. fabae and to identify potential resistance genes to be further used in marker assisted selection. PMID:26267359

  14. Large-Scale Transcriptome Analysis in Faba Bean (Vicia faba L.) under Ascochyta fabae Infection.

    PubMed

    Ocaña, Sara; Seoane, Pedro; Bautista, Rocio; Palomino, Carmen; Claros, Gonzalo M; Torres, Ana M; Madrid, Eva

    2015-01-01

    Faba bean is an important food crop worldwide. However, progress in faba bean genomics lags far behind that of model systems due to limited availability of genetic and genomic information. Using the Illumina platform, the faba bean transcriptome from leaves of two lines (29H and Vf136) subjected to Ascochyta fabae infection has been characterized. De novo transcriptome assembly provided a total of 39,185 different transcripts that were functionally annotated, and among these, 13,266 were assigned to gene ontology against Arabidopsis. Quality of the assembly was validated by RT-qPCR amplification of selected differentially expressed transcripts. Comparison of faba bean transcripts with those of better-characterized plant genomes such as Arabidopsis thaliana, Medicago truncatula and Cicer arietinum revealed a sequence similarity of 68.3%, 72.8% and 81.27%, respectively. Moreover, 39,060 single nucleotide polymorphisms (SNPs) and 3,669 InDels were identified for genotyping applications. Mapping of the sequence reads generated onto the assembled transcripts showed that 393 and 457 transcripts were overexpressed in the resistant (29H) and susceptible genotype (Vf136), respectively. Transcripts involved in plant-pathogen interactions such as leucine rich proteins (LRR) or plant growth regulators involved in plant adaptation to abiotic and biotic stresses were found to be differentially expressed in the resistant line. The results reported here represent the most comprehensive transcript database developed so far in faba bean, providing valuable information that could be used to gain insight into the pathways involved in the resistance mechanism against A. fabae and to identify potential resistance genes to be further used in marker assisted selection.

  15. Comparative genomic hybridization and transcriptome analysis with a pan-genome microarray reveal distinctions between JP2 and non-JP2 genotypes of Aggregatibacter actinomycetemcomitans.

    PubMed

    Huang, Y; Kittichotirat, W; Mayer, M P A; Hall, R; Bumgarner, R; Chen, C

    2013-02-01

    It was postulated that the highly virulent JP2 genotype of Aggregatibacter actinomycetemcomitans may possess a constellation of distinct virulence determinants not found in non-JP2 genotypes. This study compared the genome content and the transcriptome of the serotype b JP2 genotype and the closely related serotype b non-JP2 genotype of A. actinomycetemcomitans. A custom-designed pan-genomic microarray of A. actinomycetemcomitans was constructed and validated against a panel of 11 sequenced reference strains. The microarray was subsequently used for comparative genomic hybridization of serotype b strains of JP2 (six strains) and non-JP2 (six strains) genotypes, and for transcriptome analysis of strains of JP2 (three strains) and non-JP2 (two strains). Two JP2-specific and two non-JP2-specific genomic islands were identified. In one instance, distinct genomic islands were found to be inserted into the same locus among strains of different genotypes. Transcriptome analysis identified five operons, including the leukotoxin operon, to have at least two genes with an expression ratio of 2 or greater between genotypes. Two of the differentially expressed operons were members of the membrane-bound nitrate reductase system (nap operon) and the Tol-Pal system of gram-negative bacterial species. This study is the first to demonstrate the differences in the full genome content and gene expression between A. actinomycetemcomitans strains of JP2 and non-JP2 genotypes. The information is essential for designing hypothesis-driven experiments to examine the pathogenic mechanisms of A. actinomycetemcomitans.
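
    The operon-level criterion mentioned above (at least two genes in an operon with an expression ratio of 2 or greater between genotypes) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' analysis code: the function name, gene and operon labels, and expression values are hypothetical, and the ratio is assumed here to count in either direction.

```python
from typing import Dict, List


def differential_operons(expr_a: Dict[str, float],
                         expr_b: Dict[str, float],
                         operons: Dict[str, List[str]],
                         min_ratio: float = 2.0,
                         min_genes: int = 2) -> List[str]:
    """Return operons with at least `min_genes` genes whose expression
    ratio between two genotypes is >= `min_ratio` in either direction."""
    hits = []
    for name, genes in operons.items():
        n_changed = 0
        for gene in genes:
            a, b = expr_a[gene], expr_b[gene]
            if a <= 0 or b <= 0:
                continue  # skip genes without a usable signal
            # Fold change regardless of which genotype is higher.
            if max(a / b, b / a) >= min_ratio:
                n_changed += 1
        if n_changed >= min_genes:
            hits.append(name)
    return hits


# Hypothetical normalized expression values for two genotypes.
expr_jp2 = {"g1": 8.0, "g2": 6.0, "g3": 1.0, "g4": 1.1, "g5": 4.0, "g6": 1.0}
expr_non_jp2 = {"g1": 2.0, "g2": 2.0, "g3": 1.0, "g4": 1.0, "g5": 1.0, "g6": 1.0}
operons = {"operon_1": ["g1", "g2", "g3"], "operon_2": ["g4", "g5", "g6"]}

result = differential_operons(expr_jp2, expr_non_jp2, operons)
print(result)
```

    Requiring two or more changed genes per operon reduces the chance that a single noisy probe flags an entire operon.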

  16. Transcriptome analysis of root response to citrus blight based on the newly assembled Swingle citrumelo draft genome.

    PubMed

    Zhang, Yunzeng; Barthe, Gary; Grosser, Jude W; Wang, Nian

    2016-07-08

    Citrus blight is an overall decline disease of citrus trees that causes serious losses in the citrus industry worldwide. Although it was described more than one hundred years ago, its causal agent remains unknown and its pathophysiology is not well determined, which hampers our understanding of the disease and the design of suitable disease management. In this study, we sequenced and assembled the draft genome for Swingle citrumelo, an important citrus rootstock. The draft genome is approximately 280 Mb, which covers 74% of the estimated Swingle citrumelo genome, and the average coverage is around 15X. The draft genome of Swingle citrumelo enabled us to conduct transcriptome analysis of roots of blight and healthy Swingle citrumelo using RNA-seq. The RNA-seq was reliable, as evidenced by the high consistency between RNA-seq analysis and quantitative reverse transcription PCR results (R² = 0.966). Comparison of the gene expression profiles between blight and healthy root samples revealed the molecular mechanism underlying the characteristic blight phenotypes, including decline, starch accumulation, and drought stress. The JA and ET biosynthesis and signaling pathways showed decreased transcript abundance, whereas SA-mediated defense-related genes showed increased transcript abundance in blight trees, suggesting that an unclassified biotrophic pathogen was involved in this disease. Overall, the Swingle citrumelo draft genome generated in this study will advance our understanding of plant biology and contribute to citrus breeding. Transcriptome analysis of blight and healthy trees deepened our understanding of the pathophysiology of citrus blight.

  17. Simplified DGS procedure for large-scale genome structural study.

    PubMed

    Jung, Yong-Chul; Xu, Jia; Chen, Jun; Kim, Yeong; Winchester, David; Wang, San Ming

    2009-11-01

    Ditag genome scanning (DGS) uses next-generation DNA sequencing to sequence the ends of ditag fragments produced by restriction enzymes. These sequences are compared to known genome sequences to determine their structure. In order to use DGS for large-scale genome structural studies, we have substantially revised the original protocol by replacing the in vivo genomic DNA cloning with in vitro adaptor ligation, eliminating the ditag concatemerization steps, and replacing the 454 sequencer with Solexa or SOLiD sequencers for ditag sequence collection. This revised protocol further increases genome coverage and resolution and allows DGS to be used to analyze multiple genomes simultaneously.

  18. Breast cancer genome and transcriptome integration implicates specific mutational signatures with immune cell infiltration.

    PubMed

    Smid, Marcel; Rodríguez-González, F Germán; Sieuwerts, Anieta M; Salgado, Roberto; Prager-Van der Smissen, Wendy J C; Vlugt-Daane, Michelle van der; van Galen, Anne; Nik-Zainal, Serena; Staaf, Johan; Brinkman, Arie B; van de Vijver, Marc J; Richardson, Andrea L; Fatima, Aquila; Berentsen, Kim; Butler, Adam; Martin, Sancha; Davies, Helen R; Debets, Reno; Gelder, Marion E Meijer-Van; van Deurzen, Carolien H M; MacGrogan, Gaëtan; Van den Eynden, Gert G G M; Purdie, Colin; Thompson, Alastair M; Caldas, Carlos; Span, Paul N; Simpson, Peter T; Lakhani, Sunil R; Van Laere, Steven; Desmedt, Christine; Ringnér, Markus; Tommasi, Stefania; Eyford, Jorunn; Broeks, Annegien; Vincent-Salomon, Anne; Futreal, P Andrew; Knappskog, Stian; King, Tari; Thomas, Gilles; Viari, Alain; Langerød, Anita; Børresen-Dale, Anne-Lise; Birney, Ewan; Stunnenberg, Hendrik G; Stratton, Mike; Foekens, John A; Martens, John W M

    2016-09-26

    A recent comprehensive whole genome analysis of a large breast cancer cohort was used to link known and novel drivers and substitution signatures to the transcriptome of 266 cases. Here, we validate that subtype-specific aberrations show concordant expression changes for genes such as TP53, PIK3CA, PTEN, CCND1 and CDH1. We find that CCND3 expression levels do not correlate with amplification, while increased GATA3 expression in mutant GATA3 cancers suggests GATA3 is an oncogene. In luminal cases the total number of substitutions, irrespective of type, associates with cell cycle gene expression and adverse outcome, whereas the number of mutations of signatures 3 and 13 associates with immune-response specific gene expression, increased numbers of tumour-infiltrating lymphocytes and better outcome. Thus, while earlier reports imply that the sheer number of somatic aberrations could trigger an immune response, our data suggest that substitutions of a particular type are more effective in doing so than others.

  19. Genomic and transcriptomic analyses of the tangerine pathotype of Alternaria alternata in response to oxidative stress

    PubMed Central

    Wang, Mingshuang; Sun, Xuepeng; Yu, Dongliang; Xu, Jianping; Chung, Kuangren; Li, Hongye

    2016-01-01

    The tangerine pathotype of Alternaria alternata produces the A. citri toxin (ACT) and is the causal agent of citrus brown spot that results in significant yield losses worldwide. Both the production of ACT and the ability to detoxify reactive oxygen species (ROS) are required for A. alternata pathogenicity in citrus. In this study, we report the 34.41 Mb genome sequence of strain Z7 of the tangerine pathotype of A. alternata. The host-selective ACT gene cluster in strain Z7 was identified, which included 25 genes, 19 of them not reported previously. Of these, 10 genes were present only in the tangerine pathotype, representing the most likely candidate genes for this pathotype specialization. A transcriptome analysis of the global effects of H2O2 on gene expression revealed 1108 up-regulated and 498 down-regulated genes. Expression of genes encoding catalase, peroxiredoxin, thioredoxin and glutathione was highly induced. Genes encoding several protein families including kinases, transcription factors, transporters, cytochrome P450, ubiquitin and heat shock proteins were found to be associated with adaptation to oxidative stress. Our data not only revealed the molecular basis of ACT biosynthesis but also provided new insights into the potential pathways by which the phytopathogen A. alternata copes with oxidative stress. PMID:27582273

  20. Genome-Wide Transcriptome Profiling of Region-Specific Vulnerability to Oxidative Stress in the Hippocampus

    PubMed Central

    Wang, Xinkun; Pal, Ranu; Chen, Xue-wen; Kumar, Keshava N.; Kim, Ok-Jin; Michaelis, Elias K.

    2007-01-01

    Neurons in the hippocampal CA1 region are particularly sensitive to oxidative stress (OS), whereas those in CA3 are resistant. To uncover mechanisms for selective CA1 vulnerability to OS, we treated organotypic hippocampal slices with duroquinone and compared transcriptional profiles of CA1 vs. CA3 cells at various intervals. Gene Ontology and biological pathway analyses of differentially expressed genes showed that at all time points, CA1 had higher transcriptional activity of stress/inflammatory response, transition metal transport, ferroxidase, and pre-synaptic signaling activity, while CA3 had higher GABA-signaling, postsynaptic, and calcium and potassium channel activity. Real-time PCR and immunoblots confirmed the transcriptome data and the induction of OS by duroquinone in both hippocampal regions. Our functional genomics approach has identified in CA1 cells molecular pathways as well as unique genes, such as guanosine deaminase, lipocalin2, synaptotagmin 4, and latrophilin 2, whose time-dependent induction following the initiation of OS may represent attempts at neurite outgrowth, synaptic recovery, and resistance against OS. PMID:17553663

  1. Breast cancer genome and transcriptome integration implicates specific mutational signatures with immune cell infiltration

    PubMed Central

    Smid, Marcel; Rodríguez-González, F. Germán; Sieuwerts, Anieta M.; Salgado, Roberto; Prager-Van der Smissen, Wendy J. C.; Vlugt-Daane, Michelle van der; van Galen, Anne; Nik-Zainal, Serena; Staaf, Johan; Brinkman, Arie B.; van de Vijver, Marc J.; Richardson, Andrea L.; Fatima, Aquila; Berentsen, Kim; Butler, Adam; Martin, Sancha; Davies, Helen R.; Debets, Reno; Gelder, Marion E. Meijer-Van; van Deurzen, Carolien H. M.; MacGrogan, Gaëtan; Van den Eynden, Gert G. G. M.; Purdie, Colin; Thompson, Alastair M.; Caldas, Carlos; Span, Paul N.; Simpson, Peter T.; Lakhani, Sunil R.; Van Laere, Steven; Desmedt, Christine; Ringnér, Markus; Tommasi, Stefania; Eyford, Jorunn; Broeks, Annegien; Vincent-Salomon, Anne; Futreal, P. Andrew; Knappskog, Stian; King, Tari; Thomas, Gilles; Viari, Alain; Langerød, Anita; Børresen-Dale, Anne-Lise; Birney, Ewan; Stunnenberg, Hendrik G.; Stratton, Mike; Foekens, John A.; Martens, John W. M.

    2016-01-01

    A recent comprehensive whole genome analysis of a large breast cancer cohort was used to link known and novel drivers and substitution signatures to the transcriptome of 266 cases. Here, we validate that subtype-specific aberrations show concordant expression changes for genes such as TP53, PIK3CA, PTEN, CCND1 and CDH1. We find that CCND3 expression levels do not correlate with amplification, while increased GATA3 expression in mutant GATA3 cancers suggests GATA3 is an oncogene. In luminal cases the total number of substitutions, irrespective of type, associates with cell cycle gene expression and adverse outcome, whereas the number of mutations of signatures 3 and 13 associates with immune-response specific gene expression, increased numbers of tumour-infiltrating lymphocytes and better outcome. Thus, while earlier reports imply that the sheer number of somatic aberrations could trigger an immune response, our data suggest that substitutions of a particular type are more effective in doing so than others. PMID:27666519

  2. Linking amyotrophic lateral sclerosis and spinal muscular atrophy through RNA-transcriptome homeostasis: a genomics perspective.

    PubMed

    Gama-Carvalho, Margarida; L Garcia-Vaquero, Marina; R Pinto, Francisco; Besse, Florence; Weis, Joachim; Voigt, Aaron; Schulz, Jörg B; De Las Rivas, Javier

    2017-04-01

    In this review, we present our most recent understanding of key biomolecular processes that underlie two motor neuron degenerative disorders, amyotrophic lateral sclerosis, and spinal muscular atrophy. We focus on the role of four multifunctional proteins involved in RNA metabolism (TDP-43, FUS, SMN, and Senataxin) that play a causal role in these diseases. Recent results have led to a novel scenario of intricate connections between these four proteins, bringing transcriptome homeostasis into the spotlight as a common theme in motor neuron degeneration. We review reported functional and physical interactions between these four proteins, highlighting their common association with nuclear bodies and small nuclear ribonucleoprotein particle biogenesis and function. We discuss how these interactions are turning out to be particularly relevant for the control of transcription and chromatin homeostasis, including the recent identification of an association between SMN and Senataxin required to ensure the resolution of DNA-RNA hybrid formation and proper termination by RNA polymerase II. These connections strongly support the existence of common pathways underlying the spinal muscular atrophy and amyotrophic lateral sclerosis phenotype. We also discuss the potential of genome-wide expression profiling, in particular RNA sequencing derived data, to contribute to unravelling the underlying mechanisms. We provide a review of publicly available datasets that have addressed both diseases using these approaches, and highlight the value of investing in cross-disease studies to promote our understanding of the pathways leading to neurodegeneration.

  3. Genomic and transcriptomic analyses of the tangerine pathotype of Alternaria alternata in response to oxidative stress.

    PubMed

    Wang, Mingshuang; Sun, Xuepeng; Yu, Dongliang; Xu, Jianping; Chung, Kuangren; Li, Hongye

    2016-09-01

    The tangerine pathotype of Alternaria alternata produces the A. citri toxin (ACT) and is the causal agent of citrus brown spot that results in significant yield losses worldwide. Both the production of ACT and the ability to detoxify reactive oxygen species (ROS) are required for A. alternata pathogenicity in citrus. In this study, we report the 34.41 Mb genome sequence of strain Z7 of the tangerine pathotype of A. alternata. The host-selective ACT gene cluster in strain Z7 was identified, which included 25 genes, 19 of them not reported previously. Of these, 10 genes were present only in the tangerine pathotype, representing the most likely candidate genes for this pathotype specialization. A transcriptome analysis of the global effects of H2O2 on gene expression revealed 1108 up-regulated and 498 down-regulated genes. Expression of genes encoding catalase, peroxiredoxin, thioredoxin and glutathione was highly induced. Genes encoding several protein families including kinases, transcription factors, transporters, cytochrome P450, ubiquitin and heat shock proteins were found to be associated with adaptation to oxidative stress. Our data not only revealed the molecular basis of ACT biosynthesis but also provided new insights into the potential pathways by which the phytopathogen A. alternata copes with oxidative stress.

  4. Stepwise Evolution of Coral Biomineralization Revealed with Genome-Wide Proteomics and Transcriptomics

    PubMed Central

    Sawada, Hitoshi; Satoh, Noriyuki

    2016-01-01

    Despite the importance of stony corals in many research fields related to global issues, such as marine ecology, climate change, paleoclimatology, and metazoan evolution, very little is known about the evolutionary origin of coral skeleton formation. In order to investigate the evolution of coral biomineralization, we have identified skeletal organic matrix proteins (SOMPs) in the skeletal proteome of the scleractinian coral, Acropora digitifera, for which large genomic and transcriptomic datasets are available. Scrupulous gene annotation was conducted based on comparisons of functional domain structures among metazoans. We found that SOMPs include not only coral-specific proteins, but also protein families that are widely conserved among cnidarians and other metazoans. We also identified several conserved transmembrane proteins in the skeletal proteome. Gene expression analysis revealed that expression of these conserved genes continues throughout development. Therefore, these genes are involved not only in skeleton formation, but also in basic cellular functions, such as cell-cell interaction and signaling. On the other hand, genes encoding coral-specific proteins, including extracellular matrix domain-containing proteins, galaxins, and acidic proteins, were prominently expressed in post-settlement stages, indicating their role in skeleton formation. Taken together, the process of coral skeleton formation is hypothesized as: 1) formation of initial extracellular matrix between epithelial cells and substrate, employing pre-existing transmembrane proteins; 2) additional extracellular matrix formation using novel proteins that have emerged by domain shuffling and rapid molecular evolution; and 3) calcification controlled by coral-specific SOMPs. PMID:27253604

  5. Genome-wide survey of ds exonization to enrich transcriptomes and proteomes in plants.

    PubMed

    Liu, Li-Yu Daisy; Charng, Yuh-Chyang

    2012-01-01

    Insertion of transposable elements (TEs) into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization which can enrich the complexity of transcriptomes and proteomes. Previously, we performed the first experimental assessment of TE exonization by inserting a Ds element into each intron of the rice epsps gene. Exonization of Ds in plants was biased toward providing splice donor sites from the beginning of the inserted Ds sequence. Additionally, Ds inserted in the reverse direction resulted in a continuous splice donor consensus region by offering 4 donor sites in the same intron. The current study involved genome-wide computational analysis of Ds exonization events in the dicot Arabidopsis thaliana and the monocot Oryza sativa (rice). Up to 71% of the exonized transcripts were putative targets for the nonsense-mediated decay (NMD) pathway. The insertion patterns of Ds and the polymorphic splice donor sites increased the transcripts and subsequent protein isoforms. Protein isoforms contain protein sequence due to unspliced intron-TE region and/or a shift of the reading frame. The number of interior protein isoforms would be twice that of C-terminal isoforms, on average. TE exonization provides a promising way for functional expansion of the plant proteome.

  6. Stepwise Evolution of Coral Biomineralization Revealed with Genome-Wide Proteomics and Transcriptomics.

    PubMed

    Takeuchi, Takeshi; Yamada, Lixy; Shinzato, Chuya; Sawada, Hitoshi; Satoh, Noriyuki

    2016-01-01

    Despite the importance of stony corals in many research fields related to global issues, such as marine ecology, climate change, paleoclimatology, and metazoan evolution, very little is known about the evolutionary origin of coral skeleton formation. In order to investigate the evolution of coral biomineralization, we have identified skeletal organic matrix proteins (SOMPs) in the skeletal proteome of the scleractinian coral, Acropora digitifera, for which large genomic and transcriptomic datasets are available. Scrupulous gene annotation was conducted based on comparisons of functional domain structures among metazoans. We found that SOMPs include not only coral-specific proteins, but also protein families that are widely conserved among cnidarians and other metazoans. We also identified several conserved transmembrane proteins in the skeletal proteome. Gene expression analysis revealed that expression of these conserved genes continues throughout development. Therefore, these genes are involved not only in skeleton formation, but also in basic cellular functions, such as cell-cell interaction and signaling. On the other hand, genes encoding coral-specific proteins, including extracellular matrix domain-containing proteins, galaxins, and acidic proteins, were prominently expressed in post-settlement stages, indicating their role in skeleton formation. Taken together, the process of coral skeleton formation is hypothesized as: 1) formation of initial extracellular matrix between epithelial cells and substrate, employing pre-existing transmembrane proteins; 2) additional extracellular matrix formation using novel proteins that have emerged by domain shuffling and rapid molecular evolution; and 3) calcification controlled by coral-specific SOMPs.

  7. Genomic and transcriptomic heterogeneity of colorectal tumours arising in Lynch syndrome.

    PubMed

    Binder, Hans; Hopp, Lydia; Schweiger, Michal R; Hoffmann, Steve; Jühling, Frank; Kerick, Martin; Timmermann, Bernd; Siebert, Susann; Grimm, Christina; Nersisyan, Lilit; Arakelyan, Arsen; Herberg, Maria; Buske, Peter; Loeffler-Wirth, Henry; Rosolowski, Maciej; Engel, Christoph; Przybilla, Jens; Peifer, Martin; Friedrichs, Nicolaus; Moeslein, Gabriela; Odenthal, Margarete; Hussong, Michelle; Peters, Sophia; Holzapfel, Stefanie; Nattermann, Jacob; Hueneburg, Robert; Schmiegel, Wolff; Royer-Pokora, Brigitte; Aretz, Stefan; Kloth, Michael; Kloor, Matthias; Buettner, Reinhard; Galle, Jörg; Loeffler, Markus

    2017-10-01

    Colorectal cancer (CRC) arising in Lynch syndrome (LS) comprises tumours with constitutional mutations in DNA mismatch repair genes. There is still a lack of whole-genome and transcriptome studies of LS-CRC to address questions about similarities and differences in mutation and gene expression characteristics between LS-CRC and sporadic CRC, about the molecular heterogeneity of LS-CRC, and about specific mechanisms of LS-CRC genesis linked to dysfunctional mismatch repair in LS colonic mucosa and the possible role of immune editing. Here, we provide a first molecular characterization of LS tumours and of matched tumour-distant reference colonic mucosa based on whole-genome DNA-sequencing and RNA-sequencing analyses. Our data support two subgroups of LS-CRCs, G1 and G2, whereby G1 tumours show a higher number of somatic mutations, a higher amount of microsatellite slippage, and a different mutation spectrum. The gene expression phenotypes support this difference. Reference mucosa of G1 shows a strong immune response associated with the expression of HLA and immune checkpoint genes and the invasion of CD4+ T cells. Such an immune response is not observed in LS tumours, G2 reference and normal (non-Lynch) mucosa, and sporadic CRC. We hypothesize that G1 tumours are edited for escape from a highly immunogenic microenvironment via loss of HLA presentation and T-cell exhaustion. In contrast, G2 tumours seem to develop in a less immunogenic microenvironment where tumour-promoting inflammation parallels tumourigenesis. Larger studies on non-neoplastic mucosa tissue of mutation carriers are required to better understand the early phases of emerging tumours. Copyright © 2017 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

  8. Merkel Cell Polyomavirus Exhibits Dominant Control of the Tumor Genome and Transcriptome in Virus-Associated Merkel Cell Carcinoma

    PubMed Central

    Starrett, Gabriel J.; Marcelus, Christina; Cantalupo, Paul G.; Katz, Joshua P.; Cheng, Jingwei; Akagi, Keiko; Thakuria, Manisha; Rabinowits, Guilherme; Wang, Linda C.; Symer, David E.; Pipas, James M.

    2017-01-01

    Merkel cell polyomavirus is the primary etiological agent of the aggressive skin cancer Merkel cell carcinoma (MCC). Recent studies have revealed that UV radiation is the primary mechanism for somatic mutagenesis in nonviral forms of MCC. Here, we analyze the whole transcriptomes and genomes of primary MCC tumors. Our study reveals that virus-associated tumors have minimally altered genomes compared to non-virus-associated tumors, which are dominated by UV-mediated mutations. Although virus-associated tumors contain relatively small mutation burdens, they exhibit a distinct mutation signature with observable transcriptionally biased kataegic events. In addition, viral integration sites overlap focal genome amplifications in virus-associated tumors, suggesting a potential mechanism for these events. Collectively, our studies indicate that Merkel cell polyomavirus is capable of hijacking cellular processes and driving tumorigenesis to the same severity as tens of thousands of somatic genome alterations. PMID:28049147

  9. The Genomics, Epigenomics, and Transcriptomics of HPV-Associated Oropharyngeal Cancer--Understanding the Basis of a Rapidly Evolving Disease.

    PubMed

    Lechner, M; Fenton, T R

    2016-01-01

    Human papillomavirus (HPV) has been shown to represent a major independent risk factor for head and neck squamous cell cancer, in particular for oropharyngeal carcinoma. This type of cancer is rapidly evolving in the Western world, with rising trends particularly in the young, and represents a distinct epidemiological, clinical, and molecular entity. It is the aim of this review to give a detailed description of genomic, epigenomic, transcriptomic, and posttranscriptional changes that underlie the phenotype of this deadly disease. The review will also link these changes and examine what is known about the interactions between the host genome and viral genome, and investigate changes specific for the viral genome. These data are then integrated into an updated model of HPV-induced head and neck carcinogenesis. Copyright © 2016 Elsevier Inc. All rights reserved.

  10. Genomics, transcriptomics and proteomics: enabling insights into social evolution and disease challenges for managed and wild bees.

    PubMed

    Trapp, Judith; McAfee, Alison; Foster, Leonard J

    2017-02-01

    Globally, there are over 20 000 bee species (Hymenoptera: Apoidea: Anthophila) with a host of biologically fascinating characteristics. Although they have long been studied as models for social evolution, recent challenges to bee health (mainly diseases and pesticides) have gathered the attention of both public and research communities. Genome sequences of twelve bee species are now complete or in progress, facilitating the application of additional 'omic technologies. Here, we review recent developments in honey bee and native bee research in the genomic era. We discuss the progress in genome sequencing and functional annotation, followed by the comparative genomics, proteomics and transcriptomics applications that this progress has enabled regarding social evolution and health. Finally, we end with comments on future challenges in the postgenomic era.

  11. Comparative and Transcriptome Analyses Uncover Key Aspects of Coding- and Long Noncoding RNAs in Flatworm Mitochondrial Genomes

    PubMed Central

    Ross, Eric; Blair, David; Guerrero-Hernández, Carlos; Alvarado, Alejandro Sánchez

    2016-01-01

    Exploiting the conservation of various features of mitochondrial genomes has been instrumental in resolving phylogenetic relationships. Despite extensive sequence evidence, it has not previously been possible to conclusively resolve some key aspects of flatworm mitochondrial genomes, including generally conserved traits, such as start codons, noncoding regions, the full complement of tRNAs, and whether ATP8 is, or is not, encoded by this extranuclear genome. In an effort to address these difficulties, we sought to determine the mitochondrial transcriptomes and genomes of sexual and asexual taxa of freshwater triclads, a group previously poorly represented in flatworm mitogenomic studies. We have discovered evidence for an alternative start codon, an extended cox1 gene, a previously undescribed conserved open reading frame, long noncoding RNAs, and a highly conserved gene order across the large evolutionary distances represented within the triclads. Our findings contribute to the expansion and refinement of mitogenomics to address evolutionary issues in this diverse group of animals. PMID:26921295

  12. Transcriptome complexity in cardiac development and diseases--an expanding universe between genome and phenome.

    PubMed

    Gao, Chen; Wang, Yibin

    2014-01-01

    With the advancement of transcriptome profiling by micro-arrays and high-throughput RNA-sequencing, transcriptome complexity and its dynamics are revealed at different levels in cardiovascular development and diseases. In this review, we will highlight the recent progress in our knowledge of cardiovascular transcriptome complexity contributed by RNA splicing, RNA editing and noncoding RNAs. The emerging importance of many of these previously under-explored aspects of gene regulation in cardiovascular development and pathology will be discussed.

  13. Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.

    PubMed

    Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry

    2016-11-01

    Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.

  14. Modeling cancer metabolism on a genome scale.

    PubMed

    Yizhak, Keren; Chaneton, Barbara; Gottlieb, Eyal; Ruppin, Eytan

    2015-06-30

    Cancer cells have fundamentally altered cellular metabolism that is associated with their tumorigenicity and malignancy. In addition to the widely studied Warburg effect, several new key metabolic alterations in cancer have been established over the last decade, leading to the recognition that altered tumor metabolism is one of the hallmarks of cancer. Deciphering the full scope and functional implications of the dysregulated metabolism in cancer requires both the advancement of a variety of omics measurements and the advancement of computational approaches for the analysis and contextualization of the accumulated data. Encouragingly, while the metabolic network is highly interconnected and complex, it is at the same time probably the best characterized cellular network. In what follows, this review discusses the challenges that genome-scale modeling of cancer metabolism has been facing. We survey several recent studies demonstrating the first strides that have been made, testifying to the value of this approach in portraying a network-level view of cancer metabolism and in identifying novel drug targets and biomarkers. Finally, we outline a few new steps that may further advance this field.

  15. Modeling cancer metabolism on a genome scale

    PubMed Central

    Yizhak, Keren; Chaneton, Barbara; Gottlieb, Eyal; Ruppin, Eytan

    2015-01-01

    Cancer cells have fundamentally altered cellular metabolism that is associated with their tumorigenicity and malignancy. In addition to the widely studied Warburg effect, several new key metabolic alterations in cancer have been established over the last decade, leading to the recognition that altered tumor metabolism is one of the hallmarks of cancer. Deciphering the full scope and functional implications of the dysregulated metabolism in cancer requires both the advancement of a variety of omics measurements and the advancement of computational approaches for the analysis and contextualization of the accumulated data. Encouragingly, while the metabolic network is highly interconnected and complex, it is at the same time probably the best characterized cellular network. In what follows, this review discusses the challenges that genome-scale modeling of cancer metabolism has been facing. We survey several recent studies demonstrating the first strides that have been made, testifying to the value of this approach in portraying a network-level view of cancer metabolism and in identifying novel drug targets and biomarkers. Finally, we outline a few new steps that may further advance this field. PMID:26130389

  16. Genome Wide Transcriptome Analysis of Dendritic Cells Identifies Genes with Altered Expression in Psoriasis

    PubMed Central

    Szász, András; Tubak, Vilmos; Kemény, Lajos; Kondorosi, Éva; Nagy, István

    2013-01-01

    Activation of dendritic cells by different pathogens induces the secretion of proinflammatory mediators resulting in local inflammation. Importantly, innate immunity must be properly controlled, as its continuous activation leads to the development of chronic inflammatory diseases such as psoriasis. Lipopolysaccharide (LPS) or peptidoglycan (PGN) induced tolerance, a phenomenon of transient unresponsiveness of cells to repeated or prolonged stimulation, has proved a valuable model for the study of chronic inflammation. Thus, the aim of this study was the identification of the transcriptional diversity of primary human immature dendritic cells (iDCs) upon PGN induced tolerance. Using the SAGE-Seq approach, a tag-based transcriptome sequencing method, we investigated gene expression changes of primary human iDCs upon stimulation or restimulation with Staphylococcus aureus derived PGN, a widely used TLR2 ligand. Based on the expression pattern of the altered genes, we identified non-tolerizeable and tolerizeable genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed marked enrichment of immune-, cell cycle- and apoptosis-related genes. In parallel to the marked induction of proinflammatory mediators, negative feedback regulators of innate immunity, such as TNFAIP3, TNFAIP8, Tyro3 and Mer, are markedly downregulated in tolerant cells. We also demonstrate that the expression pattern of TNFAIP3 and TNFAIP8 is altered in both lesional and non-lesional skin of psoriatic patients. Finally, we show that pretreatment of immature dendritic cells with anti-TNF-α inhibits the expression of IL-6 and CCL1 in tolerant iDCs and partially releases the suppression of TNFAIP8. Our findings suggest that after PGN stimulation/restimulation the host cell utilizes different mechanisms in order to maintain a critical balance between inflammation and tolerance.
Importantly, the transcriptome sequencing of stimulated/restimulated iDCs identified numerous genes with

  17. Genome-wide transcriptome analysis of soybean primary root under varying water-deficit conditions.

    PubMed

    Song, Li; Prince, Silvas; Valliyodan, Babu; Joshi, Trupti; Maldonado dos Santos, Joao V; Wang, Jiaojiao; Lin, Li; Wan, Jinrong; Wang, Yongqin; Xu, Dong; Nguyen, Henry T

    2016-01-15

    Soybean is a major crop that provides an important source of protein and oil to humans and animals, but its production can be dramatically decreased by the occurrence of drought stress. Soybeans can survive drought stress if there is a robust and deep root system at the early vegetative growth stage. However, little is known about the genome-wide molecular mechanisms contributing to soybean root system architecture. This study was performed to gain knowledge on transcriptome changes and related molecular mechanisms contributing to soybean root development under water-limited conditions. The soybean Williams 82 genotype was subjected to very mild stress (VMS), mild stress (MS) and severe stress (SS) conditions, as well as recovery from the severe stress after re-watering (SR). In total, 6,609 genes in the roots showed differential expression patterns in response to different water-deficit stress levels. Genes involved in hormone (Auxin/Ethylene), carbohydrate, and cell wall-related metabolism (XTH/lipid/flavonoids/lignin) pathways were differentially regulated in the soybean root system. Several transcription factors (TFs) regulating root growth and responses under varying water-deficit conditions were identified and the expression patterns of six TFs were found to be common across the stress levels. Further analysis on the whole plant level led to the finding of tissue-specific or water-deficit level-specific regulation of transcription factors. Analysis of the over-represented motifs of different gene groups revealed several new cis-elements associated with different levels of water deficit. The expression patterns of 18 genes were confirmed by quantitative reverse transcription polymerase chain reaction and demonstrated the accuracy and effectiveness of RNA-Seq. The primary root specific transcriptome in soybean can enable a better understanding of the root response to water deficit conditions. The genes detected in root tissues that were associated with

  18. Identification of genes for controlling swine adipose deposition by integrating transcriptome, whole-genome resequencing, and quantitative trait loci data

    PubMed Central

    Xing, Kai; Zhu, Feng; Zhai, LiWei; Chen, ShaoKang; Tan, Zhen; Sun, YangYang; Hou, ZhuoCheng; Wang, ChuDuan

    2016-01-01

    Backfat thickness is strongly associated with meat quality, fattening efficiency, reproductive performance, and immunity in pigs. Fat storage and fatty acid synthesis mainly occur in adipose tissue. Therefore, we used a high-throughput massively parallel sequencing approach to identify transcriptomes in adipose tissue, and whole-genome differences from three full-sibling pairs of pigs with opposite (high and low) backfat thickness phenotypes. We obtained an average of 38.69 million reads for six samples, 78.68% of which were annotated in the reference genome. Eighty-nine overlapping differentially expressed genes were identified among the three pair comparisons. Whole-genome resequencing also detected multiple genetic variations between the pools of DNA from the two groups. Compared with the animal quantitative trait loci (QTL) database, 20 differentially expressed genes were matched to the QTLs associated with fatness in pigs. Our technique of integrating transcriptome, whole-genome resequencing, and QTL database information provided a rich source of important differentially expressed genes and variations. Association analysis between selected SNPs and backfat thickness revealed that two SNPs and one haplotype of ME1 significantly affected fat deposition in pigs. Moreover, genetic analysis confirmed that variations in the differentially expressed genes may affect fat deposition. PMID:26996612
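
    As a rough illustration of the overlap step described in this record (not the authors' pipeline), genes that are differentially expressed in every sibling-pair comparison can be found by set intersection and then matched against a list of QTL-associated genes. The function names, pair labels, and gene identifiers below are hypothetical placeholders, not data from the study.

```python
from typing import Dict, Set


def overlapping_degs(deg_sets: Dict[str, Set[str]]) -> Set[str]:
    """Return genes called differentially expressed in every comparison."""
    sets = list(deg_sets.values())
    if not sets:
        return set()
    # Intersection across all comparisons keeps only the shared genes.
    return set.intersection(*sets)


def match_to_qtl(genes: Set[str], qtl_genes: Set[str]) -> Set[str]:
    """Return the subset of genes that also appear in a QTL gene list."""
    return genes & qtl_genes


# Hypothetical placeholder identifiers for three sibling-pair comparisons.
pairs = {
    "pair1": {"g1", "g2", "g3"},
    "pair2": {"g2", "g3", "g4"},
    "pair3": {"g2", "g3", "g5"},
}
shared = overlapping_degs(pairs)
in_qtl = match_to_qtl(shared, {"g3", "g9"})
print(sorted(shared), sorted(in_qtl))
```

    Requiring a gene to appear in all comparisons is the strictest choice; a looser rule (for example, two of three pairs) would need a counting step instead of a plain intersection.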

  19. Ensembl Genomes 2013: scaling up access to genome-wide data

    USDA-ARS's Scientific Manuscript database

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provi...

  20. Molecular classification of diffuse cerebral WHO grade II/III gliomas using genome- and transcriptome-wide profiling improves stratification of prognostically distinct patient groups.

    PubMed

    Weller, Michael; Weber, Ruthild G; Willscher, Edith; Riehmer, Vera; Hentschel, Bettina; Kreuz, Markus; Felsberg, Jörg; Beyer, Ulrike; Löffler-Wirth, Henry; Kaulich, Kerstin; Steinbach, Joachim P; Hartmann, Christian; Gramatzki, Dorothee; Schramm, Johannes; Westphal, Manfred; Schackert, Gabriele; Simon, Matthias; Martens, Tobias; Boström, Jan; Hagel, Christian; Sabel, Michael; Krex, Dietmar; Tonn, Jörg C; Wick, Wolfgang; Noell, Susan; Schlegel, Uwe; Radlwimmer, Bernhard; Pietsch, Torsten; Loeffler, Markus; von Deimling, Andreas; Binder, Hans; Reifenberger, Guido

    2015-05-01

    Cerebral gliomas of World Health Organization (WHO) grade II and III represent a major challenge in terms of histological classification and clinical management. Here, we asked whether large-scale genomic and transcriptomic profiling improves the definition of prognostically distinct entities. We performed microarray-based genome- and transcriptome-wide analyses of primary tumor samples from a prospective German Glioma Network cohort of 137 patients with cerebral gliomas, including 61 WHO grade II and 76 WHO grade III tumors. Integrative bioinformatic analyses were employed to define molecular subgroups, which were then related to histology, molecular biomarkers, including isocitrate dehydrogenase 1 or 2 (IDH1/2) mutation, 1p/19q co-deletion and telomerase reverse transcriptase (TERT) promoter mutations, and patient outcome. Genomic profiling identified five distinct glioma groups, including three IDH1/2 mutant and two IDH1/2 wild-type groups. Expression profiling revealed evidence for eight transcriptionally different groups (five IDH1/2 mutant, three IDH1/2 wild type), which were only partially linked to the genomic groups. Correlation of DNA-based molecular stratification with clinical outcome allowed the definition of three major prognostic groups with characteristic genomic aberrations. The best prognosis was found in patients with IDH1/2 mutant and 1p/19q co-deleted tumors. Patients with IDH1/2 wild-type gliomas and glioblastoma-like genomic alterations, including gain on chromosome arm 7q (+7q), loss on chromosome arm 10q (-10q), TERT promoter mutation and oncogene amplification, displayed the worst outcome. Intermediate survival was seen in patients with IDH1/2 mutant, but 1p/19q intact, mostly astrocytic gliomas, and in patients with IDH1/2 wild-type gliomas lacking the +7q/-10q genotype and TERT promoter mutation. This molecular subgrouping stratified patients into prognostically distinct groups better than histological classification.
Addition of gene expression

  1. Automated multiplex genome-scale engineering in yeast

    PubMed Central

    Si, Tong; Chao, Ran; Min, Yuhao; Wu, Yuying; Ren, Wen; Zhao, Huimin

    2017-01-01

    Genome-scale engineering is indispensable in understanding and engineering microorganisms, but the current tools are mainly limited to bacterial systems. Here we report an automated platform for multiplex genome-scale engineering in Saccharomyces cerevisiae, an important eukaryotic model and widely used microbial cell factory. Standardized genetic parts encoding overexpression and knockdown mutations of >90% yeast genes are created in a single step from a full-length cDNA library. With the aid of CRISPR-Cas, these genetic parts are iteratively integrated into the repetitive genomic sequences in a modular manner using robotic automation. This system allows functional mapping and multiplex optimization on a genome scale for diverse phenotypes including cellulase expression, isobutanol production, glycerol utilization and acetic acid tolerance, and may greatly accelerate future genome-scale engineering endeavours in yeast. PMID:28469255

  2. Novel Tools for Conservation Genomics: Comparing Two High-Throughput Approaches for SNP Discovery in the Transcriptome of the European Hake

    PubMed Central

    Milano, Ilaria; Babbucci, Massimiliano; Panitz, Frank; Ogden, Rob; Nielsen, Rasmus O.; Taylor, Martin I.; Helyar, Sarah J.; Carvalho, Gary R.; Espiñeira, Montserrat; Atanassova, Miroslava; Tinti, Fausto; Maes, Gregory E.; Patarnello, Tomaso; Bargelloni, Luca

    2011-01-01

    The growing accessibility of genomic resources through next-generation sequencing (NGS) technologies has revolutionized the application of molecular genetic tools to ecology and evolutionary studies in non-model organisms. Here we present the case study of the European hake (Merluccius merluccius), one of the most important demersal resources of European fisheries. Two sequencing platforms, the Roche 454 FLX (454) and the Illumina Genome Analyzer (GAII), were used for single nucleotide polymorphism (SNP) discovery in the hake muscle transcriptome. De novo transcriptome assembly into unique contigs, annotation, and in silico SNP detection were carried out in parallel for 454 and GAII sequence data. High-throughput genotyping using the Illumina GoldenGate assay was performed for validating 1,536 putative SNPs. Validation results were analysed to compare the performances of 454 and GAII methods and to evaluate the role of several variables (e.g. sequencing depth, intron-exon structure, sequence quality and annotation). Despite well-known differences in sequence length and throughput, the two approaches showed similar assay conversion rates (approximately 43%) and percentages of polymorphic loci (67.5% and 63.3% for GAII and 454, respectively). Both NGS platforms therefore proved suitable for large-scale identification of SNPs in transcribed regions of non-model species, although the lack of a reference genome profoundly affects the genotyping success rate. The overall efficiency, however, can be improved using strict quality and filtering criteria for SNP selection (sequence quality, intron-exon structure, target region score). PMID:22132191

  3. Guitar: An R/Bioconductor Package for Gene Annotation Guided Transcriptomic Analysis of RNA-Related Genomic Features.

    PubMed

    Cui, Xiaodong; Wei, Zhen; Zhang, Lin; Liu, Hui; Sun, Lei; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia

    2016-01-01

    Biological features, such as genes and transcription factor binding sites, are often denoted with genome-based coordinates as the genomic features. While genome-based representation is usually very effective in correlating various biological features, it can be tedious to examine the relationship between RNA-related genomic features and the landmarks of RNA transcripts with existing tools due to the difficulty in the conversion between genome-based coordinates and RNA-based coordinates. We developed an open source R/Bioconductor package, Guitar, for sketching the transcriptomic view of RNA-related biological features represented by genome-based coordinates. Internally, the Guitar package extracts the standardized RNA coordinates with respect to the landmarks of RNA transcripts, with which hundreds of millions of RNA-related genomic features can then be efficiently analyzed within minutes. We demonstrated the usage of the Guitar package in analyzing posttranscriptional RNA modifications (5-methylcytosine and N6-methyladenosine) derived from high-throughput sequencing approaches (MeRIP-Seq and RNA BS-Seq) and showed that RNA 5-methylcytosine (m(5)C) is enriched in the 5'UTR. The newly developed Guitar R/Bioconductor package achieved stable performance on the data tested and revealed novel biological insights. It will effectively facilitate the analysis of RNA methylation data and other RNA-related biological features in the future.
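
    The conversion between genome-based and RNA-based coordinates mentioned in this record can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Guitar package's API (Guitar is an R package): exons are given as 1-based inclusive genomic intervals, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Exon = Tuple[int, int]  # 1-based, inclusive (start, end) on the genome


def genome_to_transcript(pos: int, exons: List[Exon],
                         strand: str = "+") -> Optional[int]:
    """Map a genomic position to a 1-based position on the spliced transcript.

    Returns None when the position is intronic or outside the transcript.
    """
    ordered = sorted(exons)
    if strand == "-":
        # On the minus strand the transcript starts at the highest coordinate.
        ordered = ordered[::-1]
    offset = 0  # number of transcript bases in exons already passed
    for start, end in ordered:
        if start <= pos <= end:
            within = pos - start if strand == "+" else end - pos
            return offset + within + 1
        offset += end - start + 1
    return None


def relative_position(pos: int, exons: List[Exon],
                      strand: str = "+") -> Optional[float]:
    """Express the transcript position as a fraction of transcript length."""
    tpos = genome_to_transcript(pos, exons, strand)
    if tpos is None:
        return None
    length = sum(end - start + 1 for start, end in exons)
    return tpos / length


# Hypothetical two-exon transcript, 100 bases per exon.
example_exons = [(100, 199), (300, 399)]
print(genome_to_transcript(300, example_exons, "+"))
print(genome_to_transcript(250, example_exons, "+"))
```

    Scaling positions to a fraction of transcript length is one simple way to compare features across transcripts of different sizes; a landmark-aware scheme would scale the 5'UTR, coding sequence, and 3'UTR separately.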

  4. Genome-wide transcriptome analysis of fluoroquinolone resistance in clinical isolates of Escherichia coli.

    PubMed

    Yamane, Takashi; Enokida, Hideki; Hayami, Hiroshi; Kawahara, Motoshi; Nakagawa, Masayuki

    2012-04-01

    Coincident with their worldwide use, resistance to fluoroquinolones in Escherichia coli has increased. To identify the gene expression profiles underlying fluoroquinolone resistance, we carried out genome-wide transcriptome analysis of fluoroquinolone-sensitive E. coli. Four fluoroquinolone-sensitive E. coli and five fluoroquinolone-resistant E. coli clinical isolates were subjected to complementary deoxyribonucleic acid microarray analysis. Some upregulated genes' expression was verified by real-time polymerase chain reaction using 104 E. coli clinical isolates, and minimum inhibitory concentration tests were carried out by using their transformants. A total of 40 genes were significantly upregulated in fluoroquinolone-resistant E. coli isolates (P < 0.05). The expression of phage shock protein operons, which are involved in biofilm formation, was markedly upregulated in our profile of fluoroquinolone-resistant E. coli. One of the phage shock protein operons, pspC, was significantly upregulated in 50 fluoroquinolone-resistant E. coli isolates (P < 0.0001). The expression of type I fimbriae genes, which are pilus operons involved in biofilm formation, was markedly downregulated in fluoroquinolone-resistant E. coli. Deoxyribonucleic acid adenine methyltransferase (dam), which represses type I fimbriae genes, was significantly upregulated in the clinical fluoroquinolone-resistant E. coli isolates (P = 0.007). We established pspC- and dam-expressing E. coli transformants from fluoroquinolone-sensitive E. coli, and the minimum inhibitory concentration tests showed that the transformants acquired fluoroquinolone resistance, suggesting that upregulation of these genes contributes to acquiring fluoroquinolone resistance. Upregulation of psp operons and of dam, which underlies the downregulation of pilus operons, might be associated with fluoroquinolone resistance in E. coli. © 2011 The Japanese Urological Association.

  5. A genomic and transcriptomic approach to investigate the blue pigment phenotype in Pseudomonas fluorescens.

    PubMed

    Andreani, Nadia Andrea; Carraro, Lisa; Martino, Maria Elena; Fondi, Marco; Fasolato, Luca; Miotto, Giovanni; Magro, Massimiliano; Vianello, Fabio; Cardazzo, Barbara

    2015-11-20

    Pseudomonas fluorescens is a well-known food spoiler, able to cause serious economic losses in the food industry due to its ability to produce many extracellular, and often thermostable, compounds. The most outstanding spoilage events involving P. fluorescens were blue discoloration of several foodstuffs, mainly dairy products. The bacteria involved in such high-profile cases have been identified as belonging to a clearly distinct phylogenetic cluster of the P. fluorescens group. Although the blue pigment has recently been investigated in several studies, the biosynthetic pathway leading to the pigment formation, as well as its chemical nature, remain challenging and unsolved points. In the present paper, genomic and transcriptomic data of 4 P. fluorescens strains (2 blue-pigmenting strains and 2 non-pigmenting strains) were analyzed to evaluate the presence and the expression of blue strain-specific genes. In particular, the pangenome analysis showed the presence in the blue-pigmenting strains of two copies of genes involved in the tryptophan biosynthesis pathway (including trpABCDF). The global expression profiling of blue-pigmenting strains versus non-pigmenting strains showed a general up-regulation of genes involved in iron uptake and a down-regulation of genes involved in primary metabolism. Chromogenic reaction of the blue-pigmenting bacterial cells with Kovac's reagent indicated an indole derivative as the precursor of the blue pigment. Finally, solubility tests and MALDI-TOF mass spectrometry analysis of the isolated pigment suggested that its molecular structure is very probably a hydrophobic indigo analog.

  6. Potential evolution of neurosurgical treatment paradigms for craniopharyngioma based on genomic and transcriptomic characteristics.

    PubMed

    Robinson, Leslie C; Santagata, Sandro; Hankinson, Todd C

    2016-12-01

    The recent genomic and transcriptomic characterization of human craniopharyngiomas has provided important insights into the pathogenesis of these tumors and supports that these tumor types are distinct entities. Critically, the insights provided by these data offer the potential for the introduction of novel therapies and surgical treatment paradigms for these tumors, which are associated with high morbidity rates and morbid conditions. Mutations in the CTNNB1 gene are primary drivers of adamantinomatous craniopharyngioma (ACP) and lead to the accumulation of β-catenin protein in a subset of the nuclei within the neoplastic epithelium of these tumors. Dysregulation of epidermal growth factor receptor (EGFR) and of sonic hedgehog (SHH) signaling in ACP suggest that paracrine oncogenic mechanisms may underlie ACP growth and implicate these signaling pathways as potential targets for therapeutic intervention using directed therapies. Recent work shows that ACP cells have primary cilia, further supporting the potential importance of SHH signaling in the pathogenesis of these tumors. While further preclinical data are needed, directed therapies could defer, or replace, the need for radiation therapy and/or allow for less aggressive surgical interventions. Furthermore, the prospect for reliable control of cystic disease without the need for surgery now exists. Studies of papillary craniopharyngioma (PCP) are more clinically advanced than those for ACP. The vast majority of PCPs harbor the BRAF(v600e) mutation. There are now 2 reports of patients with PCP that had dramatic therapeutic responses to targeted agents. Ongoing clinical and research studies promise to not only advance our understanding of these challenging tumors but to offer new approaches for patient management.

  7. Oxidative Stress and Heat-Shock Responses in Desulfovibrio vulgaris by Genome-Wide Transcriptomic Analysis

    SciTech Connect

    Zhang, Weiwen; Culley, David E.; Hogan, Mike; Vitiritti, Luigi; Brockman, Fred J.

    2006-05-30

    Sulfate-reducing bacteria, such as Desulfovibrio vulgaris, have developed a set of reactions allowing them to survive in environments. To obtain further knowledge of the protective mechanisms employed in D. vulgaris against oxidative stress and heat shock, we performed a genome-wide transcriptomic analysis to determine the cellular responses to both stimuli. The results showed that 130 genes were responsive to oxidative stress, while 427 genes were responsive to heat shock. Functional analyses suggested that the genes regulated were involved in a variety of cellular functions. Metabolic analysis showed that amino acid biosynthetic pathways were induced by both oxidative stress and heat shock treatments, while fatty acid metabolism, purine and cofactor biosynthesis were induced by heat shock only. The rubrerythrin gene (rbR) was upregulated by oxidative stress, suggesting its important role in oxidative resistance, whereas the expression of the rubredoxin oxidoreductase (rbO), superoxide dismutase (sodB) and catalase (katA) genes was not subject to regulation by oxidative stress in D. vulgaris. In addition, the results showed that thioredoxin reductase (trxB) was responsive to oxidative stress, suggesting the thiol-specific redox system might be involved in oxidative protection in D. vulgaris. Comparison of cellular responses to oxidative stress and heat shock allowed the identification of 66 genes that showed a similar drastic response to both environmental stimuli, implying that they might be part of the general stress response (GSR) network in D. vulgaris, which was further supported by the finding of a conserved motif upstream of these common-responsive genes.

  8. Single-molecule real-time transcript sequencing facilitates common wheat genome annotation and grain transcriptome research.

    PubMed

    Dong, Lingli; Liu, Hongfang; Zhang, Juncheng; Yang, Shuangjuan; Kong, Guanyi; Chu, Jeffrey S C; Chen, Nansheng; Wang, Daowen

    2015-12-09

    The large and complex hexaploid genome has greatly hindered genomics studies of common wheat (Triticum aestivum, AABBDD). Here, we investigated transcripts in common wheat developing caryopses using the emerging single-molecule real-time (SMRT) sequencing technology PacBio RSII, and assessed the resultant data for improving common wheat genome annotation and grain transcriptome research. We obtained 197,709 full-length non-chimeric (FLNC) reads, 74.6% of which were estimated to carry a complete open reading frame. A total of 91,881 high-quality FLNC reads were identified and mapped to 16,188 chromosomal loci, corresponding to 13,162 known genes and 3026 new genes not annotated previously. Although some FLNC reads could not be unambiguously mapped to the current draft genome sequence, many of them are likely useful for studying highly similar homoeologous or paralogous loci or for improving chromosomal contig assembly in further research. The 91,881 high-quality FLNC reads represented 22,768 unique transcripts, 9591 of which were newly discovered. We found 180 transcripts each spanning two or three previously annotated adjacent loci, suggesting that they should be merged to form correct gene models. Finally, our data facilitated the identification of 6030 genes differentially regulated during caryopsis development, and full-length transcripts for 72 transcribed gluten gene members that are important for the end-use quality control of common wheat. Our work demonstrated the value of PacBio transcript sequencing for improving common wheat genome annotation through uncovering the loci and full-length transcripts not discovered previously. The resource obtained may aid further structural genomics and grain transcriptome studies of common wheat.

  9. Capturing the response of Clostridium acetobutylicum to chemical stressors using a regulated genome-scale metabolic model

    DOE PAGES

    Dash, Satyakam; Mueller, Thomas J.; Venkataramanan, Keerthi P.; ...

    2014-10-14

    Clostridia are anaerobic Gram-positive Firmicutes containing broad and flexible systems for substrate utilization, which have been used successfully to produce a range of industrial compounds. Clostridium acetobutylicum has been used to produce butanol on an industrial scale through acetone-butanol-ethanol (ABE) fermentation. A genome-scale metabolic (GSM) model is a powerful tool for understanding the metabolic capacities of an organism and developing metabolic engineering strategies for strain development. The integration of stress related specific transcriptomics information with the GSM model provides opportunities for elucidating the focal points of regulation.

  10. Genomic and transcriptomic hallmarks of poorly differentiated and anaplastic thyroid cancers

    PubMed Central

    Ibrahimpasic, Tihana; Boucai, Laura; Shah, Ronak H.; Dogan, Snjezana; Ricarte-Filho, Julio C.; Krishnamoorthy, Gnana P.; Schultz, Nikolaus; Berger, Michael F.; Sander, Chris; Taylor, Barry S.; Ghossein, Ronald; Ganly, Ian; Fagin, James A.

    2016-01-01

    BACKGROUND. Poorly differentiated thyroid cancer (PDTC) and anaplastic thyroid cancer (ATC) are rare and frequently lethal tumors that so far have not been subjected to comprehensive genetic characterization. METHODS. We performed next-generation sequencing of 341 cancer genes from 117 patient-derived PDTCs and ATCs and analyzed the transcriptome of a representative subset of 37 tumors. Results were analyzed in the context of The Cancer Genome Atlas study (TCGA study) of papillary thyroid cancers (PTC). RESULTS. Compared to PDTCs, ATCs had a greater mutation burden, including a higher frequency of mutations in TP53, TERT promoter, PI3K/AKT/mTOR pathway effectors, SWI/SNF subunits, and histone methyltransferases. BRAF and RAS were the predominant drivers and dictated distinct tropism for nodal versus distant metastases in PDTC. RAS and BRAF sharply distinguished between PDTCs defined by the Turin (PDTC-Turin) versus MSKCC (PDTC-MSK) criteria, respectively. Mutations of EIF1AX, a component of the translational preinitiation complex, were markedly enriched in PDTCs and ATCs and had a striking pattern of co-occurrence with RAS mutations. While TERT promoter mutations were rare and subclonal in PTCs, they were clonal and highly prevalent in advanced cancers. Application of the TCGA-derived BRAF-RAS score (a measure of MAPK transcriptional output) revealed a preserved relationship with BRAF/RAS mutation in PDTCs, whereas ATCs were BRAF-like irrespective of driver mutation. CONCLUSIONS. These data support a model of tumorigenesis whereby PDTCs and ATCs arise from well-differentiated tumors through the accumulation of key additional genetic abnormalities, many of which have prognostic and possible therapeutic relevance. The widespread genomic disruptions in ATC compared with PDTC underscore their greater virulence and higher mortality. FUNDING. 
This work was supported in part by NIH grants CA50706, CA72597, P50-CA72012, P30-CA008748, and 5T32-CA160001; the Lefkovsky Family

  11. Transcriptomic and proteomic analyses on the supercooling ability and mining of antifreeze proteins of the Chinese white wax scale insect.

    PubMed

    Yu, Shu-Hui; Yang, Pu; Sun, Tao; Qi, Qian; Wang, Xue-Qing; Chen, Xiao-Ming; Feng, Ying; Liu, Bo-Wen

    2016-06-01

    The Chinese white wax scale insect, Ericerus pela, can survive at extremely low temperatures, and some overwintering individuals exhibit supercooling at temperatures below -30°C. To investigate the deep supercooling ability of E. pela, transcriptomic and proteomic analyses were performed to delineate the major gene and protein families responsible for the deep supercooling ability of overwintering females. Gene Ontology (GO) classification and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis indicated that genes involved in the mitogen-activated protein kinase, calcium, and PI3K-Akt signaling pathways and pathways associated with the biosynthesis of soluble sugars, sugar alcohols and free amino acids were dominant. Proteins responsible for low-temperature stress, such as cold acclimation proteins, glycerol biosynthesis-related enzymes and heat shock proteins (HSPs) were identified. However, no antifreeze proteins (AFPs) were identified through sequence similarity search methods. A random forest approach identified 388 putative AFPs in the proteome. The AFP gene ep-afp was expressed in Escherichia coli, and the expressed protein exhibited a thermal hysteresis activity of 0.97°C, suggesting its potential role in the deep supercooling ability of E. pela.

  12. Genomic analysis and temperature-dependent transcriptome profiles of the rhizosphere originating strain Pseudomonas aeruginosa M18

    PubMed Central

    2011-01-01

    Background: Our previously published reports have described an effective biocontrol agent named Pseudomonas sp. M18, as its 16S rDNA sequence and several regulator genes share homologous sequences with those of P. aeruginosa, but it has several unusual phenotypic features. This study aims to explore its strain-specific genomic features and gene expression patterns at different temperatures. Results: The complete M18 genome is composed of a single chromosome of 6,327,754 base pairs containing 5684 open reading frames. Seven genomic islands, including two novel prophages and five specific non-phage islands, were identified besides the conserved P. aeruginosa core genome. Each prophage contains a putative chitinase coding gene, and prophage II contains a capB gene encoding a putative cold stress protein. The non-phage genomic islands contain genes responsible for pyoluteorin biosynthesis, environmental substance degradation and type I and III restriction-modification systems. Compared with other P. aeruginosa strains, the smallest number (3) of insertion sequences and the largest number (3) of clustered regularly interspaced short palindromic repeats in the M18 genome may contribute to the relative genome stability. Although the M18 genome is most closely related to that of P. aeruginosa strain LESB58, strain M18 is more susceptible to several antimicrobial agents and more easily eliminated in a mouse acute lung infection model than strain LESB58. The whole M18 transcriptomic analysis indicated that 10.6% of the expressed genes are temperature-dependent, with 22 genes up-regulated at 28°C in three non-phage genomic islands and one prophage but none at 37°C. Conclusions: The P. aeruginosa strain M18 has evolved its specific genomic structures and temperature-dependent expression patterns to meet the requirement of its fitness and competitiveness under selective pressures imposed on the strain in the rhizosphere niche. PMID:21884571

  13. Genome Sequence and Transcriptome Analysis of the Radioresistant Bacterium Deinococcus gobiensis: Insights into the Extreme Environmental Adaptations

    PubMed Central

    Zhang, Wei; Lu, Wei; Wang, Jin; Yang, Mingkun; Zhao, Peng; Tang, Ran; Li, Xinna; Hao, Yanhua; Zhou, Zhengfu; Zhan, Yuhua; Yu, Haiying; Teng, Chao; Yan, Yongliang; Ping, Shuzhen; Wang, Yingdian; Lin, Min

    2012-01-01

    The desert is an excellent model for studying evolution under extreme environments. We present here the complete genome and ultraviolet (UV) radiation-induced transcriptome of Deinococcus gobiensis I-0, which was isolated from the cold Gobi desert and shows higher tolerance to gamma radiation and UV light than all other known microorganisms. Nearly half of the genes in the genome encode proteins of unknown function, suggesting that the extreme resistance phenotype may be attributed to unknown genes and pathways. D. gobiensis also contains a surprisingly large number of horizontally acquired genes and predicted mobile elements of different classes, which is indicative of adaptation to extreme environments through genomic plasticity. High-resolution RNA-Seq transcriptome analyses indicated that 30 regulatory proteins, including several well-known regulators and uncharacterized protein kinases, and 13 noncoding RNAs were induced immediately after UV irradiation. Particularly interesting is the UV irradiation induction of the phrB and recB genes involved in photoreactivation and recombinational repair, respectively. These proteins likely include key players in the immediate global transcriptional response to UV irradiation. Our results help to explain the exceptional ability of D. gobiensis to withstand environmental extremes of the Gobi desert, and highlight the metabolic features of this organism that have biotechnological potential. PMID:22470573

  14. Large-Scale Transcriptome Analysis of Two Sugarcane Genotypes Contrasting for Lignin Content

    PubMed Central

    Vicentini, Renato; Bottcher, Alexandra; Brito, Michael dos Santos; dos Santos, Adriana Brombini; Creste, Silvana; Landell, Marcos Guimarães de Andrade; Cesarino, Igor; Mazzafera, Paulo

    2015-01-01

    Sugarcane is an important crop worldwide for sugar and first-generation ethanol production. Recently, the residue of sugarcane mills, named bagasse, has been considered a promising lignocellulosic biomass for producing second-generation ethanol. Lignin is a major factor limiting the use of bagasse and other plant lignocellulosic materials to produce second-generation ethanol. The lignin biosynthesis pathway is a complex network, and changes in the expression of genes of this pathway have in general led to diverse and undesirable impacts on plant structure and physiology. Despite its economic importance, the sugarcane genome has not yet been sequenced. In this study, a high-throughput transcriptome evaluation of two sugarcane genotypes contrasting for lignin content was carried out. We generated a set of 85,151 sugarcane transcripts using RNA-seq and de novo assembly. More than 2,000 transcripts showed differential expression between the genotypes, including several genes involved in the lignin biosynthetic pathway. This information can give valuable knowledge on lignin biosynthesis and its interactions with other metabolic pathways in the complex sugarcane genome. PMID:26241317

  15. A Differential Genome-Wide Transcriptome Analysis: Impact of Cellular Copper on Complex Biological Processes like Aging and Development

    PubMed Central

    Servos, Jörg; Hamann, Andrea; Grimm, Carolin; Osiewacz, Heinz D.

    2012-01-01

    The regulation of cellular copper homeostasis is crucial in biology. Impairments lead to severe dysfunctions and are known to affect aging and development. Previously, a loss-of-function mutation in the gene encoding the copper-sensing and copper-regulated transcription factor GRISEA of the filamentous fungus Podospora anserina was reported to lead to cellular copper depletion and a pleiotropic phenotype with hypopigmentation of the mycelium and the ascospores, affected fertility, and a lifespan increased by approximately 60% compared to the wild type. This phenotype is linked to a switch from a copper-dependent standard respiration to an alternative respiration, leading to a reduced generation of both reactive oxygen species (ROS) and adenosine triphosphate (ATP). We performed a genome-wide comparative transcriptome analysis of a wild-type strain and the copper-depleted grisea mutant. We unambiguously assigned 9,700 sequences of the transcriptome in both strains to the more than 10,600 predicted and annotated open reading frames of the P. anserina genome, indicating 90% coverage of the transcriptome. 4,752 of the transcripts differed significantly in abundance, with 1,156 transcripts differing at least 3-fold. Selected genes were investigated by qRT-PCR analyses. Apart from this general characterization, we analyzed the data with special emphasis on molecular pathways related to the grisea mutation, taking advantage of the available complete genomic sequence of P. anserina. This analysis verified but also corrected conclusions from earlier data obtained by single-gene analysis, identified new candidate factors of the cellular copper homeostasis system, including target genes of transcription factor GRISEA, and provides a rich reference source of quantitative data for further detailed investigations. Overall, the present study demonstrates the importance of systems biology approaches also in cases where mutations in single genes are analyzed to explain the

  16. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae)

    DOE PAGES

    Hovde, Blake T.; Deodato, Chloe R.; Hunsperger, Heather M.; ...

    2015-09-23

    Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (~40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two “red” RuBisCO activases that are shared across many algal lineages. In conclusion, the Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes.

  17. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae).

    PubMed

    Hovde, Blake T; Deodato, Chloe R; Hunsperger, Heather M; Ryken, Scott A; Yost, Will; Jha, Ramesh K; Patterson, Johnathan; Monnat, Raymond J; Barlow, Steven B; Starkenburg, Shawn R; Cattolico, Rose Ann

    2015-01-01

    Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼ 40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two "red" RuBisCO activases that are shared across many algal lineages. The Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes.

  18. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae)

    PubMed Central

    Hovde, Blake T.; Deodato, Chloe R.; Hunsperger, Heather M.; Ryken, Scott A.; Yost, Will; Jha, Ramesh K.; Patterson, Johnathan; Monnat, Raymond J.; Barlow, Steven B.; Starkenburg, Shawn R.; Cattolico, Rose Ann

    2015-01-01

    Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two “red” RuBisCO activases that are shared across many algal lineages. The Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes. PMID:26397803

  19. Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae)

    SciTech Connect

    Hovde, Blake T.; Deodato, Chloe R.; Hunsperger, Heather M.; Ryken, Scott A.; Yost, Will; Jha, Ramesh K.; Patterson, Johnathan; Monnat, Raymond J.; Barlow, Steven B.; Starkenburg, Shawn R.; Cattolico, Rose Ann; Richardson, Paul M.

    2015-09-23

    Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (~40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two “red” RuBisCO activases that are shared across many algal lineages. In conclusion, the Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes.

  20. Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome

    PubMed Central

    Honaas, Loren A.; Wafula, Eric K.; Wickett, Norman J.; Der, Joshua P.; Zhang, Yeting; Edger, Patrick P.; Altman, Naomi S.; Pires, J. Chris; Leebens-Mack, James H.; dePamphilis, Claude W.

    2016-01-01

    Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo assemblies for Arabidopsis thaliana and Oryza sativa transcriptomes provide new insights into the strengths and limitations of transcriptome assembly strategies. We find that the leading assemblers generate reassuringly accurate assemblies for the majority of transcripts. At the same time, we find a propensity for assemblers to fail to fully assemble highly expressed genes. Surprisingly, the incidence of true chimeric assemblies is very low for all assemblers. Normalized libraries are reduced in highly abundant transcripts, but they also lack thousands of low-abundance transcripts. We conclude that the quality of de novo transcriptome assemblies is best assessed through consideration of a combination of metrics: 1) proportion of reads mapping to an assembly, 2) recovery of conserved, widely expressed genes, 3) N50 length statistics, and 4) the total number of unigenes. We provide benchmark Illumina transcriptome data and introduce SCERNA, a broadly applicable modular protocol for de novo assembly improvement. Finally, our de novo assembly of the Arabidopsis leaf transcriptome revealed ~20 putative Arabidopsis genes lacking in the current annotation. PMID:26731733
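
    Of the four metrics listed in this abstract, the N50 length statistic is the easiest to illustrate in a few lines. The following is a minimal sketch using the common definition of N50 (the largest length L such that sequences of length L or longer account for at least half of the assembled bases). The function name and the contig lengths are hypothetical and are not taken from the paper or from SCERNA.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or transcript lengths.

    N50 is the largest length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only.
contig_lengths = [100, 200, 300, 400, 500]
print(n50(contig_lengths))  # 500 + 400 = 900 >= 750, so this prints 400
```

    As the abstract itself cautions, N50 is only one signal: a long N50 can coexist with missing low-abundance transcripts, which is why it is combined with read-mapping rate, recovery of conserved genes, and unigene counts.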

  1. Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome.

    PubMed

    Honaas, Loren A; Wafula, Eric K; Wickett, Norman J; Der, Joshua P; Zhang, Yeting; Edger, Patrick P; Altman, Naomi S; Pires, J Chris; Leebens-Mack, James H; dePamphilis, Claude W

    2016-01-01

    Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo assemblies for Arabidopsis thaliana and Oryza sativa transcriptomes provide new insights into the strengths and limitations of transcriptome assembly strategies. We find that the leading assemblers generate reassuringly accurate assemblies for the majority of transcripts. At the same time, we find a propensity for assemblers to fail to fully assemble highly expressed genes. Surprisingly, the incidence of true chimeric assemblies is very low for all assemblers. Normalized libraries are reduced in highly abundant transcripts, but they also lack thousands of low-abundance transcripts. We conclude that the quality of de novo transcriptome assemblies is best assessed through consideration of a combination of metrics: 1) proportion of reads mapping to an assembly, 2) recovery of conserved, widely expressed genes, 3) N50 length statistics, and 4) the total number of unigenes. We provide benchmark Illumina transcriptome data and introduce SCERNA, a broadly applicable modular protocol for de novo assembly improvement. Finally, our de novo assembly of the Arabidopsis leaf transcriptome revealed ~20 putative Arabidopsis genes lacking in the current annotation.

  2. Genome and Transcriptome Analysis of the Fungal Pathogen Fusarium oxysporum f. sp. cubense Causing Banana Vascular Wilt Disease

    PubMed Central

    Zeng, Huicai; Fan, Dingding; Zhu, Yabin; Feng, Yue; Wang, Guofen; Peng, Chunfang; Jiang, Xuanting; Zhou, Dajie; Ni, Peixiang; Liang, Changcong; Liu, Lei; Wang, Jun; Mao, Chao

    2014-01-01

    Background: The asexual fungus Fusarium oxysporum f. sp. cubense (Foc), which causes vascular wilt disease, is one of the most devastating pathogens of banana (Musa spp.). To understand the molecular underpinning of pathogenicity in Foc, the genomes and transcriptomes of two Foc isolates were sequenced. Methodology/Principal Findings: Genome analysis revealed that the genome structures of race 1 and race 4 isolates were highly syntenic with those of F. oxysporum f. sp. lycopersici strain Fol4287. A large number of putative virulence-associated genes were identified in both Foc genomes, including genes putatively involved in root attachment, cell degradation, detoxification of toxin, transport, secondary metabolite biosynthesis and signal transduction. Importantly, relative to the Foc race 1 isolate (Foc1), the Foc race 4 isolate (Foc4) has evolved with some expanded gene families of transporters and transcription factors for transport of toxins and nutrients that may facilitate its ability to adapt to host environments and contribute to pathogenicity to banana. Transcriptome analysis disclosed a significant difference in transcriptional responses between Foc1 and Foc4 at 48 h post inoculation to the banana ‘Brazil’ in comparison with the vegetative growth stage. Of particular note, more virulence-associated genes were up-regulated in Foc4 than in Foc1. Several signaling pathways, such as the mitogen-activated protein kinase Fmk1-mediated invasion growth pathway, the FGA1-mediated G protein signaling pathway and a pathogenicity-associated two-component system, were activated in Foc4 rather than in Foc1. Together, these differences in gene content and transcriptional response between Foc1 and Foc4 might account for variation in their virulence during infection of the banana variety ‘Brazil’. Conclusions/Significance: Foc genome sequences will help us identify the pathogenicity mechanisms involved in the development of banana vascular wilt disease. These will thus advance

  3. Whole genome sequencing and comparative transcriptome analysis of a novel seawater adapted, salt-resistant rice cultivar - sea rice 86.

    PubMed

    Chen, Risheng; Cheng, Yunfeng; Han, Suying; Van Handel, Ben; Dong, Ling; Li, Xinmin; Xie, Xiaoqing

    2017-08-23

    Rice (Oryza sativa) is critical for human nutrition worldwide. Due to a growing population, cultivars that produce high yields in high salinity soil are of major importance. Here we describe the discovery and molecular characterization of a novel sea water adapted rice strain, Sea Rice 86 (SR86). SR86 can produce nutritious grains when grown in high salinity soil. Compared to a salt resistant rice cultivar, Yanfen 47 (YF47), SR86 grows in environments with up to 3X the salt content, and produces grains with significantly higher nutrient content in 12 measured components, including 2.9X calcium and 20X dietary fiber. Whole genome sequencing demonstrated that SR86 is a relatively ancient indica subspecies, phylogenetically close to the divergence point of the major rice varietals. SR86 has 12 chromosomes with a total genome size of 373,130,791 bps, slightly smaller than other sequenced rice genomes. Via comparison with 3000 rice genomes, we identified 42,359 putative unique, high impact variants in SR86. Transcriptome analysis of SR86 grown under normal and high saline conditions identified a large number of differentially expressed and salt-induced genes. Many of those genes fall into several gene families that have established or suggested roles in salt tolerance, while others represent potentially novel mediators of salt adaptation. Whole genome sequencing and transcriptome analysis of SR86 has laid a foundation for further molecular characterization of several desirable traits in this novel rice cultivar. A number of candidate genes related to salt adaptation identified in this study will be valuable for further functional investigation.

  4. Integrating Transcriptome and Genome Re-Sequencing Data to Identify Key Genes and Mutations Affecting Chicken Eggshell Qualities

    PubMed Central

    Liu, Long; Zheng, Chuan Wei; Wang, De He; Hou, Zhuo Cheng; Ning, Zhong Hua

    2015-01-01

    Eggshell damage leads to economic losses in the egg production industry and is a threat to human health. We examined 49-wk-old Rhode Island White hens (Gallus gallus) that laid eggs having shells with significantly different strengths and thicknesses. We used HiSeq 2000 (Illumina) sequencing to characterize the chicken transcriptome and whole genome to identify the key genes and genetic mutations associated with eggshell calcification. We identified a total of 14,234 genes expressed in the chicken uterus, representing 89% of all annotated chicken genes. A total of 889 differentially expressed genes (DEGs) were identified by comparing low eggshell strength (LES) and normal eggshell strength (NES) genomes. The DEGs are enriched in calcification-related processes, including calcium ion transport and calcium signaling pathways, as revealed by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Some important matrix proteins, such as OC-116, LTF and SPP1, were also expressed differentially between the two groups. A total of 3,671,919 single-nucleotide polymorphisms (SNPs) and 508,035 Indels were detected in protein coding genes by whole-genome re-sequencing, including 1775 non-synonymous variations and 19 frame-shift Indels in DEGs. SNPs and Indels found in this study could be further investigated for eggshell traits. This is the first report to integrate transcriptome and genome re-sequencing to target the genetic variations that decrease eggshell quality. These findings further advance our understanding of eggshell calcification in the chicken uterus. PMID:25974068

  5. Identification of large-scale genomic variation in cancer genomes using in silico reference models

    PubMed Central

    Killcoyne, Sarah; del Sol, Antonio

    2016-01-01

    Identifying large-scale structural variation in cancer genomes continues to be a challenge to researchers. Current methods rely on genome alignments based on a reference that can be a poor fit to highly variant and complex tumor genomes. To address this challenge we developed a method that uses available breakpoint information to generate models of structural variations. We use these models as references to align previously unmapped and discordant reads from a genome. By using these models to align unmapped reads, we show that our method can help to identify large-scale variations that have been previously missed. PMID:26264669

  6. Reference-Free Population Genomics from Next-Generation Transcriptome Data and the Vertebrate–Invertebrate Gap

    PubMed Central

    Glémin, Sylvain; Bierne, Nicolas; Carneiro, Miguel; Nabholz, Benoit; Lourenco, Joao M.; Alves, Paulo C.; Ballenghien, Marion; Faivre, Nicolas; Belkhir, Khalid; Cahais, Vincent; Loire, Etienne; Bernard, Aurélien; Galtier, Nicolas

    2013-01-01

    In animals, the population genomic literature is dominated by two taxa, namely mammals and drosophilids, in which fully sequenced, well-annotated genomes have been available for years. Data from other metazoan phyla are scarce, probably because the vast majority of living species still lack a closely related reference genome. Here we achieve de novo, reference-free population genomic analysis from wild samples in five non-model animal species, based on next-generation sequencing transcriptome data. We introduce a pipeline for cDNA assembly, read mapping, SNP/genotype calling, and data cleaning, with specific focus on the issue of hidden paralogy detection. In two species for which a reference genome is available, similar results were obtained whether the reference was used or not, demonstrating the robustness of our de novo inferences. The population genomic profiles of a hare, a turtle, an oyster, a tunicate, and a termite were found to be intermediate between those of human and Drosophila, indicating that the discordant genomic diversity patterns that have been reported between these two species do not reflect a generalized vertebrate versus invertebrate gap. The genomic average diversity was generally higher in invertebrates than in vertebrates (with the notable exception of termite), in agreement with the notion that population size tends to be larger in the former than in the latter. The non-synonymous to synonymous ratio, however, did not differ significantly between vertebrates and invertebrates, even though it was negatively correlated with genetic diversity within each of the two groups. This study opens promising perspectives regarding genome-wide population analyses of non-model organisms and the influence of population size on non-synonymous versus synonymous diversity. PMID:23593039

  7. Local Adaptation at the Transcriptome Level in Brown Trout: Evidence from Early Life History Temperature Genomic Reaction Norms

    PubMed Central

    Meier, Kristian; Hansen, Michael Møller; Normandeau, Eric; Mensberg, Karen-Lise D.; Frydenberg, Jane; Larsen, Peter Foged; Bekkevold, Dorte; Bernatchez, Louis

    2014-01-01

    Local adaptation and its underlying molecular basis have long been a key focus in evolutionary biology. There has recently been increased interest in the evolutionary role of plasticity and the molecular mechanisms underlying local adaptation. Using transcriptome analysis, we assessed differences in gene expression profiles for three brown trout (Salmo trutta) populations, one resident and two anadromous, experiencing different temperature regimes in the wild. The study was based on an F2 generation raised in a common garden setting. A previous study of the F1 generation revealed different reaction norms and significantly higher QST than FST among populations for two early life-history traits. In the present study we investigated whether genomic reaction norm patterns were also present at the transcriptome level. Eggs from the three populations were incubated at two temperatures (5 and 8 degrees C) representing conditions encountered in the local environments. Global gene expression for fry at the stage of first feeding was analysed using a 32k cDNA microarray. The results revealed differences in gene expression between populations and temperatures and population × temperature interactions, the latter indicating locally adapted reaction norms. Moreover, the reaction norms paralleled those observed previously for early life-history traits. We identified 90 cDNA clones among the genes with an interaction effect that were differently expressed between the ecologically divergent populations. These included genes involved in immune and stress responses. We observed less plasticity in the resident as compared to the anadromous populations, possibly reflecting that the degree of environmental heterogeneity encountered by individuals throughout their life cycle will select for variable levels of phenotypic plasticity at the transcriptome level. Our study demonstrates the usefulness of transcriptome approaches to identify genes with different temperature reaction norms. The

  8. Genome-scale resources for Thermoanaerobacterium saccharolyticum

    DOE PAGES

    Currie, Devin H.; Raman, Babu; Gowen, Christopher M.; ...

    2015-06-26

    Thermoanaerobacterium saccharolyticum is a hemicellulose-degrading thermophilic anaerobe that was previously engineered to produce ethanol at high yield. For this research, a major project was undertaken to develop this organism into an industrial biocatalyst, but the lack of genome information and resources was recognized early on as a key limitation.

  9. Scaling, crumpled wires, and genome packing in virions

    NASA Astrophysics Data System (ADS)

    de Holanda, V. H.; Gomes, M. A. F.

    2016-12-01

    The packing of a genome in virions is a topic of intense current interest in biology and biological physics. The area is dominated by allometric scaling relations that connect, e.g., the length of the encapsulated genome and the size of the corresponding virion capsid. Here we report scaling laws obtained from extensive experiments of packing of a macroscopic wire within rigid three-dimensional spherical and nonspherical cavities that can shed light on the details of the genome packing in virions. We show that these results obtained with crumpled wires are comparable to those from a large compilation of biological data from several classes of virions.
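
    An allometric scaling relation of the kind mentioned in this abstract has the form y = a * x^b, and the exponent b is commonly estimated by a straight-line fit in log-log space. The following is a minimal sketch of that generic technique, not the authors' analysis. The function name is hypothetical, and the data are synthetic values generated from an arbitrarily chosen exponent of 1/3; they are not measurements from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(a) + b * log(x); returns (a, b)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    exponent = sxy / sxx                       # slope in log-log space
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


# Synthetic, noise-free data: y = 2 * x^(1/3). Both constants are arbitrary
# illustrative choices, not values reported in the abstract.
lengths = [1.0, 10.0, 100.0, 1000.0]
sizes = [2.0 * x ** (1.0 / 3.0) for x in lengths]
a, b = fit_power_law(lengths, sizes)
print(f"prefactor={a:.3f}, exponent={b:.3f}")
```

    With real measurements the points scatter around the line, so the fitted exponent should be reported with an uncertainty estimate rather than as an exact value.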

  10. Using Genome-Scale Models to Predict Biological Capabilities

    PubMed Central

    O’Brien, Edward J.; Monk, Jonathan M.; Palsson, Bernhard O.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods at the genome-scale have been under development since the first whole genome sequences appeared in the mid-1990s. A few years ago this approach began to demonstrate the ability to predict a range of cellular functions including cellular growth capabilities on various substrates and the effect of gene knockouts at the genome-scale. Thus, much interest has developed in understanding and applying these methods to areas such as metabolic engineering, antibiotic design, and organismal and enzyme evolution. This primer will get you started. PMID:26000478

  11. Complete genome sequence and transcriptomics analyses reveal pigment biosynthesis and regulatory mechanisms in an industrial strain, Monascus purpureus YY-1.

    PubMed

    Yang, Yue; Liu, Bin; Du, Xinjun; Li, Ping; Liang, Bin; Cheng, Xiaozhen; Du, Liangcheng; Huang, Di; Wang, Lei; Wang, Shuo

    2015-02-09

    Monascus has been used to produce natural colorants and food supplements for more than one thousand years, and more than one billion people eat Monascus-fermented products in their daily lives. In this study, using next-generation sequencing and optical mapping approaches, a 24.1-Mb complete genome of an industrial strain, Monascus purpureus YY-1, was obtained. This genome consists of eight chromosomes and 7,491 genes. Phylogenetic analysis at the genome level provides convincing evidence for the evolutionary position of M. purpureus. We provide the first comprehensive prediction of the biosynthetic pathway for Monascus pigment. Comparative genomic analyses show that the genome of M. purpureus is 13.6-40% smaller than those of closely related filamentous fungi and has undergone significant gene losses, most of which likely occurred during its specialized adaptation to starch-based foods. Comparative transcriptome analysis reveals that carbon starvation stress, resulting from the use of relatively low-quality carbon sources, contributes to the high yield of pigments by repressing central carbon metabolism and augmenting the acetyl-CoA pool. Our work provides important insights into the evolution of this economically important fungus and lays a foundation for future genetic manipulation and engineering of this strain.

  12. LEMONS – A Tool for the Identification of Splice Junctions in Transcriptomes of Organisms Lacking Reference Genomes

    PubMed Central

    Bouskila, Amos; Chorev, Michal; Carmel, Liran; Mishmar, Dan

    2015-01-01

    RNA-seq is becoming a preferred tool for genomics studies of model and non-model organisms. However, DNA-based analysis of organisms lacking sequenced genomes cannot rely on RNA-seq data alone to isolate most genes of interest, as DNA codes both exons and introns. With this in mind, we designed a novel tool, LEMONS, that exploits the evolutionary conservation of both exon/intron boundary positions and splice junction recognition signals to produce high throughput splice-junction predictions in the absence of a reference genome. When tested on multiple annotated vertebrate mRNA data, LEMONS accurately identified 87% (average) of the splice-junctions. LEMONS was then applied to our updated Mediterranean chameleon transcriptome, which lacks a reference genome, and predicted a total of 90,820 exon-exon junctions. We experimentally verified these splice-junction predictions by amplifying and sequencing twenty randomly selected genes from chameleon DNA templates. Exons and introns were detected in 19 of 20 of the positions predicted by LEMONS. To the best of our knowledge, LEMONS is currently the only experimentally verified tool that can accurately predict splice-junctions in organisms that lack a reference genome. PMID:26606265

  13. Impact of a short-term exposure to spaceflight on the phenotype, genome, transcriptome and proteome of Escherichia coli

    NASA Astrophysics Data System (ADS)

    Li, Tianzhi; Chang, De; Xu, Huiwen; Chen, Jiapeng; Su, Longxiang; Guo, Yinghua; Chen, Zhenhong; Wang, Yajuan; Wang, Li; Wang, Junfeng; Fang, Xiangqun; Liu, Changting

    2015-07-01

    Escherichia coli (E. coli) is the most widely applied model organism in current biological science. As a widespread opportunistic pathogen, E. coli can survive not only in symbiosis with humans but also outside the host, which necessitates the evaluation of its response to the space environment. Therefore, to keep humans safe in space, it is necessary to understand how the bacteria respond to this environment. Despite extensive investigations over a few decades, the response of E. coli to the real space environment is still controversial. To better understand the mechanisms by which E. coli overcomes harsh environments such as microgravity in space and to investigate whether these factors may induce pathogenic changes in E. coli that are potentially detrimental to astronauts, we conducted detailed genomic, transcriptomic and proteomic studies on E. coli that experienced 17 days of spaceflight. By comparing two flight strains LCT-EC52 and LCT-EC59 to a control strain LCT-EC106 that was cultured under the same temperature conditions on the ground, we identified metabolism changes, polymorphism changes, differentially expressed genes and proteins in the two flight strains. The flight strains differed from the control in the utilization of more than 30 carbon sources. Two single nucleotide polymorphisms (SNPs) and one deletion were identified in the flight strains. The expression levels of more than 1000 genes were altered in the flight strains. Genes involved in chemotaxis, lipid metabolism and cell motility were expressed differently. Moreover, the two flight strains also differed extensively from each other in terms of metabolism, transcriptome and proteome, indicating the impact of space environment on individual cells is heterogeneous and probably genotype-dependent. This study presents the first systematic profile of E. coli genome, transcriptome and proteome after spaceflight, which helps to elucidate the mechanism that controls the adaptation of microbes to the space

  14. The developing xylem transcriptome and genome-wide analysis of alternative splicing in Populus trichocarpa (black cottonwood) populations

    PubMed Central

    2013-01-01

    Background Alternative splicing (AS) of genes is an efficient means of generating variation in protein structure and function. AS variation has been observed between tissues, cell types, and different treatments in non-woody plants such as Arabidopsis thaliana (Arabidopsis) and rice. However, little is known about AS patterns in wood-forming tissues and how much AS variation exists within plant populations. Results Here we used high-throughput RNA sequencing to analyze the Populus trichocarpa (P. trichocarpa) xylem transcriptome in 20 individuals from different populations across much of its range in western North America. Deep transcriptome sequencing and mapping of reads to the P. trichocarpa reference genome identified a suite of xylem-expressed genes common to all accessions. Our analysis suggests that at least 36% of the xylem-expressed genes in P. trichocarpa are alternatively spliced. Extensive AS was observed in cell-wall biosynthesis related genes such as glycosyl transferases and C2H2 transcription factors. 27902 AS events were documented and most of these events were not conserved across individuals. Differences in isoform-specific read densities indicated that 7% and 13% of AS events showed significant differences between individuals within geographically separated southern and northern populations, a level that is in general agreement with AS variation in human populations. Conclusions This genome-wide analysis of alternative splicing reveals high levels of AS in P. trichocarpa and extensive inter-individual AS variation. We provide the most comprehensive analysis of AS in P. trichocarpa to date, which will serve as a valuable resource for the plant community to study transcriptome complexity and AS regulation during wood formation. PMID:23718132

  15. Comparative sequence analyses of genome and transcriptome reveal novel transcripts and variants in the Asian elephant Elephas maximus.

    PubMed

    Reddy, Puli Chandramouli; Sinha, Ishani; Kelkar, Ashwin; Habib, Farhat; Pradhan, Saurabh J; Sukumar, Raman; Galande, Sanjeev

    2015-12-01

    The Asian elephant Elephas maximus and the African elephant Loxodonta africana, which diverged 5-7 million years ago, exhibit differences in their physiology, behaviour and morphology. A comparative genomics approach would be useful and necessary for evolutionary and functional genetic studies of elephants. We sequenced E. maximus and mapped the reads to L. africana at ~15X coverage. Through comparative sequence analyses, we have identified Asian elephant specific homozygous, non-synonymous single nucleotide variants (SNVs) that map to 1514 protein coding genes, many of which are involved in olfaction. We also present the first report of a high-coverage transcriptome sequence in E. maximus from peripheral blood lymphocytes. We have identified 103 novel protein coding transcripts and 66 long non-coding (lnc)RNAs. We also report the presence of 181 protein domains unique to elephants when compared to other Afrotheria species. Each of these findings can be further investigated to gain a better understanding of functional differences unique to elephant species, as well as those unique to elephantids in comparison with other mammals. This work therefore provides a valuable resource to explore the immense research potential of comparative analyses of transcriptome and genome sequences in the Asian elephant.

  16. Comparative transcriptome assembly and genome-guided profiling for Brettanomyces bruxellensis LAMAP2480 during p-coumaric acid stress

    PubMed Central

    Godoy, Liliana; Vera-Wolf, Patricia; Martinez, Claudio; Ugalde, Juan A.; Ganga, María Angélica

    2016-01-01

    Brettanomyces bruxellensis has been described as the main contaminant yeast in wine production, due to its ability to convert the hydroxycinnamic acids naturally present in the grape phenolic derivatives, into volatile phenols. Currently, there are no studies in B. bruxellensis that explain the resistance mechanisms to hydroxycinnamic acids, and in particular to p-coumaric acid, which is directly involved in alterations to wine. In this work, we performed a transcriptome analysis of B. bruxellensis LAMAP2480 grown in the presence and absence of p-coumaric acid during lag phase. Because of reported genetic variability among B. bruxellensis strains, to complement de novo assembly of the transcripts, we used the high-quality genome of B. bruxellensis AWRI1499, as well as the draft genomes of strains CBS2499 and LAMAP2480. The results from the transcriptome analysis allowed us to propose a model in which the entrance of p-coumaric acid to the cell generates a generalized stress condition, in which the expression of proton pumps and the efflux of toxic compounds are induced. In addition, these mechanisms could be involved in the outflux of nitrogen compounds, such as amino acids, decreasing the overall concentration and triggering the expression of nitrogen metabolism genes. PMID:27678167

  17. Comparative transcriptome assembly and genome-guided profiling for Brettanomyces bruxellensis LAMAP2480 during p-coumaric acid stress.

    PubMed

    Godoy, Liliana; Vera-Wolf, Patricia; Martinez, Claudio; Ugalde, Juan A; Ganga, María Angélica

    2016-09-28

    Brettanomyces bruxellensis has been described as the main contaminant yeast in wine production, due to its ability to convert the hydroxycinnamic acids naturally present in the grape phenolic derivatives, into volatile phenols. Currently, there are no studies in B. bruxellensis that explain the resistance mechanisms to hydroxycinnamic acids, and in particular to p-coumaric acid, which is directly involved in alterations to wine. In this work, we performed a transcriptome analysis of B. bruxellensis LAMAP2480 grown in the presence and absence of p-coumaric acid during lag phase. Because of reported genetic variability among B. bruxellensis strains, to complement de novo assembly of the transcripts, we used the high-quality genome of B. bruxellensis AWRI1499, as well as the draft genomes of strains CBS2499 and LAMAP2480. The results from the transcriptome analysis allowed us to propose a model in which the entrance of p-coumaric acid to the cell generates a generalized stress condition, in which the expression of proton pumps and the efflux of toxic compounds are induced. In addition, these mechanisms could be involved in the outflux of nitrogen compounds, such as amino acids, decreasing the overall concentration and triggering the expression of nitrogen metabolism genes.

  18. Molecular diversification of peptide toxins from the tarantula Haplopelma hainanum (Ornithoctonus hainana) venom based on transcriptomic, peptidomic, and genomic analyses.

    PubMed

    Tang, Xing; Zhang, Yongqun; Hu, Weijun; Xu, Dehong; Tao, Huai; Yang, Xiaoxu; Li, Yan; Jiang, Liping; Liang, Songping

    2010-05-07

    The tarantula Haplopelma hainanum (Ornithoctonus hainana) is a very venomous spider found widely in the hilly areas of Hainan province in southern China. Its venom contains a variety of toxic components with different pharmacological properties. In the present study, we used a venomic strategy for high-throughput identification of tarantula-venom peptides from H. hainanum. This strategy includes three different approaches: (i) transcriptomics, that is, EST-based cloning and PCR-based cloning plus DNA sequencing; (ii) peptidomics, that is, off-line multiple dimensional liquid chromatography coupled with mass spectrometry (MDLC-MS) plus peptide sequencing (direct Edman sequencing and bottom-up mass spectrometric sequencing); (iii) genomics, that is, genomic DNA cloning plus DNA sequencing. About 420 peptide toxins were detected by mass spectrometry, and 272 peptide precursors were deduced from cDNA and genomic DNA sequences. After redundancy removal, 192 mature sequences were identified by three approaches. This is the largest number of peptide toxin sequences identified from a spider species so far. On the basis of precursor sequence identity, peptide toxins from the tarantula H. hainanum venom can be classified into 11 superfamilies (and related families). Our results revealed that gene duplication and focal hypermutation may be responsible for the enormous molecular diversity in spider peptide toxins. The current work is an initial overview for the study of tarantula-venom peptides in parallel transcriptomic, peptidomic, and genomic analyses. It is hoped that this work will also provide an effective guide for high-throughput identification of peptide toxins from other spider species, especially tarantula species.

  19. Genome-scale Analysis of Escherichia coli FNR Reveals Complex Features of Transcription Factor Binding

    PubMed Central

    Myers, Kevin S.; Yan, Huihuang; Ong, Irene M.; Chung, Dongjun; Liang, Kun; Tran, Frances; Keleş, Sündüz; Landick, Robert; Kiley, Patricia J.

    2013-01-01

    FNR is a well-studied global regulator of anaerobiosis, which is widely conserved across bacteria. Despite the importance of FNR and anaerobiosis in microbial lifestyles, the factors that influence its function on a genome-wide scale are poorly understood. Here, we report a functional genomic analysis of FNR action. We find that FNR occupancy at many target sites is strongly influenced by nucleoid-associated proteins (NAPs) that restrict access to many FNR binding sites. At a genome-wide level, only a subset of predicted FNR binding sites were bound under anaerobic fermentative conditions and many appeared to be masked by the NAPs H-NS, IHF and Fis. Similar assays in cells lacking H-NS and its paralog StpA showed increased FNR occupancy at sites bound by H-NS in WT strains, indicating that large regions of the genome are not readily accessible for FNR binding. Genome accessibility may also explain our finding that genome-wide FNR occupancy did not correlate with the match to consensus at binding sites, suggesting that significant variation in ChIP signal was attributable to cross-linking or immunoprecipitation efficiency rather than differences in binding affinities for FNR sites. Correlation of FNR ChIP-seq peaks with transcriptomic data showed that less than half of the FNR-regulated operons could be attributed to direct FNR binding. Conversely, FNR bound some promoters without regulating expression presumably requiring changes in activity of condition-specific transcription factors. Such combinatorial regulation may allow Escherichia coli to respond rapidly to environmental changes and confer an ecological advantage in the anaerobic but nutrient-fluctuating environment of the mammalian gut. PMID:23818864

  20. The OME Framework for genome-scale systems biology

    SciTech Connect

    Palsson, Bernhard O.; Ebrahim, Ali; Federowicz, Steve

    2014-12-19

    The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high-throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals. While many software tools exist for handling genome-scale

  1. Nicotiana attenuata Data Hub (NaDH): an integrative platform for exploring genomic, transcriptomic and metabolomic data in wild tobacco.

    PubMed

    Brockmöller, Thomas; Ling, Zhihao; Li, Dapeng; Gaquerel, Emmanuel; Baldwin, Ian T; Xu, Shuqing

    2017-01-13

    Nicotiana attenuata (coyote tobacco) is an ecological model for studying plant-environment interactions and plant gene function under real-world conditions. During the last decade, large amounts of genomic, transcriptomic and metabolomic data have been generated with this plant, which have provided new insights into how native plants interact with herbivores, pollinators and microbes. However, an integrative and open access platform that allows for the efficient mining of these -omics data remained unavailable until now. We present the Nicotiana attenuata Data Hub (NaDH) as a centralized platform for integrating and visualizing genomic, phylogenomic, transcriptomic and metabolomic data in N. attenuata. The NaDH currently hosts collections of predicted protein coding sequences of 11 plant species, including two recently sequenced Nicotiana species, and their functional annotations, 222 microarray datasets from 10 different experiments, a transcriptomic atlas based on 20 RNA-seq expression profiles and a metabolomic atlas based on 895 metabolite spectra analyzed by mass spectrometry. We implemented several visualization tools, including a modified version of the Electronic Fluorescent Pictograph (eFP) browser, co-expression networks and the Interactive Tree Of Life (iTOL) for studying gene expression divergence among duplicated homologs. In addition, the NaDH allows researchers to query phylogenetic trees of 16,305 gene families and provides tools for analyzing their evolutionary history. Furthermore, we also implemented tools to identify co-expressed genes and metabolites, which can be used for predicting the functions of genes. Using the transcription factor NaMYB8 as an example, we illustrate that the tools and data in NaDH can facilitate identification of candidate genes involved in the biosynthesis of specialized metabolites. The NaDH provides interactive visualization and data analysis tools that integrate the expression and evolutionary history of genes in

  2. The mitochondrial genome and transcriptome of the basal dinoflagellate Hematodinium sp.: character evolution within the highly derived mitochondrial genomes of dinoflagellates.

    PubMed

    Jackson, C J; Gornik, S G; Waller, R F

    2012-01-01

    The sister phyla dinoflagellates and apicomplexans inherited a drastically reduced mitochondrial genome (mitochondrial DNA, mtDNA) containing only three protein-coding (cob, cox1, and cox3) genes and two ribosomal RNA (rRNA) genes. In apicomplexans, single copies of these genes are encoded on the smallest known mtDNA chromosome (6 kb). In dinoflagellates, however, the genome has undergone further substantial modifications, including massive genome amplification and recombination resulting in multiple copies of each gene and gene fragments linked in numerous combinations. Furthermore, protein-encoding genes have lost standard stop codons, trans-splicing of messenger RNAs (mRNAs) is required to generate complete cox3 transcripts, and extensive RNA editing recodes most genes. From taxa investigated to date, it is unclear when many of these unusual dinoflagellate mtDNA characters evolved. To address this question, we investigated the mitochondrial genome and transcriptome character states of the deep branching dinoflagellate Hematodinium sp. Genomic data show that like later-branching dinoflagellates Hematodinium sp. also contains an inflated, heavily recombined genome of multicopy genes and gene fragments. Although stop codons are also lacking for cox1 and cob, cox3 still encodes a conventional stop codon. Extensive editing of mRNAs also occurs in Hematodinium sp. The mtDNA of basal dinoflagellate Hematodinium sp. indicates that much of the mtDNA modification in dinoflagellates occurred early in this lineage, including genome amplification and recombination, and decreased use of standard stop codons. Trans-splicing, on the other hand, occurred after Hematodinium sp. diverged. Only RNA editing presents a nonlinear pattern of evolution in dinoflagellates as this process occurs in Hematodinium sp. but is absent in some later-branching taxa, indicating that this process was either lost in some lineages or developed more than once during the evolution of the highly unusual

  3. Traumatic Brain Injury Induces Genome-Wide Transcriptomic, Methylomic, and Network Perturbations in Brain and Blood Predicting Neurological Disorders.

    PubMed

    Meng, Qingying; Zhuang, Yumei; Ying, Zhe; Agrawal, Rahul; Yang, Xia; Gomez-Pinilla, Fernando

    2017-02-01

    The complexity of the traumatic brain injury (TBI) pathology, particularly concussive injury, is a serious obstacle for diagnosis, treatment, and long-term prognosis. Here we utilize modern systems biology in a rodent model of concussive injury to gain a thorough view of the impact of TBI on fundamental aspects of gene regulation, which have the potential to drive or alter the course of the TBI pathology. TBI perturbed epigenomic programming, transcriptional activities (expression level and alternative splicing), and the organization of genes in networks centered around genes such as Anax2, Ogn, and Fmod. Transcriptomic signatures in the hippocampus are involved in neuronal signaling, metabolism, inflammation, and blood function, and they overlap with those in leukocytes from peripheral blood. The homology between genomic signatures from blood and brain elicited by TBI provides proof of concept information for development of biomarkers of TBI based on composite genomic patterns. By intersecting with human genome-wide association studies, many TBI signature genes and network regulators identified in our rodent model were causally associated with brain disorders with relevant link to TBI. The overall results show that concussive brain injury reprograms genes which could lead to predisposition to neurological and psychiatric disorders, and that genomic information from peripheral leukocytes has the potential to predict TBI pathogenesis in the brain. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  4. Assessment of the Diversity of Dairy Lactococcus lactis subsp. lactis Isolates by an Integrated Approach Combining Phenotypic, Genomic, and Transcriptomic Analyses

    PubMed Central

    Tan-a-ram, Punthip; Cardoso, Tamara; Daveran-Mingot, Marie-Line; Kanchanatawee, Sunthorn; Loubière, Pascal; Girbal, Laurence; Cocaign-Bousquet, Muriel

    2011-01-01

    The intrasubspecies diversity of six strains of Lactococcus lactis subsp. lactis was investigated at the genomic level and in terms of phenotypic and transcriptomic profiles in an ultrafiltration cheese model. The six strains were isolated from various sources, but all exhibited a dairy phenotype (growth in ultrafiltration cheese model and high acidification rate). The six strains exhibited similar behaviors in terms of growth during cheese ripening, while different acidification capabilities were detected. Even if all strains displayed large genomic similarities, sharing a large core genome of almost 2,000 genes, the expression of this core genome directly in the cheese matrix revealed major strain-specific differences that potentially could account for the observed different acidification capabilities. This work demonstrated that significant transcriptomic polymorphisms exist even among Lactococcus lactis subsp. lactis strains with the same dairy origin. PMID:21131529

  5. Large-scale and global features of complex genomic signals

    NASA Astrophysics Data System (ADS)

    Cristea, Paul D.

    2003-10-01

    The paper briefly reviews the methodology of converting symbolic nucleic acid sequences into genomic signals and presents large-scale and global features of the resulting genomic signals. Whole chromosomes or whole genomes are converted into complex signals and phase analysis is performed. The phase, cumulated phase and unwrapped phase of genomic signals are studied as tools for revealing important features of the first- and second-order statistics of nucleotide distribution along DNA strands. It is shown that the unwrapped phase displays an almost linear variation along whole chromosomes. The property holds for all the investigated genomes, being shared by both prokaryotes and eukaryotes, while the magnitude and sign of the unwrapped phase slope is specific for each taxon and chromosome. The comparison between the behavior of the cumulated phase and of the unwrapped phase across the putative origins and termini of the replichores suggests a model of the 'patchy' structure of the chromosomes.
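
The conversion and phase analysis summarized in this record can be sketched as follows. The nucleotide-to-complex mapping below is an assumption made for illustration (one nucleotide per corner of a square in the complex plane) and is not stated in the abstract; the function names are likewise hypothetical. The sketch computes the cumulated phase (a running sum of per-sample phases) and the unwrapped phase (the phase with jumps larger than pi between neighbours removed).

```python
import cmath
import math

# Assumed mapping, for illustration only: each nucleotide is placed on one
# corner of a square in the complex plane.
NUCLEOTIDE_TO_COMPLEX = {
    "A": complex(1, 1),
    "C": complex(-1, -1),
    "G": complex(-1, 1),
    "T": complex(1, -1),
}


def to_signal(sequence):
    """Convert a nucleotide string into a list of complex samples."""
    return [NUCLEOTIDE_TO_COMPLEX[base] for base in sequence.upper()]


def cumulated_phase(signal):
    """Running sum of the phase of each sample, in radians."""
    total = 0.0
    result = []
    for sample in signal:
        total += cmath.phase(sample)
        result.append(total)
    return result


def unwrapped_phase(signal):
    """Phase sequence with jumps larger than pi corrected by multiples of 2*pi."""
    result = []
    offset = 0.0
    previous = None
    for sample in signal:
        phase = cmath.phase(sample)
        if previous is not None:
            jump = phase - previous
            if jump > math.pi:
                offset -= 2 * math.pi
            elif jump < -math.pi:
                offset += 2 * math.pi
        result.append(phase + offset)
        previous = phase
    return result


signal = to_signal("GCTTAG")
print(cumulated_phase(signal))
print(unwrapped_phase(signal))
```

On a real chromosome, plotting either list against sequence position would show the slopes and trend changes that the abstract describes; this sketch only demonstrates the two quantities on a short, made-up sequence.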

  6. Genome-wide comparative transcriptome analysis of CMS-D2 and its maintainer and restorer lines in upland cotton.

    PubMed

    Wu, Jianyong; Zhang, Meng; Zhang, Bingbing; Zhang, Xuexian; Guo, Liping; Qi, Tingxiang; Wang, Hailin; Zhang, Jinfa; Xing, Chaozhu

    2017-06-08

    Cytoplasmic male sterility (CMS) conferred by the cytoplasm from Gossypium harknessii (D2) is an important system for hybrid seed production in Upland cotton (G. hirsutum). The male sterility of CMS-D2 (i.e., A line) can be restored to fertility by a restorer (i.e., R line) carrying the restorer gene Rf1 transferred from the D2 nuclear genome. However, the molecular mechanisms of CMS-D2 and its restoration are poorly understood. In this study, a genome-wide comparative transcriptome analysis was performed to identify differentially expressed genes (DEGs) in flower buds among the isogenic fertile R line and sterile A line derived from a backcross population (BC8F1) and the recurrent parent, i.e., the maintainer (B line). A total of 1464 DEGs were identified among the three isogenic lines, and the Rf1-carrying Chr_D05 and its homeologous Chr_A05 had more DEGs than other chromosomes. The results of GO and KEGG enrichment analysis showed differences in circadian rhythm between the fertile and sterile lines. Eleven DEGs were selected for validation using qRT-PCR, confirming the accuracy of the RNA-seq results. Through genome-wide comparative transcriptome analysis, the differential expression profiles of CMS-D2 and its maintainer and restorer lines in Upland cotton were identified. Our results provide an important foundation for further studies into the molecular mechanisms of the interactions between the restorer gene Rf1 and the CMS-D2 cytoplasm.

  7. Comparative genomic and transcriptomic analysis revealed genetic characteristics related to solvent formation and xylose utilization in Clostridium acetobutylicum EA 2018

    PubMed Central

    2011-01-01

    Background Clostridium acetobutylicum, a gram-positive and spore-forming anaerobe, is a major strain for the fermentative production of acetone, butanol and ethanol. But a previously isolated hyper-butanol producing strain C. acetobutylicum EA 2018 does not produce spores and has greater capability of solvent production, especially for butanol, than the type strain C. acetobutylicum ATCC 824. Results Complete genome of C. acetobutylicum EA 2018 was sequenced using Roche 454 pyrosequencing. Genomic comparison with ATCC 824 identified many variations which may contribute to the hyper-butanol producing characteristics in the EA 2018 strain, including a total of 46 deletion sites and 26 insertion sites. In addition, transcriptomic profiling of gene expression in EA 2018 relative to that of ATCC824 revealed expression-level changes of several key genes related to solvent formation. For example, spo0A and adhEII have higher expression level, and most of the acid formation related genes have lower expression level in EA 2018. Interestingly, the results also showed that the variation in CEA_G2622 (CAC2613 in ATCC 824), a putative transcriptional regulator involved in xylose utilization, might accelerate utilization of substrate xylose. Conclusions Comparative analysis of C. acetobutylicum hyper-butanol producing strain EA 2018 and type strain ATCC 824 at both genomic and transcriptomic levels, for the first time, provides molecular-level understanding of non-sporulation, higher solvent production and enhanced xylose utilization in the mutant EA 2018. The information could be valuable for further genetic modification of C. acetobutylicum for more effective butanol production. PMID:21284892

  8. Reconstructing genome-scale metabolic models with merlin.

    PubMed

    Dias, Oscar; Rocha, Miguel; Ferreira, Eugénio C; Rocha, Isabel

    2015-04-30

    The Metabolic Models Reconstruction Using Genome-Scale Information (merlin) tool is a user-friendly Java application that aids the reconstruction of genome-scale metabolic models for any organism that has its genome sequenced. It performs the major steps of the reconstruction process, including the functional genomic annotation of the whole genome and subsequent construction of the portfolio of reactions. Moreover, merlin includes tools for the identification and annotation of genes encoding transport proteins, generating the transport reactions for those carriers. It also performs the compartmentalisation of the model, predicting the organelle localisation of the proteins encoded in the genome and thus the localisation of the metabolites involved in the reactions promoted by such enzymes. The gene-proteins-reactions (GPR) associations are automatically generated and included in the model. Finally, merlin expedites the transition from genomic data to draft metabolic models reconstructions exported in the SBML standard format, allowing the user to have a preliminary view of the biochemical network, which can be manually curated within the environment provided by merlin.
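
To make the gene-proteins-reactions (GPR) associations mentioned in this record more concrete, here is a minimal sketch of how such a rule can be evaluated against a set of inactive genes. It is not merlin's implementation or API (merlin is a Java application); the rule syntax with `and`/`or` and parentheses, the gene names, and the function name are assumptions made for illustration.

```python
import re


def reaction_is_active(gpr_rule, inactive_genes):
    """Evaluate a GPR rule such as '(g1 and g2) or g3'.

    'and' joins subunits of one enzyme complex (all are required);
    'or' joins isozymes (any one is sufficient).
    A reaction with an empty rule is treated as always active.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", gpr_rule)
    if not tokens:
        return True
    parts = []
    for token in tokens:
        if token in ("(", ")"):
            parts.append(token)
        elif token.lower() in ("and", "or"):
            parts.append(token.lower())
        else:
            # Replace each gene identifier by its truth value.
            parts.append(str(token not in inactive_genes))
    # The expression now contains only True/False, and/or and parentheses.
    return eval(" ".join(parts), {"__builtins__": {}}, {})


rule = "(geneA and geneB) or geneC"  # hypothetical gene identifiers
print(reaction_is_active(rule, set()))
print(reaction_is_active(rule, {"geneA"}))
print(reaction_is_active(rule, {"geneA", "geneC"}))
```

Evaluating rules this way is what allows a model to translate a gene deletion into the set of reactions that can no longer carry flux.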

  9. Reconstructing genome-scale metabolic models with merlin

    PubMed Central

    Dias, Oscar; Rocha, Miguel; Ferreira, Eugénio C.; Rocha, Isabel

    2015-01-01

    The Metabolic Models Reconstruction Using Genome-Scale Information (merlin) tool is a user-friendly Java application that aids the reconstruction of genome-scale metabolic models for any organism that has its genome sequenced. It performs the major steps of the reconstruction process, including the functional genomic annotation of the whole genome and subsequent construction of the portfolio of reactions. Moreover, merlin includes tools for the identification and annotation of genes encoding transport proteins, generating the transport reactions for those carriers. It also performs the compartmentalisation of the model, predicting the organelle localisation of the proteins encoded in the genome and thus the localisation of the metabolites involved in the reactions promoted by such enzymes. The gene-proteins-reactions (GPR) associations are automatically generated and included in the model. Finally, merlin expedites the transition from genomic data to draft metabolic models reconstructions exported in the SBML standard format, allowing the user to have a preliminary view of the biochemical network, which can be manually curated within the environment provided by merlin. PMID:25845595

  10. Genomic and Transcriptomic Studies of an RDX (Hexahydro-1,3,5-Trinitro-1,3,5-Triazine)-Degrading Actinobacterium

    PubMed Central

    Chen, Hao-Ping; Zhu, Song-Hua; Casabon, Israël; Hallam, Steven J.; Crocker, Fiona H.; Mohn, William W.

    2012-01-01

    Whole-genome sequencing, transcriptomic analyses, and metabolic reconstruction were used to investigate Gordonia sp. strain KTR9's ability to catabolize a range of compounds, including explosives and steroids. Aspects of this mycolic acid-containing actinobacterium's catabolic potential were experimentally verified and compared with those of rhodococci and mycobacteria. PMID:22923396

  11. Characterization of the genome and transcriptome of the blue tit Cyanistes caeruleus: polymorphisms, sex-biased expression and selection signals.

    PubMed

    Mueller, Jakob C; Kuhl, Heiner; Timmermann, Bernd; Kempenaers, Bart

    2016-03-01

    Decoding genomic sequences and determining their variation within populations has potential to reveal adaptive processes and unravel the genetic basis of ecologically relevant trait variation within a species. The blue tit Cyanistes caeruleus--a long-time ecological model species--has been used to investigate fitness consequences of variation in mating and reproductive behaviour. However, very little is known about the underlying genetic changes due to natural and sexual selection in the genome of this songbird. As a step to bridge this gap, we assembled the first draft genome of a single blue tit, mapped the transcriptome of five females and five males to this reference, identified genomewide variants and performed sex-differential expression analysis in the gonads, brain and other tissues. In the gonads, we found a high number of sex-biased genes, and of those, a similar proportion were sex-limited (genes only expressed in one sex) in males and females. However, in the brain, the proportion of female-limited genes within the female-biased gene category (82%) was substantially higher than the proportion of male-limited genes within the male-biased category (6%). This suggests a predominant on-off switching mechanism for the female-limited genes. In addition, most male-biased genes were located on the Z-chromosome, indicating incomplete dosage compensation for the male-biased genes. We called more than 500,000 SNPs from the RNA-seq data. Heterozygote detection in the single reference individual was highly congruent between DNA-seq and RNA-seq calling. Using information from these polymorphisms, we identified potential selection signals in the genome. We list candidate genes which can be used for further sequencing and detailed selection studies, including genes potentially related to meiotic drive evolution. A public genome browser of the blue tit with the described information is available at http://public-genomes-ngs.molgen.mpg.de. 
© 2015 John Wiley & Sons Ltd.

  12. Genomic and Transcriptomic Associations Identify a New Insecticide Resistance Phenotype for the Selective Sweep at the Cyp6g1 Locus of Drosophila melanogaster

    PubMed Central

    Battlay, Paul; Schmidt, Joshua M.; Fournier-Level, Alexandre; Robin, Charles

    2016-01-01

    Scans of the Drosophila melanogaster genome have identified organophosphate resistance loci among those with the most pronounced signature of positive selection. In this study, the molecular basis of resistance to the organophosphate insecticide azinphos-methyl was investigated using the Drosophila Genetic Reference Panel, and genome-wide association. Recently released full transcriptome data were used to extend the utility of the Drosophila Genetic Reference Panel resource beyond traditional genome-wide association studies to allow systems genetics analyses of phenotypes. We found that both genomic and transcriptomic associations independently identified Cyp6g1, a gene involved in resistance to DDT and neonicotinoid insecticides, as the top candidate for azinphos-methyl resistance. This was verified by transgenically overexpressing Cyp6g1 using natural regulatory elements from a resistant allele, resulting in a 6.5-fold increase in resistance. We also identified four novel candidate genes associated with azinphos-methyl resistance, all of which are involved in either regulation of fat storage, or nervous system development. In Cyp6g1, we find a demonstrable resistance locus, a verification that transcriptome data can be used to identify variants associated with insecticide resistance, and an overlap between peaks of a genome-wide association study, and a genome-wide selective sweep analysis. PMID:27317781

  13. Discovery of Genes Related to Insecticide Resistance in Bactrocera dorsalis by Functional Genomic Analysis of a De Novo Assembled Transcriptome

    PubMed Central

    Hsu, Ju-Chun; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S.; Chen, Chien-Yu

    2012-01-01

    Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also analyzed to

  14. Territorial Polymers and Large Scale Genome Organization

    NASA Astrophysics Data System (ADS)

    Grosberg, Alexander

    2012-02-01

    The chromatin fiber in the interphase nucleus is effectively a very long polymer packed in a restricted volume. Although polymer models of chromatin organization have been considered, most of them disregard the fact that DNA must not become too entangled in order to function properly. One polymer model with no entanglements is the melt of unknotted, unconcatenated rings. Extensive simulations indicate that rings in the melt at large length (monomer number) N approach the compact state, with the gyration radius scaling as N^(1/3), suggesting that every ring is compact and segregated from the surrounding rings. The segregation is consistent with the known phenomenon of chromosome territories. The surface exponent β (describing the number of contacts between neighboring rings, which scales as N^β) appears to be only slightly below unity, β ≈ 0.95. This suggests that the loop factor (the probability that two monomers a linear distance s apart meet) should decay as s^(-γ), where γ = 2 - β is slightly above one. The latter result is consistent with Hi-C data on real human interphase chromosomes and does not contradict the older FISH data. The dynamics of rings in the melt indicates that the motion of one ring remains subdiffusive on time scales well above the stress relaxation time.
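
    The scaling relations quoted in this abstract can be collected in one place. The symbols R_g (gyration radius), n_contacts (number of contacts between neighboring rings), and P(s) (loop factor) are notation assumed here for illustration; the value of γ follows only from the relation γ = 2 - β stated above, taking β to be about 0.95.

```latex
\begin{align}
R_g(N) &\sim N^{1/3}, \\
n_{\text{contacts}}(N) &\sim N^{\beta}, \qquad \beta \approx 0.95, \\
P(s) &\sim s^{-\gamma}, \qquad \gamma = 2 - \beta \approx 2 - 0.95 = 1.05 .
\end{align}
```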

  15. Genome-scale metabolic network reconstruction.

    PubMed

    Fondi, Marco; Liò, Pietro

    2015-01-01

    Bacterial metabolism is an important source of novel products/processes for everyday life and strong efforts are being undertaken to discover and exploit new usable substances of microbial origin. Computational modeling and in silico simulations are powerful tools in this context since they allow the exploration and a deeper understanding of bacterial metabolic circuits. Many approaches exist to quantitatively simulate chemical reaction fluxes within the whole microbial metabolism and, regardless of the technique of choice, metabolic model reconstruction is the first step in every modeling pipeline. Reconstructing a metabolic network consists in drafting the list of the biochemical reactions that an organism can carry out together with information on cellular boundaries, a biomass assembly reaction, and exchange fluxes with the external environment. Building up models able to represent the different functional cellular states is universally recognized as a tricky task that requires intensive manual effort and much additional information besides genome sequence. In this chapter we present a general protocol for metabolic reconstruction in bacteria and the main challenges encountered during this process.
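
    The ingredients listed in this abstract (a reaction list, exchange fluxes with the environment, and a biomass assembly reaction) can be sketched for a toy network as follows. This is a minimal illustration under stated assumptions, not the protocol from the chapter: the metabolites A and B, the three reaction names, and the helper functions are all hypothetical.

```python
# Toy network (hypothetical): each reaction maps metabolite -> stoichiometric coefficient.
REACTIONS = {
    "EX_A": {"A": 1.0},            # exchange flux: uptake of A from the environment
    "R1": {"A": -1.0, "B": 1.0},   # internal reaction: A -> B
    "BIOMASS": {"B": -1.0},        # biomass assembly reaction: drains B
}


def stoichiometric_matrix(reactions):
    """Build the stoichiometric matrix (rows: metabolites, columns: reactions)."""
    metabolites = sorted({m for stoich in reactions.values() for m in stoich})
    names = list(reactions)
    matrix = [[reactions[name].get(m, 0.0) for name in names] for m in metabolites]
    return metabolites, names, matrix


def mass_balance(reactions, fluxes):
    """Net production rate of each metabolite for the given reaction fluxes."""
    metabolites, names, matrix = stoichiometric_matrix(reactions)
    return {
        m: sum(coeff * fluxes[name] for coeff, name in zip(row, names))
        for m, row in zip(metabolites, matrix)
    }
```

    In this sketch a flux distribution is at steady state when every entry returned by `mass_balance` is zero, which holds here when all three fluxes are equal.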

  16. Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut.

    PubMed

    Armero, Alix; Baudouin, Luc; Bocs, Stéphanie; This, Dominique

    2017-01-01

    The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of these proteins deriving from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/).

  17. Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut

    PubMed Central

    Armero, Alix; Bocs, Stéphanie; This, Dominique

    2017-01-01

    The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of these proteins deriving from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/). PMID:28334050

  18. High-confidence coding and noncoding transcriptome maps

    PubMed Central

    2017-01-01

    The advent of high-throughput RNA sequencing (RNA-seq) has led to the discovery of unprecedentedly immense transcriptomes encoded by eukaryotic genomes. However, the transcriptome maps are still incomplete partly because they were mostly reconstructed based on RNA-seq reads that lack their orientations (known as unstranded reads) and certain boundary information. Methods to expand the usability of unstranded RNA-seq data by predetermining the orientation of the reads and precisely determining the boundaries of assembled transcripts could significantly benefit the quality of the resulting transcriptome maps. Here, we present a high-performing transcriptome assembly pipeline, called CAFE, that significantly improves the original assemblies, respectively assembled with stranded and/or unstranded RNA-seq data, by orienting unstranded reads using the maximum likelihood estimation and by integrating information about transcription start sites and cleavage and polyadenylation sites. Applying large-scale transcriptomic data comprising 230 billion RNA-seq reads from the ENCODE, Human BodyMap 2.0, The Cancer Genome Atlas, and GTEx projects, CAFE enabled us to predict the directions of about 220 billion unstranded reads, which led to the construction of more accurate transcriptome maps, comparable to the manually curated map, and a comprehensive lncRNA catalog that includes thousands of novel lncRNAs. Our pipeline should not only help to build comprehensive, precise transcriptome maps from complex genomes but also to expand the universe of noncoding genomes. PMID:28396519

  19. Genome-Wide Host-Pathogen Interaction Unveiled by Transcriptomic Response of Diamondback Moth to Fungal Infection.

    PubMed

    Chu, Zhen-Jian; Wang, Yu-Jun; Ying, Sheng-Hua; Wang, Xiao-Wei; Feng, Ming-Guang

    2016-01-01

    Genome-wide insight into insect pest response to the infection of Beauveria bassiana (fungal insect pathogen) is critical for genetic improvement of fungal insecticides but has been poorly explored. We constructed three pairs of transcriptomes of Plutella xylostella larvae at 24, 36 and 48 hours post treatment of infection (hptI) and of control (hptC) for insight into the host-pathogen interaction at genomic level. There were 2143, 3200 and 2967 host genes differentially expressed at 24, 36 and 48 hptI/hptC respectively. These infection-responsive genes (~15% of the host genome) were enriched in various immune processes, such as complement and coagulation cascades, protein digestion and absorption, and drug metabolism-cytochrome P450. Fungal penetration into cuticle and host defense reaction began at 24 hptI, followed by most intensive host immune response at 36 hptI and attenuated immunity at 48 hptI. Contrastingly, 44% of fungal genes were differentially expressed in the infection course and enriched in several biological processes, such as antioxidant activity, peroxidase activity and proteolysis. There were 1636 fungal genes co-expressed during 24-48 hptI, including 116 encoding putative secretion proteins. Our results provide novel insights into the insect-pathogen interaction and help to probe molecular mechanisms involved in the fungal infection to the global pest.

  20. Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen

    PubMed Central

    DiGuistini, Scott; Wang, Ye; Liao, Nancy Y.; Taylor, Greg; Tanguay, Philippe; Feau, Nicolas; Henrissat, Bernard; Chan, Simon K.; Hesse-Orce, Uljana; Alamouti, Sepideh Massoumi; Tsui, Clement K. M.; Docking, Roderick T.; Levasseur, Anthony; Haridas, Sajeet; Robertson, Gordon; Birol, Inanc; Holt, Robert A.; Marra, Marco A.; Hamelin, Richard C.; Hirst, Martin; Jones, Steven J. M.; Bohlmann, Jörg; Breuil, Colette

    2011-01-01

    In western North America, the current outbreak of the mountain pine beetle (MPB) and its microbial associates has destroyed wide areas of lodgepole pine forest, including more than 16 million hectares in British Columbia. Grosmannia clavigera (Gc), a critical component of the outbreak, is a symbiont of the MPB and a pathogen of pine trees. To better understand the interactions between Gc, MPB, and lodgepole pine hosts, we sequenced the ∼30-Mb Gc genome and assembled it into 18 supercontigs. We predict 8,314 protein-coding genes, and support the gene models with proteome, expressed sequence tag, and RNA-seq data. We establish that Gc is heterothallic, and report evidence for repeat-induced point mutation. We report insights, from genome and transcriptome analyses, into how Gc tolerates conifer-defense chemicals, including oleoresin terpenoids, as they colonize a host tree. RNA-seq data indicate that terpenoids induce a substantial antimicrobial stress in Gc, and suggest that the fungus may detoxify these chemicals by using them as a carbon source. Terpenoid treatment strongly activated a ∼100-kb region of the Gc genome that contains a set of genes that may be important for detoxification of these host-defense chemicals. This work is a major step toward understanding the biological interactions between the tripartite MPB/fungus/forest system. PMID:21262841

  1. Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen.

    PubMed

    DiGuistini, Scott; Wang, Ye; Liao, Nancy Y; Taylor, Greg; Tanguay, Philippe; Feau, Nicolas; Henrissat, Bernard; Chan, Simon K; Hesse-Orce, Uljana; Alamouti, Sepideh Massoumi; Tsui, Clement K M; Docking, Roderick T; Levasseur, Anthony; Haridas, Sajeet; Robertson, Gordon; Birol, Inanc; Holt, Robert A; Marra, Marco A; Hamelin, Richard C; Hirst, Martin; Jones, Steven J M; Bohlmann, Jörg; Breuil, Colette

    2011-02-08

    In western North America, the current outbreak of the mountain pine beetle (MPB) and its microbial associates has destroyed wide areas of lodgepole pine forest, including more than 16 million hectares in British Columbia. Grosmannia clavigera (Gc), a critical component of the outbreak, is a symbiont of the MPB and a pathogen of pine trees. To better understand the interactions between Gc, MPB, and lodgepole pine hosts, we sequenced the ∼30-Mb Gc genome and assembled it into 18 supercontigs. We predict 8,314 protein-coding genes, and support the gene models with proteome, expressed sequence tag, and RNA-seq data. We establish that Gc is heterothallic, and report evidence for repeat-induced point mutation. We report insights, from genome and transcriptome analyses, into how Gc tolerates conifer-defense chemicals, including oleoresin terpenoids, as they colonize a host tree. RNA-seq data indicate that terpenoids induce a substantial antimicrobial stress in Gc, and suggest that the fungus may detoxify these chemicals by using them as a carbon source. Terpenoid treatment strongly activated a ∼100-kb region of the Gc genome that contains a set of genes that may be important for detoxification of these host-defense chemicals. This work is a major step toward understanding the biological interactions between the tripartite MPB/fungus/forest system.

  2. Mapping of chimpanzee full-length cDNAs onto the human genome unveils large potential divergence of the transcriptome.

    PubMed

    Sakate, Ryuichi; Suto, Yumiko; Imanishi, Tadashi; Tanoue, Tetsuya; Hida, Munetomo; Hayasaka, Ikuo; Kusuda, Jun; Gojobori, Takashi; Hashimoto, Katsuyuki; Hirai, Momoki

    2007-09-01

    The genetic basis of the phenotypic difference between human and chimpanzee is one of the most actively pursued issues in current genomics. Although the genomic divergence between the two species has been described, the transcriptomic divergence has not been well documented. Thus, we newly sequenced and analyzed chimpanzee full-length cDNAs (FLcDNAs) representing 87 protein-coding genes. The number of nucleotide substitutions and sites of insertions/deletions (indels) was counted as a measure of sequence divergence between the chimpanzee FLcDNAs and the human genome onto which the FLcDNAs were mapped. Difference in transcription start/termination sites (TSSs/TTSs) and alternative splicing (AS) exons was also counted as a measure of structural divergence between the chimpanzee FLcDNAs and their orthologous human transcripts (NCBI RefSeq). As a result, we found that transposons (Alu) and repetitive segments caused large indels, which strikingly increased the average amount of sequence divergence up to more than 2% in the 3'-UTRs. Moreover, 20 out of the 87 transcripts contained more than 10% structural divergence in length. In particular, two-thirds of the structural divergence was found in the 3'-UTRs, and variable transcription start sites were conspicuous in the 5'-UTRs. As both transcriptional and translational efficiency were supposed to be related to 5'- and 3'-UTR sequences, these results lead to the idea that the difference in gene regulation can be a major cause of the difference in phenotype between human and chimpanzee.

  3. Complete Genome Sequence and Transcriptomic Analysis of the Novel Pathogen Elizabethkingia anophelis in Response to Oxidative Stress.

    PubMed

    Li, Yingying; Liu, Yang; Chew, Su Chuen; Tay, Martin; Salido, May Margarette Santillan; Teo, Jeanette; Lauro, Federico M; Givskov, Michael; Yang, Liang

    2015-05-26

    Elizabethkingia anophelis is an emerging pathogen that can cause life-threatening infections in neonates, severely immunocompromised and postoperative patients. The lack of genomic information on E. anophelis hinders our understanding of its mechanisms of pathogenesis. Here, we report the first complete genome sequence of E. anophelis NUHP1 and assess its response to oxidative stress. Elizabethkingia anophelis NUHP1 has a circular genome of 4,369,828 base pairs and 4,141 predicted coding sequences. Sequence analysis indicates that E. anophelis has well-developed systems for scavenging iron and stress response. Many putative virulence factors and antibiotic resistance genes were identified, underscoring potential host-pathogen interactions and antibiotic resistance. RNA-sequencing-based transcriptome profiling indicates that expression of genes involved in synthesis of a yersiniabactin-like iron siderophore and heme utilization is highly induced as a protective mechanism toward oxidative stress caused by hydrogen peroxide treatment. Chrome azurol sulfonate assay verified that siderophore production of E. anophelis is increased in the presence of oxidative stress. We further showed that hemoglobin facilitates the growth, hydrogen peroxide tolerance, cell attachment, and biofilm formation of E. anophelis NUHP1. Our study suggests that siderophore production and heme uptake pathways might play essential roles in stress response and virulence of the emerging pathogen E. anophelis.

  4. Complete Genome Sequence and Transcriptomic Analysis of the Novel Pathogen Elizabethkingia anophelis in Response to Oxidative Stress

    PubMed Central

    Li, Yingying; Liu, Yang; Chew, Su Chuen; Tay, Martin; Salido, May Margarette Santillan; Teo, Jeanette; Lauro, Federico M.; Givskov, Michael; Yang, Liang

    2015-01-01

    Elizabethkingia anophelis is an emerging pathogen that can cause life-threatening infections in neonates, severely immunocompromised and postoperative patients. The lack of genomic information on E. anophelis hinders our understanding of its mechanisms of pathogenesis. Here, we report the first complete genome sequence of E. anophelis NUHP1 and assess its response to oxidative stress. Elizabethkingia anophelis NUHP1 has a circular genome of 4,369,828 base pairs and 4,141 predicted coding sequences. Sequence analysis indicates that E. anophelis has well-developed systems for scavenging iron and stress response. Many putative virulence factors and antibiotic resistance genes were identified, underscoring potential host–pathogen interactions and antibiotic resistance. RNA-sequencing-based transcriptome profiling indicates that expression of genes involved in synthesis of a yersiniabactin-like iron siderophore and heme utilization is highly induced as a protective mechanism toward oxidative stress caused by hydrogen peroxide treatment. Chrome azurol sulfonate assay verified that siderophore production of E. anophelis is increased in the presence of oxidative stress. We further showed that hemoglobin facilitates the growth, hydrogen peroxide tolerance, cell attachment, and biofilm formation of E. anophelis NUHP1. Our study suggests that siderophore production and heme uptake pathways might play essential roles in stress response and virulence of the emerging pathogen E. anophelis. PMID:26019164

  5. Genome-Wide Host-Pathogen Interaction Unveiled by Transcriptomic Response of Diamondback Moth to Fungal Infection

    PubMed Central

    Chu, Zhen-Jian; Wang, Yu-Jun; Ying, Sheng-Hua; Wang, Xiao-Wei; Feng, Ming-Guang

    2016-01-01

    Genome-wide insight into insect pest response to the infection of Beauveria bassiana (fungal insect pathogen) is critical for genetic improvement of fungal insecticides but has been poorly explored. We constructed three pairs of transcriptomes of Plutella xylostella larvae at 24, 36 and 48 hours post treatment of infection (hptI) and of control (hptC) for insight into the host-pathogen interaction at genomic level. There were 2143, 3200 and 2967 host genes differentially expressed at 24, 36 and 48 hptI/hptC respectively. These infection-responsive genes (~15% of the host genome) were enriched in various immune processes, such as complement and coagulation cascades, protein digestion and absorption, and drug metabolism-cytochrome P450. Fungal penetration into cuticle and host defense reaction began at 24 hptI, followed by most intensive host immune response at 36 hptI and attenuated immunity at 48 hptI. Contrastingly, 44% of fungal genes were differentially expressed in the infection course and enriched in several biological processes, such as antioxidant activity, peroxidase activity and proteolysis. There were 1636 fungal genes co-expressed during 24–48 hptI, including 116 encoding putative secretion proteins. Our results provide novel insights into the insect-pathogen interaction and help to probe molecular mechanisms involved in the fungal infection to the global pest. PMID:27043942

  6. A blow to the fly - Lucilia cuprina draft genome and transcriptome to support advances in biology and biotechnology.

    PubMed

    Anstead, Clare A; Batterham, Philip; Korhonen, Pasi K; Young, Neil D; Hall, Ross S; Bowles, Vernon M; Richards, Stephen; Scott, Maxwell J; Gasser, Robin B

    2016-01-01

    The blow fly, Lucilia cuprina (Wiedemann, 1830) is a parasitic insect of major global economic importance. Maggots of this fly parasitize the skin of animal hosts, feed on excretions and tissues, and cause severe disease (flystrike or myiasis). Although there has been considerable research on L. cuprina over the years, little is understood about the molecular biology, biochemistry and genetics of this parasitic fly, as well as its relationship with its hosts and the disease that it causes. This situation might change with the recent report of the draft genome and transcriptome of this blow fly, which has given new and global insights into its biology, interactions with the host animal and aspects of insecticide resistance at the molecular level. This genomic resource will likely enable many fundamental and applied research areas in the future. The present article gives a background on L. cuprina and myiasis, a brief account of past and current treatment, prevention and control approaches, and provides a perspective on the impact that the L. cuprina genome should have on future research of this and related parasitic flies, and the design of new and improved interventions for myiasis. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. Brain transcriptome of the violet-eared waxbill Uraeginthus granatina and recent evolution in the songbird genome

    PubMed Central

    Balakrishnan, Christopher N.; Chapus, Charles; Brewer, Michael S.; Clayton, David F.

    2013-01-01

    Songbirds are important models for the study of social behaviour and communication. To complement the recent genome sequencing of the domesticated zebra finch, we sequenced the brain transcriptome of a closely related songbird species, the violet-eared waxbill (Uraeginthus granatina). Both the zebra finch and violet-eared waxbill are members of the family Estrildidae, but differ markedly in their social behaviour. Using Roche 454 RNA sequencing, we generated an assembly and annotation of 11 084 waxbill orthologues of 17 475 zebra finch genes (64%), with an average transcript length of 1555 bp. We also identified 5985 single nucleotide polymorphisms (SNPs) of potential utility for future population genomic studies. Comparing the two species, we found evidence for rapid protein evolution (ω) and low polymorphism of the avian Z sex chromosome, consistent with prior studies of more divergent avian species. An intriguing outlier was putative chromosome 4A, which showed a high density of SNPs and low evolutionary rate relative to other chromosomes. Genome-wide ω was identical in zebra finch and violet-eared waxbill lineages, suggesting a similar demographic history with efficient purifying natural selection. Further comparisons of these and other estrildid finches may provide insights into the evolutionary neurogenomics of social behaviour. PMID:24004662

  8. Characterization of the mechanism of prolonged adaptation to osmotic stress of Jeotgalibacillus malaysiensis via genome and transcriptome sequencing analyses

    PubMed Central

    Yaakop, Amira Suriaty; Chan, Kok-Gan; Ee, Robson; Lim, Yan Lue; Lee, Siew-Kim; Manan, Fazilah Abd; Goh, Kian Mau

    2016-01-01

    Jeotgalibacillus malaysiensis, a moderate halophilic bacterium isolated from a pelagic area, can endure higher concentrations of sodium chloride (NaCl) than other Jeotgalibacillus type strains. In this study, we therefore chose to sequence and assemble the entire J. malaysiensis genome. This is the first report to provide a detailed analysis of the genomic features of J. malaysiensis, and to perform genetic comparisons between this microorganism and other halophiles. J. malaysiensis encodes a native megaplasmid (pJeoMA), which is greater than 600 kilobases in size, that is absent from other sequenced species of Jeotgalibacillus. Subsequently, RNA-Seq-based transcriptome analysis was utilised to examine adaptations of J. malaysiensis to osmotic stress. Specifically, the eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) and KEGG (Kyoto Encyclopaedia of Genes and Genomes) databases were used to elucidate the overall effects of osmotic stress on the organism. Generally, saline stress significantly affected carbohydrate, energy, and amino acid metabolism, as well as fatty acid biosynthesis. Our findings also indicate that J. malaysiensis adopted a combination of approaches, including the uptake or synthesis of osmoprotectants, for surviving salt stress. Among these, proline synthesis appeared to be the preferred method for withstanding prolonged osmotic stress in J. malaysiensis. PMID:27641516

  9. Ecological venomics: How genomics, transcriptomics and proteomics can shed new light on the ecology and evolution of venom.

    PubMed

    Sunagar, Kartik; Morgenstern, David; Reitzel, Adam M; Moran, Yehu

    2016-03-01

    Animal venom is a complex cocktail of bioactive chemicals that traditionally drew interest mostly from biochemists and pharmacologists. However, in recent years the evolutionary and ecological importance of venom has been realized, as this trait has a direct and strong influence on interactions between species. Moreover, venom content can be modulated by environmental factors. Like many other fields of biology, venom research has been revolutionized in recent years by the introduction of systems biology approaches, i.e., genomics, transcriptomics and proteomics. The employment of these methods in venom research is known as 'venomics'. In this review we describe the history and recent advancements of venomics and discuss how they are employed in studying venom in general and in particular in the context of evolutionary ecology. We also discuss the pitfalls and challenges of venomics and what the future may hold for this emerging scientific field. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. Genomic and transcriptome analysis of triclosan response of a multidrug-resistant Acinetobacter baumannii strain, MDR-ZJ06.

    PubMed

    Pi, Borui; Yu, Dongliang; Hua, Xiaoting; Ruan, Zhi; Yu, Yunsong

    2017-03-01

    During the last decade, an increasing amount of attention has focused on the potential threat of triclosan to both the human body and environmental ecology. However, the role of triclosan in the development of drug resistance and cross resistance is still in dispute, largely because the mechanism of triclosan resistance remains unknown. In this work, Acinetobacter baumannii MDR-ZJ06, a multidrug-resistant strain, was induced by triclosan, and the genomic variation and transcriptional levels were investigated. The comparative transcriptomic analysis found that several general protective mechanisms were enhanced under the triclosan condition, including responses to reactive oxygen species and cell membrane damage. Meanwhile, none of the fifteen detected single nucleotide polymorphisms was directly associated with triclosan tolerance. In summary, this work revealed the crucial role of the general stress response in A. baumannii under a triclosan stress condition, which informs a more comprehensive understanding of the role of triclosan in the spread of drug-resistant bacteria.

  11. Genome-wide expression profiling of the transcriptomes of four Paulownia tomentosa accessions in response to drought.

    PubMed

    Dong, Yanpeng; Fan, Guoqiang; Deng, Minjie; Xu, Enkai; Zhao, Zhenli

    2014-10-01

    Paulownia tomentosa is an important foundation forest tree species in semiarid areas. The lack of genetic information hinders research into the mechanisms involved in its response to abiotic stresses. Here, short-read sequencing technology (Illumina) was used to de novo assemble the transcriptome of P. tomentosa. A total of 99,218 unigenes with a mean length of 949 nucleotides were assembled. Of these, 68,295 unigenes were selected and the functions of their products were predicted using Clusters of Orthologous Groups, Gene Ontology, and Kyoto Encyclopedia of Genes and Genomes annotations. Afterwards, hundreds of genes involved in drought response were identified. Twelve putative drought response genes were analyzed by quantitative real-time polymerase chain reaction. This study provides a dataset of genes and inherent biochemical pathways, which will help in understanding the mechanisms of the water-deficit response in P. tomentosa. To our knowledge, this is the first study to highlight the genetic makeup of P. tomentosa.

  12. The Discovery of Novel Genomic, Transcriptomic, and Proteomic Biomarkers in Cardiovascular and Peripheral Vascular Disease: The State of the Art

    PubMed Central

    de Franciscis, Stefano; Metzinger, Laurent; Serra, Raffaele

    2016-01-01

    Cardiovascular disease (CD) and peripheral vascular disease (PVD) are leading causes of mortality and morbidity in western countries and are also responsible for a huge burden in terms of disability, functional decline, and healthcare costs. Biomarkers are measurable biological elements that reflect particular physiological or pathological states or predisposition towards diseases, and they are currently widely studied in medicine and especially in CD. In this context, biomarkers can also be used to assess the severity or the evolution of several diseases, as well as the effectiveness of particular therapies. Genomics, transcriptomics, and proteomics have opened new windows on disease phenomena and may permit, in the near future, an effective development of novel diagnostic and prognostic medicine in order to better prevent or treat CD. This review will consider the current evidence of novel biomarkers with clear implications in the improvement of risk assessment, prevention strategies, and medical decision making in the field of CD. PMID:27298828

  13. The Mediterranean scorpion Mesobuthus gibbosus (Scorpiones, Buthidae): transcriptome analysis and organization of the genome encoding chlorotoxin-like peptides

    PubMed Central

    2014-01-01

    Background: Transcripts of toxin genes of scorpion species have been published. Up to this moment, no information on the gene characterization of M. gibbosus is available. Results: This study provides the first insight into gene expression in venom glands from the M. gibbosus scorpion. A cDNA library was generated from the venom glands and subsequently analyzed (301 clones). Sequences from 177 high-quality ESTs were grouped as 48 Mgib sequences; of those 48 sequences, 40 (29 “singletons” and 11 “contigs”) correspond with one or more ESTs. We identified putative precursor sequences and grouped them in different categories (39 unique transcripts, one with alternative reading frames), resulting in the identification of 12 new toxin-like and 5 antimicrobial precursors (transcripts). The analysis of the gene families revealed several new components categorized among various toxin families with effect on ion channels. Sequence analysis of a new KTx precursor provides evidence to validate a new KTx subfamily (α-KTx 27.x). A second part of this work involves the genomic organization of three Meg-chlorotoxin-like genes (ClTxs). Genomic DNA sequence reveals close similarities (presence of one same-phase intron) with the sole genomic organization of chlorotoxins ever reported (from M. martensii). Conclusions: Transcriptome analysis is a powerful strategy that provides complete information of the gene expression and molecular diversity of the venom glands (telson). In this work, we generated the first catalogue of the gene expression and genomic organization of toxins from M. gibbosus. Our result represents a relevant contribution to the knowledge of toxin transcripts and complementary information related with other cell function proteins and venom peptide transcripts. The genomic organization of the chlorotoxin genes may help to understand the diversity of this gene family. PMID:24746279

  14. Genome-enabled transcriptomics reveals archaeal populations that drive nitrification in a deep-sea hydrothermal plume

    PubMed Central

    Baker, Brett J; Lesniewski, Ryan A; Dick, Gregory J

    2012-01-01

    Ammonia-oxidizing Archaea (AOA) are among the most abundant microorganisms in the oceans and have crucial roles in biogeochemical cycling of nitrogen and carbon. To better understand AOA inhabiting the deep sea, we obtained community genomic and transcriptomic data from ammonium-rich hydrothermal plumes in the Guaymas Basin (GB) and from surrounding deep waters of the Gulf of California. Among the most abundant and active lineages in the sequence data were marine group I (MGI) Archaea related to the cultured autotrophic ammonia-oxidizer, Nitrosopumilus maritimus. Assembly of MGI genomic fragments yielded 2.9 Mb of sequence containing seven 16S rRNA genes (95.4–98.4% similar to N. maritimus), including two near-complete genomes and several lower-abundance variants. Equal copy numbers of MGI 16S rRNA genes and ammonia monooxygenase genes and transcription of ammonia oxidation genes indicate that all of these genotypes actively oxidize ammonia. De novo genomic assembly revealed the functional potential of MGI populations and enhanced interpretation of metatranscriptomic data. Physiological distinction from N. maritimus is evident in the transcription of novel genes, including genes for urea utilization, suggesting an alternative source of ammonia. We were also able to determine which genotypes are most active in the plume. Transcripts involved in nitrification were more prominent in the plume and were among the most abundant transcripts in the community. These unique data sets reveal populations of deep-sea AOA thriving in the ammonium-rich GB that are related to surface types, but with key genomic and physiological differences. PMID:22695863

  15. Large-scale investigation of genomic markers for severe periodontitis.

    PubMed

    Suzuki, Asami; Ji, Guijin; Numabe, Yukihiro; Ishii, Keisuke; Muramatsu, Masaaki; Kamoi, Kyuichi

    2004-09-01

    The purpose of the present study was to investigate the genomic markers for periodontitis, using large-scale single-nucleotide polymorphism (SNP) association studies comparing healthy volunteers and patients with periodontitis. Genomic DNA was obtained from 19 healthy volunteers and 22 patients with severe periodontitis, all of whom were Japanese. The subjects were genotyped at 637 SNPs in 244 genes on a large scale, using the TaqMan polymerase chain reaction (PCR) system. Statistically significant differences in allele and genotype frequencies were analyzed with Fisher's exact test. We found statistically significant differences (P < 0.01) between the healthy volunteers and patients with severe periodontitis in the following genes: gonadotropin-releasing hormone 1 (GNRH1), phosphatidylinositol 3-kinase regulatory 1 (PIK3R1), dipeptidylpeptidase 4 (DPP4), fibrinogen-like 2 (FGL2), and calcitonin receptor (CALCR). These results suggest that SNPs in the GNRH1, PIK3R1, DPP4, FGL2, and CALCR genes are genomic markers for severe periodontitis. Our findings indicate the necessity of analyzing SNPs in genes on a large scale (i.e., genome-wide approach), to identify genomic markers for periodontitis.
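
    The allele-frequency comparison described in this record can be illustrated with a minimal sketch of a two-sided Fisher's exact test on a 2x2 table. This is not the authors' code. The function name and the allele counts are hypothetical, chosen only so that the totals match 19 controls (38 alleles) and 22 patients (44 alleles); the record does not report per-SNP counts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical allele counts for one SNP (illustration only, not study data):
# rows = healthy volunteers, patients; columns = allele A, allele B.
healthy = (30, 8)     # 19 subjects -> 38 alleles
patients = (20, 24)   # 22 subjects -> 44 alleles
p_value = fisher_exact_two_sided(*healthy, *patients)
print(f"p = {p_value:.4g}")
```

    With 637 SNPs tested, a per-SNP threshold such as P < 0.01 is applied to each table separately; whether the study corrected for multiple testing is not stated in the record.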

  16. Genome-wide transcriptome analysis of genes involved in flavonoid biosynthesis between red and white strains of Magnolia sprengeri pamp.

    PubMed

    Shi, Shou-Guo; Yang, Mei; Zhang, Min; Wang, Ping; Kang, Yong-Xiang; Liu, Jian-Jun

    2014-08-23

    Magnolia sprengeri Pamp is one of the most highly valuable medicinal and ornamental plants of the Magnolia Family. The natural color of M. sprengeri is variable. The complete genome sequence of M. sprengeri is not available; therefore we sequenced the transcriptome of white and red petals of M. sprengeri using Illumina technology. We focused on the identity of structural and regulatory genes encoding the enzymes involved in the determination of flower color. We sequenced and annotated a reference transcriptome for M. sprengeri, and aimed to capture the transcriptional determinants of flower color. We sequenced a normalized cDNA library of white and red petals using Illumina technology. The resulting reads were assembled into 77,048 unique sequences, of which 28,243 could be annotated by Gene Ontology (GO) analysis, while 48,805 transcripts lacked GO annotation. The main enzymes involved in flavonoid biosynthesis, such as phenylalanine ammonia-lyase, cinnamate-4-hydroxylase, dihydroflavonol-4-reductase, flavanone 3-hydroxylase, flavonoid-3'-hydroxylase, flavonol synthase, chalcone synthase and anthocyanidin synthase, were identified in the transcriptome. A total of 270 transcription factors were sorted into three families, including MYB, bHLH and WD40 types. Among these transcription factors, eight showed 4-fold or greater changes in transcript abundance in red petals compared with white petals. High-performance liquid chromatography analysis of anthocyanin compositions showed that the main anthocyanin in the petals of M. sprengeri is cyanidin-3-O-glucoside chloride and its content in red petals was 26-fold higher than that in white petals. This study presents the first next-generation sequencing effort and transcriptome analysis of a non-model plant from the Family Magnoliaceae. Genes encoding key enzymes were identified and the metabolic pathways involved in biosynthesis and catabolism of M. sprengeri flavonoids were reconstructed. Identification of these

  17. Analysis of CATMA transcriptome data identifies hundreds of novel functional genes and improves gene models in the Arabidopsis genome

    PubMed Central

    Aubourg, Sébastien; Martin-Magniette, Marie-Laure; Brunaud, Véronique; Taconnat, Ludivine; Bitton, Frédérique; Balzergue, Sandrine; Jullien, Pauline E; Ingouff, Mathieu; Thareau, Vincent; Schiex, Thomas; Lecharny, Alain; Renou, Jean-Pierre

    2007-01-01

    Background Since the finishing of the sequencing of the Arabidopsis thaliana genome, the Arabidopsis community and the annotator centers have been working on the improvement of gene annotation at the structural and functional levels. In this context, we have used the large CATMA resource on the Arabidopsis transcriptome to search for genes missed by different annotation processes. Probes on the CATMA microarrays are specific gene sequence tags (GSTs) based on the CDS models predicted by the Eugene software. Among the 24 576 CATMA