comprehensive expressed sequence: Topics by Science.gov

Sample records for comprehensive expressed sequence

NEIBank: Genomics and bioinformatics resources for vision research

PubMed Central

Peterson, Katherine; Gao, James; Buchoff, Patee; Jaworski, Cynthia; Bowes-Rickman, Catherine; Ebright, Jessica N.; Hauser, Michael A.; Hoover, David

2008-01-01

NEIBank is an integrated resource for genomics and bioinformatics in vision research. It includes expressed sequence tag (EST) data and sequence-verified cDNA clones for multiple eye tissues of several species, web-based access to human eye-specific SAGE data through EyeSAGE, and comprehensive, annotated databases of known human eye disease genes and candidate disease gene loci. All expression- and disease-related data are integrated in EyeBrowse, an eye-centric genome browser. NEIBank provides a comprehensive overview of current knowledge of the transcriptional repertoires of eye tissues and their relation to pathology. PMID:18648525
Mobile Genome Express (MGE): A comprehensive automatic genetic analyses pipeline with a mobile device.

PubMed

Yoon, Jun-Hee; Kim, Thomas W; Mendez, Pedro; Jablons, David M; Kim, Il-Jin

2017-01-01

The development of next-generation sequencing (NGS) technology makes it possible to sequence whole exomes or genomes. However, data analysis is still the biggest bottleneck for its wide implementation. Most laboratories still depend on manual procedures for data handling and analyses, which translates into delays and decreased efficiency in the delivery of NGS results to doctors and patients. Thus, there is high demand for an automatic and easy-to-use NGS data analysis system. We developed a comprehensive, automatic genetic analysis controller named Mobile Genome Express (MGE) that works on smartphones or other mobile devices. MGE can handle all the steps of genetic analysis, such as sample information submission, sequencing run quality check from the sequencer, secured data transfer and results review. We sequenced an Actrometrix control DNA containing multiple proven human mutations using a targeted sequencing panel, and the whole analysis was managed by MGE and its data reviewing program, ELECTRO. All steps were processed automatically except for the final sequencing review procedure with ELECTRO to confirm mutations. The data analysis process was completed within several hours. We confirmed that the mutations we identified were consistent with our previous results obtained using multi-step, manual pipelines.
The Spatial and Temporal Transcriptomic Landscapes of Ginseng, Panax ginseng C. A. Meyer.

PubMed

Wang, Kangyu; Jiang, Shicui; Sun, Chunyu; Lin, Yanping; Yin, Rui; Wang, Yi; Zhang, Meiping

2015-12-11

Ginseng, including Asian ginseng (Panax ginseng C. A. Meyer) and American ginseng (P. quinquefolius L.), is one of the most important medicinal herbs in Asia and North America, but it is significantly understudied. This study sequenced and characterized the transcriptomes and expression profiles of genes expressed in 14 tissues and in roots of four different ages of Asian ginseng. A total of 265.2 million 100-bp clean reads were generated using the high-throughput sequencing platform HiSeq 2000, representing >8.3x of the 3.2-Gb ginseng genome. From the sequences, 248,993 unigenes were assembled for the whole plant, 61,912-113,456 unigenes for each tissue and 54,444-65,412 unigenes for roots of different ages. We comprehensively analyzed the unigene sets and gene expression profiles. We found that the number of genes allocated to each functional category is stable across tissues or developmental stages, while the expression profiles of different genes of a gene family, or of genes involved in ginsenoside biosynthesis, diversified dramatically in space and time. These results provide an overall insight into the spatial and temporal transcriptome dynamics and landscapes of Asian ginseng, and comprehensive resources for advanced research and breeding of ginseng and related species.
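The coverage figure quoted in the abstract above follows from simple arithmetic: total sequenced bases divided by genome size. The sketch below only reproduces that calculation from the numbers in the abstract; the function name is hypothetical and is not taken from the study.

```python
def sequencing_depth(n_reads: float, read_length_bp: int, genome_size_bp: float) -> float:
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


# Values from the abstract: 265.2 million 100-bp reads, 3.2-Gb genome.
depth = sequencing_depth(265.2e6, 100, 3.2e9)
print(f"{depth:.2f}x")
```

With these inputs the depth comes out at roughly 8.3x, in line with the figure the abstract reports.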
De novo assembled expressed gene catalog of a fast-growing Eucalyptus tree produced by Illumina mRNA-Seq

PubMed Central

2010-01-01

Background: De novo assembly of transcript sequences produced by short-read DNA sequencing technologies offers a rapid approach to obtain expressed gene catalogs for non-model organisms. A draft genome sequence will be produced in 2010 for a Eucalyptus tree species (E. grandis) representing the most important hardwood fibre crop in the world. Genome annotation of this valuable woody plant and genetic dissection of its superior growth and productivity will be greatly facilitated by the availability of a comprehensive collection of expressed gene sequences from multiple tissues and organs. Results: We present an extensive expressed gene catalog for a commercially grown E. grandis × E. urophylla hybrid clone constructed using only Illumina mRNA-Seq technology and de novo assembly. A total of 18,894 transcript-derived contigs, a large proportion of which represent full-length protein coding genes, were assembled and annotated. Analysis of assembly quality, length and diversity shows that this dataset represents the most comprehensive expressed gene catalog for any Eucalyptus tree. mRNA-Seq analysis furthermore allowed digital expression profiling of all of the assembled transcripts across diverse xylogenic and non-xylogenic tissues, which is invaluable for ascribing putative gene functions. Conclusions: De novo assembly of Illumina mRNA-Seq reads is an efficient approach for transcriptome sequencing and profiling in Eucalyptus and other non-model organisms. The transcriptome resource (Eucspresso, http://eucspresso.bi.up.ac.za/) generated by this study will be of value for genomic analysis of woody biomass production in Eucalyptus and for comparative genomic analysis of growth and development in woody and herbaceous plants. PMID:21122097
Transcriptome assembly, gene annotation and tissue gene expression atlas of the rainbow trout

USDA-ARS's Scientific Manuscript database

Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, we reported a transcriptome reference sequence using a 19X coverage of Sanger and 454-pyrosequencing dat...
Evaluating the Pedagogic Value of Multi-word Expressions Based on EFL Teachers' and Advanced Learners' Value Judgments

ERIC Educational Resources Information Center

Omidian, Taha; Shahriari, Hesamoddin; Ghonsooly, Behzad

2017-01-01

Multi-word expressions play an important role in second language acquisition, comprehension, and production. Therefore, there is great need for a list of frequent, useful multi-word expressions in language teaching classrooms. Despite multiple attempts at defining multi-word sequences, researchers and teaching experts are divided over the nature…
Transcriptome Assembly, Gene Annotation and Tissue Gene Expression Atlas of the Rainbow Trout

PubMed Central

Salem, Mohamed; Paneru, Bam; Al-Tobasei, Rafet; Abdouni, Fatima; Thorgaard, Gary H.; Rexroad, Caird E.; Yao, Jianbo

2015-01-01

Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although the previous work added a great wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not completely addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler, yielding 474,524 contigs > 500 base-pairs. Of them, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000–32,000 genes (35–71% of the identified genes) accounted for basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome. PMID:25793877
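The abstract above separates housekeeping from tissue-specific genes but does not say how the two classes were called. As an illustration only, the sketch below uses the tau index, a widely used tissue-specificity score that is swapped in here and is not the authors' stated method. The function names and the 0.2 and 0.8 cut-offs are assumptions chosen for the example.

```python
def tau_index(expression):
    """Tissue-specificity score: 0 = uniform across tissues, 1 = one tissue only.

    `expression` holds one non-negative expression value per tissue.
    """
    n = len(expression)
    peak = max(expression)
    if n < 2 or peak <= 0:
        raise ValueError("need at least two tissues and a positive maximum")
    return sum(1 - x / peak for x in expression) / (n - 1)


def classify(expression, low=0.2, high=0.8):
    """Label a gene using hypothetical cut-offs (not taken from the study)."""
    tau = tau_index(expression)
    if tau <= low:
        return "broadly expressed"
    if tau >= high:
        return "tissue-specific"
    return "intermediate"


print(classify([10, 10, 10, 10]))  # uniform profile
print(classify([0, 0, 0, 8]))      # expressed in a single tissue
```

A gene expressed equally in every tissue scores 0 and one expressed in a single tissue scores 1, so the two example calls fall at the opposite ends of the scale.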
Plant Omics Data Center: An Integrated Web Repository for Interspecies Gene Expression Networks with NLP-Based Curation

PubMed Central

Ohyanagi, Hajime; Takano, Tomoyuki; Terashima, Shin; Kobayashi, Masaaki; Kanno, Maasa; Morimoto, Kyoko; Kanegae, Hiromi; Sasaki, Yohei; Saito, Misa; Asano, Satomi; Ozaki, Soichi; Kudo, Toru; Yokoyama, Koji; Aya, Koichiro; Suwabe, Keita; Suzuki, Go; Aoki, Koh; Kubo, Yasutaka; Watanabe, Masao; Matsuoka, Makoto; Yano, Kentaro

2015-01-01

Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center; http://bioinf.mind.meiji.ac.jp/podc/), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources. PMID:25505034
The 3of5 web application for complex and comprehensive pattern matching in protein sequences.

PubMed

Seiler, Markus; Mehrle, Alexander; Poustka, Annemarie; Wiemann, Stefan

2006-03-16

The identification of patterns in biological sequences is a key challenge in genome analysis and in proteomics. Frequently such patterns are complex and highly variable, especially in protein sequences. They are frequently described using terms of regular expressions (RegEx) because of the user-friendly terminology. Limitations arise for queries with the increasing complexity of patterns and are accompanied by requirements for enhanced capabilities. This is especially true for patterns containing ambiguous characters and positions and/or length ambiguities. We have implemented the 3of5 web application in order to enable complex pattern matching in protein sequences. 3of5 is named after a special use of its main feature, the novel n-of-m pattern type. This feature allows for an extensive specification of variable patterns where the individual elements may vary in their position, order, and content within a defined stretch of sequence. The number of distinct elements can be constrained by operators, and individual characters may be excluded. The n-of-m pattern type can be combined with common regular expression terms and thus also allows for a comprehensive description of complex patterns. 3of5 increases the fidelity of pattern matching and finds ALL possible solutions in protein sequences in cases of length-ambiguous patterns instead of simply reporting the longest or shortest hits. Grouping and combined search for patterns provides a hierarchical arrangement of larger pattern sets. The algorithm is implemented as an internet application and is freely accessible. The application is available at http://dkfz.de/mga2/3of5/3of5.html. The 3of5 application offers an extended vocabulary for the definition of search patterns and thus allows the user to comprehensively specify and identify peptide patterns with variable elements. The n-of-m pattern type offers an improved accuracy for pattern matching in combination with the ability to find all solutions, without compromising the user friendliness of regular expression terms.
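The point about reporting all solutions for length-ambiguous patterns, rather than only the longest or shortest hit, can be illustrated with ordinary regular expressions. The sketch below is not the 3of5 algorithm and does not implement the n-of-m pattern type. It is a brute-force illustration using Python's standard `re` module, and the function name, the pattern and the peptide are made up for the example.

```python
import re


def all_matches(pattern: str, sequence: str):
    """Return every (start, end, substring) whose substring fully matches `pattern`.

    A standard regex scan reports one hit per start position (longest or
    shortest, depending on greediness); testing every substring recovers
    all alternative solutions for length-ambiguous patterns.
    """
    compiled = re.compile(pattern)
    hits = []
    for start in range(len(sequence)):
        for end in range(start + 1, len(sequence) + 1):
            if compiled.fullmatch(sequence, start, end):
                hits.append((start, end, sequence[start:end]))
    return hits


# Hypothetical peptide and pattern: K, then 1-3 arbitrary residues, then E.
print(re.findall(r"K.{1,3}E", "AKLEEK"))  # greedy scan: ['KLEE']
print(all_matches(r"K.{1,3}E", "AKLEEK"))  # [(1, 4, 'KLE'), (1, 5, 'KLEE')]
```

The exhaustive scan is quadratic in sequence length, which is acceptable for single proteins but would need a smarter strategy for whole-proteome searches.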
Revealing impaired pathways in the an11 mutant by high-throughput characterization of Petunia axillaris and Petunia inflata transcriptomes.

PubMed

Zenoni, Sara; D'Agostino, Nunzio; Tornielli, Giovanni B; Quattrocchio, Francesca; Chiusano, Maria L; Koes, Ronald; Zethof, Jan; Guzzo, Flavia; Delledonne, Massimo; Frusciante, Luigi; Gerats, Tom; Pezzotti, Mario

2011-10-01

Petunia is an excellent model system, especially for genetic, physiological and molecular studies. Thus far, however, genome-wide expression analysis has been applied rarely because of the lack of sequence information. We applied next-generation sequencing to generate, through de novo read assembly, a large catalogue of transcripts for Petunia axillaris and Petunia inflata. On the basis of both transcriptomes, comprehensive microarray chips for gene expression analysis were established and used for the analysis of global- and organ-specific gene expression in Petunia axillaris and Petunia inflata and to explore the molecular basis of the seed coat defects in a Petunia hybrida mutant, anthocyanin 11 (an11), lacking a WD40-repeat (WDR) transcription regulator. Among the transcripts differentially expressed in an11 seeds compared with wild type, many expected targets of AN11 were found but also several interesting new candidates that might play a role in morphogenesis of the seed coat. Our results validate the combination of next-generation sequencing with microarray analyses strategies to identify the transcriptome of two petunia species without previous knowledge of their genome, and to develop comprehensive chips as useful tools for the analysis of gene expression in P. axillaris, P. inflata and P. hybrida. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.
Comprehensive processing of high-throughput small RNA sequencing data including quality checking, normalization, and differential expression analysis using the UEA sRNA Workbench

PubMed Central

Beckers, Matthew; Mohorianu, Irina; Stocks, Matthew; Applegate, Christopher; Dalmay, Tamas; Moulton, Vincent

2017-01-01

Recently, high-throughput sequencing (HTS) has revealed compelling details about the small RNA (sRNA) population in eukaryotes. These 20 to 25 nt noncoding RNAs can influence gene expression by acting as guides for the sequence-specific regulatory mechanism known as RNA silencing. The increase in sequencing depth and number of samples per project enables a better understanding of the role sRNAs play by facilitating the study of expression patterns. However, the intricacy of the biological hypotheses coupled with a lack of appropriate tools often leads to inadequate mining of the available data and thus, an incomplete description of the biological mechanisms involved. To enable a comprehensive study of differential expression in sRNA data sets, we present a new interactive pipeline that guides researchers through the various stages of data preprocessing and analysis. This includes various tools, some of which we specifically developed for sRNA analysis, for quality checking and normalization of sRNA samples as well as tools for the detection of differentially expressed sRNAs and identification of the resulting expression patterns. The pipeline is available within the UEA sRNA Workbench, a user-friendly software package for the processing of sRNA data sets. We demonstrate the use of the pipeline on a H. sapiens data set; additional examples on a B. terrestris data set and on an A. thaliana data set are described in the Supplemental Information. A comparison with existing approaches is also included, which exemplifies some of the issues that need to be addressed for sRNA analysis and how the new pipeline may be used to do this. PMID:28289155
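The abstract above names normalization and differential expression as pipeline stages without stating which methods the Workbench applies. As a rough sketch under that caveat, the code below shows reads-per-million scaling followed by an offset log2 fold change, both common choices assumed here for illustration. The function names, the offset value and the toy counts are hypothetical.

```python
import math


def reads_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {seq: c * 1_000_000 / total for seq, c in counts.items()}


def log2_fold_change(a, b, offset=1.0):
    """log2 ratio of b over a; the offset avoids division by zero for absent sRNAs."""
    return math.log2((b + offset) / (a + offset))


# Toy counts for two samples (hypothetical sequences and values).
control = reads_per_million({"sRNA_1": 250, "sRNA_2": 750})
treated = reads_per_million({"sRNA_1": 600, "sRNA_2": 400})
for seq in control:
    print(seq, round(log2_fold_change(control[seq], treated[seq]), 2))
```

Scaling to a common library size is only a first-order correction; it does not replace the quality checks the abstract describes.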
Plant Omics Data Center: an integrated web repository for interspecies gene expression networks with NLP-based curation.

PubMed

Ohyanagi, Hajime; Takano, Tomoyuki; Terashima, Shin; Kobayashi, Masaaki; Kanno, Maasa; Morimoto, Kyoko; Kanegae, Hiromi; Sasaki, Yohei; Saito, Misa; Asano, Satomi; Ozaki, Soichi; Kudo, Toru; Yokoyama, Koji; Aya, Koichiro; Suwabe, Keita; Suzuki, Go; Aoki, Koh; Kubo, Yasutaka; Watanabe, Masao; Matsuoka, Makoto; Yano, Kentaro

2015-01-01

Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center; http://bioinf.mind.meiji.ac.jp/podc/), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
MicroTrout: A comprehensive, genome-wide miRNA target prediction framework for rainbow trout, Oncorhynchus mykiss.

PubMed

Mennigen, Jan A; Zhang, Dapeng

2016-12-01

Rainbow trout represent an important teleost research model and aquaculture species. As such, rainbow trout are employed in diverse areas of biological research, including basic biological disciplines such as comparative physiology, toxicology, and, since rainbow trout have undergone both teleost- and salmonid-specific rounds of genome duplication, molecular evolution. In recent years, microRNAs (miRNAs, small non-protein coding RNAs) have emerged as important posttranscriptional regulators of gene expression in animals. Given the increasingly recognized importance of miRNAs as an additional layer in the regulation of gene expression and hence biological function, recent efforts using RNA- and genome sequencing approaches have resulted in the creation of several resources for the construction of a comprehensive repertoire of rainbow trout miRNAs and isomiRs (variant miRNA sequences that all appear to derive from the same gene but vary in sequence due to post-transcriptional processing). Importantly, through the recent publication of the rainbow trout genome (Berthelot et al., 2014), mRNA 3'UTR information has become available, allowing for the first time the genome-wide prediction of miRNA-target RNA relationships in this species. We here report the creation of the microtrout database, a comprehensive resource for rainbow trout miRNA and annotated 3'UTRs. The comprehensive database was used to implement an algorithm to predict genome-wide rainbow trout-specific miRNA-mRNA target relationships, generating an improved predictive framework over previously published approaches. This work will serve as a useful framework and sequence resource to experimentally address the role of miRNAs in several research areas using the rainbow trout model, examples of which are discussed. Copyright © 2016 Elsevier Inc. All rights reserved.
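The abstract above does not describe the prediction algorithm itself. Many miRNA target predictors start from seed matching, that is, looking in the 3'UTR for sites complementary to miRNA nucleotides 2-8. The sketch below shows only that generic first step as an assumption, not the microtrout method. The function name and the toy UTR are hypothetical; the miRNA is the well-known let-7a sequence.

```python
def seed_sites(mirna: str, utr: str):
    """Return 0-based start positions of 7-mer sites complementary to the miRNA seed.

    The seed is taken as miRNA nucleotides 2-8 (1-based); the target site is
    its reverse complement. DNA input (T) is converted to RNA (U).
    """
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[1:8]
    site = "".join(complement[base] for base in reversed(seed))
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping sites
    return positions


# let-7a with a toy UTR containing one seed-complementary site.
print(seed_sites("UGAGGUAGUAGGUUGUAUAGUU", "AAACUACCUCAAA"))  # [3]
```

Real predictors add further evidence, such as site context and conservation, so seed matches alone should be read as candidates only.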
PlantTFDB: a comprehensive plant transcription factor database

PubMed Central

Guo, An-Yuan; Chen, Xin; Gao, Ge; Zhang, He; Zhu, Qi-Hui; Liu, Xiao-Chuan; Zhong, Ying-Fu; Gu, Xiaocheng; He, Kun; Luo, Jingchu

2008-01-01

Transcription factors (TFs) play key roles in controlling gene expression. Systematic identification and annotation of TFs, followed by construction of TF databases may serve as useful resources for studying the function and evolution of transcription factors. We developed a comprehensive plant transcription factor database PlantTFDB (http://planttfdb.cbi.pku.edu.cn), which contains 26 402 TFs predicted from 22 species, including five model organisms with available whole genome sequence and 17 plants with available EST sequences. To provide comprehensive information for those putative TFs, we made extensive annotation at both family and gene levels. A brief introduction and key references were presented for each family. Functional domain information and cross-references to various well-known public databases were available for each identified TF. In addition, we predicted putative orthologs of those TFs among the 22 species. PlantTFDB has a simple interface to allow users to search the database by IDs or free texts, to make sequence similarity search against TFs of all or individual species, and to download TF sequences for local analysis. PMID:17933783
Comprehensive Profiling of the Androgen Receptor in Liquid Biopsies from Castration-resistant Prostate Cancer Reveals Novel Intra-AR Structural Variation and Splice Variant Expression Patterns.

PubMed

De Laere, Bram; van Dam, Pieter-Jan; Whitington, Tom; Mayrhofer, Markus; Diaz, Emanuela Henao; Van den Eynden, Gert; Vandebroek, Jean; Del-Favero, Jurgen; Van Laere, Steven; Dirix, Luc; Grönberg, Henrik; Lindberg, Johan

2017-08-01

Expression of the androgen receptor splice variant 7 (AR-V7) is associated with poor response to second-line endocrine therapy in castration-resistant prostate cancer (CRPC). However, a large fraction of nonresponding patients are AR-V7-negative. The objective was to investigate whether a comprehensive liquid biopsy-based AR profile may improve patient stratification in the context of second-line endocrine therapy. Peripheral blood was collected from patients with CRPC (n=30) before initiation of a new line of systemic therapy. We performed profiling of circulating tumour DNA via low-pass whole-genome sequencing and targeted sequencing of the entire AR gene, including introns. Targeted RNA sequencing was performed on enriched circulating tumour cell fractions to assess the expression levels of seven AR splice variants (ARVs). Somatic AR variations, including copy-number alterations, structural variations, and point mutations, were combined with ARV expression patterns and correlated to clinicopathologic parameters. Collectively, any AR perturbation, including ARV, was detected in 25/30 patients. Surprisingly, intra-AR structural variation was present in 15/30 patients, of whom 14 expressed ARVs. The majority of ARV-positive patients expressed multiple ARVs, with AR-V3 the most abundantly expressed. The presence of any ARV was associated with progression-free survival after second-line endocrine treatment (hazard ratio 4.53, 95% confidence interval 1.424-14.41; p=0.0105). Six out of 17 poor responders were AR-V7-negative, but four carried other AR perturbations. Comprehensive AR profiling, which is feasible using liquid biopsies, is necessary to increase our understanding of the mechanisms underpinning resistance to endocrine treatment. Alterations in the androgen receptor are associated with endocrine treatment outcomes. This study demonstrates that it is possible to identify different types of alterations via simple blood draws. Follow-up studies are needed to determine the effect of such alterations on hormonal therapy. Copyright © 2017 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Characterization of microRNAs Expressed during Secondary Wall Biosynthesis in Acacia mangium

PubMed Central

Ong, Seong Siang; Wickneswari, Ratnam

2012-01-01

MicroRNAs (miRNAs) play critical regulatory roles by acting as sequence-specific guides during secondary wall formation in woody and non-woody species. Although thousands of plant miRNAs have been sequenced, there is no comprehensive view of the miRNA-mediated gene regulatory network to provide profound biological insights into the regulation of xylem development. Herein, we report the involvement of six highly conserved amg-miRNA families (amg-miR166, amg-miR172, amg-miR168, amg-miR159, amg-miR394, and amg-miR156) as the potential regulatory sequences of secondary cell wall biosynthesis. Among these highly conserved amg-miRNA families, only amg-miR166 exhibited strong differences in expression between phloem and xylem tissue. The functional characterization of amg-miR166 targets in various tissues revealed three groups of HD-ZIP III: ATHB8, ATHB15, and REVOLUTA, which play pivotal roles in xylem development. Although these three groups vary in their functions, psRNATarget analysis indicated that miRNA target sequences of the nine different members of HD-ZIP III are always conserved. We found that precursor structures of amg-miR166 undergo exhaustive sequence variation even within members of the same family. Gene expression analysis showed that three key lignin pathway genes (C4H, CAD, and CCoAOMT) were upregulated in compression wood, where a cascade of miRNAs was downregulated. This study offers a comprehensive analysis of the involvement of highly conserved miRNAs implicated in the secondary wall formation of woody plants. PMID:23251324
iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.

PubMed

Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi

2018-01-01

We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14 000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.
Necklace: combining reference and assembled transcriptomes for more comprehensive RNA-Seq analysis.

PubMed

Davidson, Nadia M; Oshlack, Alicia

2018-05-01

RNA sequencing (RNA-seq) analyses can benefit from performing both a genome-guided and a de novo assembly, in particular for species where the reference genome or the annotation is incomplete. However, tools for integrating an assembled transcriptome with reference annotation are lacking. Necklace is a software pipeline that runs genome-guided and de novo assembly and combines the resulting transcriptomes with reference genome annotations. Necklace constructs a compact but comprehensive superTranscriptome out of the assembled and reference data. Reads are subsequently aligned and counted in preparation for differential expression testing. Necklace allows a transcriptome to be built from a combination of assembled and annotated transcripts, which results in a more comprehensive transcriptome for the majority of organisms. In addition, RNA-seq data are mapped back to this newly created superTranscript reference to enable differential expression testing with standard methods.
Comprehensive analysis of Arabidopsis expression level polymorphisms with simple inheritance

PubMed Central

Plantegenet, Stephanie; Weber, Johann; Goldstein, Darlene R; Zeller, Georg; Nussbaumer, Cindy; Thomas, Jérôme; Weigel, Detlef; Harshman, Keith; Hardtke, Christian S

2009-01-01

In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5′ regulatory sequence variation in the corresponding genes is indeed increased. However, ∼42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL. PMID:19225455
Determination of differential gene expression profiles in superficial and deeper zones of mature rat articular cartilage using RNA sequencing of laser microdissected tissue specimens.

PubMed

Mori, Yoshifumi; Chung, Ung-Il; Tanaka, Sakae; Saito, Taku

2014-01-01

Superficial zone (SFZ) cells, which are morphologically and functionally distinct from chondrocytes in deeper zones, play important roles in the maintenance of articular cartilage. Here, we established an easy and reliable method for performance of laser microdissection (LMD) on cryosections of mature rat articular cartilage using an adhesive membrane. We further examined gene expression profiles in the SFZ and the deeper zones of articular cartilage by performing RNA sequencing (RNA-seq). We validated sample collection methods, RNA amplification and the RNA-seq data using real-time RT-PCR. The combined data provide comprehensive information regarding genes specifically expressed in the SFZ or deeper zones, as well as a useful protocol for expression analysis of microsamples of hard tissues.

Identification and characterization of microRNAs in white and brown alpaca skin

PubMed Central

2012-01-01

Background: MicroRNAs (miRNAs) are small, non-coding 21–25 nt RNA molecules that play an important role in regulating gene expression. Little is known about the expression profiles and functions of miRNAs in skin and their role in pigmentation. Alpacas have more than 22 natural coat colors, more than any other fiber producing species. To better understand the role of miRNAs in control of coat color we performed a comprehensive analysis of miRNA expression profiles in skin of white versus brown alpacas. Results: Two small RNA libraries from white alpaca (WA) and brown alpaca (BA) skin were sequenced with the aid of Illumina sequencing technology. 272 and 267 conserved miRNAs were obtained from the WA and BA skin libraries, respectively. Of these conserved miRNAs, 35 and 13 were more abundant in WA and BA skin, respectively. The targets of these miRNAs were predicted and grouped based on Gene Ontology and KEGG pathway analysis. Many predicted target genes for these miRNAs are involved in the melanogenesis pathway controlling pigmentation. In addition to the conserved miRNAs, we also obtained 22 potentially novel miRNAs from the WA and BA skin libraries. Conclusion: This study represents the first comprehensive survey of miRNAs expressed in skin of animals of different coat colors by deep sequencing analysis. We discovered a collection of miRNAs that are differentially expressed in WA and BA skin. The results suggest important potential functions of miRNAs in coat color regulation. PMID:23067000
An expressed sequence tag (EST) data mining strategy succeeding in the discovery of new G-protein coupled receptors.

PubMed

Wittenberger, T; Schaller, H C; Hellebrand, S

2001-03-30

We have developed a comprehensive expressed sequence tag database search method and used it for the identification of new members of the G-protein coupled receptor superfamily. Our approach proved to be especially useful for the detection of expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for the identification of members of divergent protein families or of protein parts without conserved domain structures in the expressed sequence tag database. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily, namely GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes applying in silico methods. A cluster of six closely related G-protein coupled receptors was found on the human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and therefore might have related ligands. In conclusion, we describe a data mining procedure that proved to be useful for the identification and first characterization of new genes and is well applicable for other gene families. Copyright 2001 Academic Press.
Characterization of Adelphocoris suturalis (Hemiptera: Miridae) Transcriptome from Different Developmental Stages

NASA Astrophysics Data System (ADS)

Tian, Caihong; Tek Tay, Wee; Feng, Hongqiang; Wang, Ying; Hu, Yongmin; Li, Guoping

2015-06-01

Adelphocoris suturalis is one of the most serious pest insects of Bt cotton in China; however, its molecular genetics, biochemistry and physiology are poorly understood. We used a high-throughput sequencing platform to perform de novo transcriptome assembly and gene expression analyses across different developmental stages (eggs, 2nd and 5th instar nymphs, female and male adults). We obtained 20 GB of clean data and revealed 88,614 unigenes, including 23,830 clusters and 64,784 singletons. These unigene sequences were annotated and classified by Gene Ontology, Clusters of Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes databases. A large number of differentially expressed genes were discovered through pairwise comparisons between these developmental stages. Gene expression profiles were dramatically different between life stage transitions, with some of the most differentially expressed genes being associated with sex difference, metabolism and development. Quantitative real-time PCR results confirm deep-sequencing findings based on relative expression levels of nine randomly selected genes. Furthermore, over 791,390 single nucleotide polymorphisms and 2,682 potential simple sequence repeats were identified. Our study provided comprehensive transcriptional gene expression information for A. suturalis that will form the basis for a better understanding of development pathways, hormone biosynthesis, sex differences and wing formation in mirid bugs.
Characterization of Adelphocoris suturalis (Hemiptera: Miridae) Transcriptome from Different Developmental Stages

PubMed Central

Tian, Caihong; Tek Tay, Wee; Feng, Hongqiang; Wang, Ying; Hu, Yongmin; Li, Guoping

2015-01-01

Adelphocoris suturalis is one of the most serious pest insects of Bt cotton in China; however, its molecular genetics, biochemistry and physiology are poorly understood. We used a high-throughput sequencing platform to perform de novo transcriptome assembly and gene expression analyses across different developmental stages (eggs, 2nd and 5th instar nymphs, female and male adults). We obtained 20 GB of clean data and revealed 88,614 unigenes, including 23,830 clusters and 64,784 singletons. These unigene sequences were annotated and classified by Gene Ontology, Clusters of Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes databases. A large number of differentially expressed genes were discovered through pairwise comparisons between these developmental stages. Gene expression profiles were dramatically different between life stage transitions, with some of the most differentially expressed genes being associated with sex difference, metabolism and development. Quantitative real-time PCR results confirm deep-sequencing findings based on relative expression levels of nine randomly selected genes. Furthermore, over 791,390 single nucleotide polymorphisms and 2,682 potential simple sequence repeats were identified. Our study provided comprehensive transcriptional gene expression information for A. suturalis that will form the basis for a better understanding of development pathways, hormone biosynthesis, sex differences and wing formation in mirid bugs. PMID:26047353
Identification of differentially expressed genes in cucumber (Cucumis sativus L.) root under waterlogging stress by digital gene expression profile.

PubMed

Qi, Xiao-Hua; Xu, Xue-Wen; Lin, Xiao-Jian; Zhang, Wen-Jie; Chen, Xue-Hao

2012-03-01

High-throughput tag-sequencing (Tag-seq) analysis based on the Solexa Genome Analyzer platform was applied to analyze the gene expression profiling of cucumber plant at 5 time points over a 24h period of waterlogging treatment. Approximately 5.8 million total clean sequence tags per library were obtained with 143013 distinct clean tag sequences. Approximately 23.69%-29.61% of the distinct clean tags were mapped unambiguously to the unigene database, and 53.78%-60.66% of the distinct clean tags were mapped to the cucumber genome database. Analysis of the differentially expressed genes revealed that most of the genes were down-regulated in the waterlogging stages, and the differentially expressed genes mainly linked to carbon metabolism, photosynthesis, reactive oxygen species generation/scavenging, and hormone synthesis/signaling. Finally, quantitative real-time polymerase chain reaction using nine genes independently verified the tag-mapped results. This present study reveals the comprehensive mechanisms of waterlogging-responsive transcription in cucumber. Copyright © 2011 Elsevier Inc. All rights reserved.
Action starring narratives and events: Structure and inference in visual narrative comprehension

PubMed Central

Cohn, Neil; Wittenberg, Eva

2015-01-01

Studies of discourse have long placed focus on the inference generated by information that is not overtly expressed, and theories of visual narrative comprehension similarly focused on the inference generated between juxtaposed panels. Within the visual language of comics, star-shaped “flashes” commonly signify impacts, but can be enlarged to the size of a whole panel that can omit all other representational information. These “action star” panels depict a narrative culmination (a “Peak”), but have content which readers must infer, thereby posing a challenge to theories of inference generation in visual narratives that focus only on the semantic changes between juxtaposed images. This paper shows that action stars demand more inference than depicted events, and that they are more coherent in narrative sequences than scrambled sequences (Experiment 1). In addition, action stars play a felicitous narrative role in the sequence (Experiment 2). Together, these results suggest that visual narratives use conventionalized depictions that demand the generation of inferences while retaining narrative coherence of a visual sequence. PMID:26709362
Action starring narratives and events: Structure and inference in visual narrative comprehension.

PubMed

Cohn, Neil; Wittenberg, Eva

Studies of discourse have long placed focus on the inference generated by information that is not overtly expressed, and theories of visual narrative comprehension similarly focused on the inference generated between juxtaposed panels. Within the visual language of comics, star-shaped "flashes" commonly signify impacts, but can be enlarged to the size of a whole panel that can omit all other representational information. These "action star" panels depict a narrative culmination (a "Peak"), but have content which readers must infer, thereby posing a challenge to theories of inference generation in visual narratives that focus only on the semantic changes between juxtaposed images. This paper shows that action stars demand more inference than depicted events, and that they are more coherent in narrative sequences than scrambled sequences (Experiment 1). In addition, action stars play a felicitous narrative role in the sequence (Experiment 2). Together, these results suggest that visual narratives use conventionalized depictions that demand the generation of inferences while retaining narrative coherence of a visual sequence.
Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

PubMed Central

Harris, R. Alan; Wang, Ting; Coarfa, Cristian; Nagarajan, Raman P.; Hong, Chibo; Downey, Sara L.; Johnson, Brett E.; Fouse, Shaun D.; Delaney, Allen; Zhao, Yongjun; Olshen, Adam; Ballinger, Tracy; Zhou, Xin; Forsberg, Kevin J.; Gu, Junchen; Echipare, Lorigail; O’Geen, Henriette; Lister, Ryan; Pelizzola, Mattia; Xi, Yuanxin; Epstein, Charles B.; Bernstein, Bradley E.; Hawkins, R. David; Ren, Bing; Chung, Wen-Yu; Gu, Hongcang; Bock, Christoph; Gnirke, Andreas; Zhang, Michael Q.; Haussler, David; Ecker, Joseph; Li, Wei; Farnham, Peggy J.; Waterland, Robert A.; Meissner, Alexander; Marra, Marco A.; Hirst, Martin; Milosavljevic, Aleksandar; Costello, Joseph F.

2010-01-01

Sequencing-based DNA methylation profiling methods are comprehensive and, as accuracy and affordability improve, will increasingly supplant microarrays for genome-scale analyses. Here, four sequencing-based methodologies were applied to biological replicates of human embryonic stem cells to compare their CpG coverage genome-wide and in transposons, resolution, cost, concordance and its relationship with CpG density and genomic context. The two bisulfite methods reached concordance of 82% for CpG methylation levels and 99% for non-CpG cytosine methylation levels. Using binary methylation calls, two enrichment methods were 99% concordant, while regions assessed by all four methods were 97% concordant. To achieve comprehensive methylome coverage while reducing cost, an approach integrating two complementary methods was examined. The integrative methylome profile along with histone methylation, RNA, and SNP profiles derived from the sequence reads allowed genome-wide assessment of allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression. PMID:20852635
Global characterization of copy number variants in epilepsy patients from whole genome sequencing

PubMed Central

Meloche, Caroline; Andrade, Danielle M.; Lafreniere, Ron G.; Gravel, Micheline; Spiegelman, Dan; Dionne-Laporte, Alexandre; Boelman, Cyrus; Hamdan, Fadi F.; Michaud, Jacques L.; Rouleau, Guy; Minassian, Berge A.; Bourque, Guillaume; Cossette, Patrick

2018-01-01

Epilepsy will affect nearly 3% of people at some point during their lifetime. Previous copy number variant (CNV) studies of epilepsy have used array-based technology and were restricted to the detection of large or exonic events. In contrast, whole-genome sequencing (WGS) has the potential to more comprehensively profile CNVs but existing analytic methods suffer from limited accuracy. We show that this is in part due to the non-uniformity of read coverage, even after intra-sample normalization. To improve on this, we developed PopSV, an algorithm that uses multiple samples to control for technical variation and enables the robust detection of CNVs. Using WGS and PopSV, we performed a comprehensive characterization of CNVs in 198 individuals affected with epilepsy and 301 controls. For both large and small variants, we found an enrichment of rare exonic events in epilepsy patients, especially in genes with predicted loss-of-function intolerance. Notably, this genome-wide survey also revealed an enrichment of rare non-coding CNVs near previously known epilepsy genes. This enrichment was strongest for non-coding CNVs located within 100 Kbp of an epilepsy gene and in regions associated with changes in the gene expression, such as expression QTLs or DNase I hypersensitive sites. Finally, we report on 21 potentially damaging events that could be associated with known or new candidate epilepsy genes. Our results suggest that comprehensive sequence-based profiling of CNVs could help explain a larger fraction of epilepsy cases. PMID:29649218
Transcriptome sequencing and de novo analysis of the copepod Calanus sinicus using 454 GS FLX.

PubMed

Ning, Juan; Wang, Minxiao; Li, Chaolun; Sun, Song

2013-01-01

Despite their species abundance and primary economic importance, genomic information about copepods is still limited. In particular, genomic resources are lacking for the copepod Calanus sinicus, which is a dominant species in the coastal waters of East Asia. In this study, we performed de novo transcriptome sequencing to produce a large number of expressed sequence tags for the copepod C. sinicus. Copepodid larvae and adults were used as the basic material for transcriptome sequencing. Using 454 pyrosequencing, a total of 1,470,799 reads were obtained, which were assembled into 56,809 high quality expressed sequence tags. Based on their sequence similarity to known proteins, about 14,000 different genes were identified, including members of all major conserved signaling pathways. Transcripts that were putatively involved with growth, lipid metabolism, molting, and diapause were also identified among these genes. Differentially expressed genes related to several processes were found in C. sinicus copepodid larvae and adults. We detected 284,154 single nucleotide polymorphisms (SNPs) that provide a resource for gene function studies. Our data provide the most comprehensive transcriptome resource available for C. sinicus. This resource allowed us to identify genes associated with primary physiological processes and SNPs in coding regions, which facilitated the quantitative analysis of differential gene expression. These data should provide a foundation for future genetic and genomic studies of this and related species.
RNA sequencing reveals sexually dimorphic gene expression before gonadal differentiation in chicken and allows comprehensive annotation of the W-chromosome

PubMed Central

2013-01-01

Background: Birds have a ZZ male: ZW female sex chromosome system and while the Z-linked DMRT1 gene is necessary for testis development, the exact mechanism of sex determination in birds remains unsolved. This is partly due to the poor annotation of the W chromosome, which is speculated to carry a female determinant. Few genes have been mapped to the W and little is known of their expression. Results: We used RNA-seq to produce a comprehensive profile of gene expression in chicken blastoderms and embryonic gonads prior to sexual differentiation. We found robust sexually dimorphic gene expression in both tissues pre-dating gonadogenesis, including sex-linked and autosomal genes. This supports the hypothesis that sexual differentiation at the molecular level is at least partly cell autonomous in birds. Different sets of genes were sexually dimorphic in the two tissues, indicating that molecular sexual differentiation is tissue specific. Further analyses allowed the assembly of full-length transcripts for 26 W chromosome genes, providing a view of the W transcriptome in embryonic tissues. This is the first extensive analysis of W-linked genes and their expression profiles in early avian embryos. Conclusion: Sexual differentiation at the molecular level is established in chicken early in embryogenesis, before gonadal sex differentiation. We find that the W chromosome is more transcriptionally active than previously thought, expand the number of known genes to 26 and present complete coding sequences for these W genes. This includes two novel W-linked sequences and three small RNAs reassigned to the W from the Un_Random chromosome. PMID:23531366
Characteristics of the Lotus japonicus gene repertoire deduced from large-scale expressed sequence tag (EST) analysis.

PubMed

Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

2004-02-01

To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
Genome-wide identification of conserved microRNA and their response to drought stress in Dongxiang wild rice (Oryza rufipogon Griff.).

PubMed

Zhang, Fantao; Luo, Xiangdong; Zhou, Yi; Xie, Jiankun

2016-04-01

To identify drought stress-responsive conserved microRNA (miRNA) from Dongxiang wild rice (Oryza rufipogon Griff., DXWR) on a genome-wide scale, high-throughput sequencing technology was used to sequence libraries of DXWR samples, treated with and without drought stress. 505 conserved miRNAs corresponding to 215 families were identified. 17 were significantly down-regulated and 16 were up-regulated under drought stress. Stem-loop qRT-PCR revealed the same expression patterns as high-throughput sequencing, suggesting the accuracy of the sequencing result was high. Potential target genes of the drought-responsive miRNA were predicted to be involved in diverse biological processes. Furthermore, 16 miRNA families were first identified to be involved in drought stress response from plants. These results present a comprehensive view of the conserved miRNA and their expression patterns under drought stress for DXWR, which will provide valuable information and sequence resources for future basis studies.
Comprehensive evaluation of AmpliSeq transcriptome, a novel targeted whole transcriptome RNA sequencing methodology for global gene expression analysis.

PubMed

Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich

2015-12-16

Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate AmpliSeq as a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variations. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

PubMed

Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

2011-01-01

The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
Long Term Follow up of the Delayed Effects of Acute Radiation Exposure in Primates

DTIC Science & Technology

2017-10-01

We will then use shRNAs and/or CRISPR constructs targeting the gene of interest to knock down its expression in stem cells prior to...DLBCLs Mutational profiling identifies 150 driver genes Gene expression identifies subgroups including cell of origin Unbiased CRISPR screen...Exome sequencing in 1,001 DLBCL patients comprehensively identifies 150 driver genes d Unbiased CRISPR screen in DLBCL cell lines identifies essential
A detailed gene expression study of the Miscanthus genus reveals changes in the transcriptome associated with the rejuvenation of spring rhizomes.

PubMed

Barling, Adam; Swaminathan, Kankshita; Mitros, Therese; James, Brandon T; Morris, Juliette; Ngamboma, Ornella; Hall, Megan C; Kirkpatrick, Jessica; Alabady, Magdy; Spence, Ashley K; Hudson, Matthew E; Rokhsar, Daniel S; Moose, Stephen P

2013-12-09

The Miscanthus genus of perennial C4 grasses contains promising biofuel crops for temperate climates. However, few genomic resources exist for Miscanthus, which limits understanding of its interesting biology and future genetic improvement. A comprehensive catalog of expressed sequences was generated from a variety of Miscanthus species and tissue types, with an emphasis on characterizing gene expression changes in spring compared to fall rhizomes. Illumina short read sequencing technology was used to produce transcriptome sequences from different tissues and organs during distinct developmental stages for multiple Miscanthus species, including Miscanthus sinensis, Miscanthus sacchariflorus, and their interspecific hybrid Miscanthus × giganteus. More than fifty billion base-pairs of Miscanthus transcript sequence were produced. Overall, 26,230 Sorghum gene models (i.e., ~ 96% of predicted Sorghum genes) had at least five Miscanthus reads mapped to them, suggesting that a large portion of the Miscanthus transcriptome is represented in this dataset. The Miscanthus × giganteus data was used to identify genes preferentially expressed in a single tissue, such as the spring rhizome, using Sorghum bicolor as a reference. Quantitative real-time PCR was used to verify examples of preferential expression predicted via RNA-Seq. Contiguous consensus transcript sequences were assembled for each species and annotated using InterProScan. Sequences from the assembled transcriptome were used to amplify genomic segments from a doubled haploid Miscanthus sinensis and from Miscanthus × giganteus to further disentangle the allelic and paralogous variations in genes. This large expressed sequence tag collection creates a valuable resource for the study of Miscanthus biology by providing detailed gene sequence information and tissue preferred expression patterns. We have successfully generated a database of transcriptome assemblies and demonstrated its use in the study of genes of interest. Analysis of gene expression profiles revealed biological pathways that exhibit altered regulation in spring compared to fall rhizomes, which are consistent with their different physiological functions. The expression profiles of the subterranean rhizome provide a better understanding of the biological activities of the underground stem structures that are essential for perenniality and the storage or remobilization of carbon and nutrient resources.
Developmental changes in children's comprehension and explanation of spatial metaphors for time.

PubMed

Stites, Lauren J; Özçalişkan, Şeyda

2013-11-01

Time is frequently expressed with spatial motion, using one of three different metaphor types: moving-time, moving-ego, and sequence-as-position. Previous work shows that children can understand and explain moving-time metaphors by age five (Özçalışkan, 2005). In this study, we focus on all three metaphor types for time, and ask whether metaphor type has an effect on children's metaphor comprehension and explanation abilities. Analysis of the responses of three- to six-year-old children and adults showed that comprehension and explanation of all three metaphor types emerge at an early age. Moreover, children's metaphor comprehension and explanation vary by metaphor type: children perform better in understanding and explaining metaphors that structure time in relation to the observer of time (moving-ego, moving-time) than metaphors that structure time without any relation to the observer of time (sequence-as-position-on-a-path). Our findings suggest that children's bodily experiences might play a role in their developing understanding of the abstract concept of time.
Comparative transcriptome analysis of microsclerotia development in Nomuraea rileyi.

PubMed

Song, Zhangyong; Yin, Youping; Jiang, Shasha; Liu, Juanjuan; Chen, Huan; Wang, Zhongkang

2013-06-19

Nomuraea rileyi is used as an environmentally friendly biopesticide. However, mass production and commercialization of this organism are limited due to its fastidious growth and sporulation requirements. When cultured in amended medium, we found that N. rileyi could produce microsclerotia bodies, replacing conidiophores as the infectious agent. However, little is known about the genes involved in microsclerotia development. In the present study, the transcriptomes were analyzed using next-generation sequencing technology to find the genes involved in microsclerotia development. A total of 4.69 Gb of clean nucleotides comprising 32,061 sequences was obtained, and 20,919 sequences were annotated (about 65%). Among the annotated sequences, only 5928 were annotated with 34 gene ontology (GO) functional categories, and 12,778 sequences were mapped to 165 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) database. Furthermore, we assessed the transcriptomic differences between cultures grown in minimal and amended medium. In total, 4808 sequences were found to be differentially expressed; 719 differentially expressed unigenes were assigned to 25 GO classes and 1888 differentially expressed unigenes were assigned to 161 KEGG pathways, including 25 enrichment pathways. Subsequently, we examined the up-regulated or uniquely expressed genes following amended medium treatment, which were also expressed on the enrichment pathway, and found that most of them participated in mediating oxidative stress homeostasis. To elucidate the role of oxidative stress in microsclerotia development, we analyzed the diversification of unigenes using quantitative reverse transcription-PCR (RT-qPCR). Our findings suggest that oxidative stress occurs during microsclerotia development, along with a broad metabolic activity change. Our data provide the most comprehensive sequence resource available for the study of N. rileyi. We believe that the transcriptome datasets will serve as an important public information platform to accelerate studies on N. rileyi microsclerotia.
Comprehensive Molecular Characterization of Urothelial Bladder Carcinoma

PubMed Central

2014-01-01

Urothelial carcinoma of the bladder is a common malignancy that causes approximately 150,000 deaths per year worldwide. To date, no molecularly targeted agents have been approved for the disease. As part of The Cancer Genome Atlas project, we report here an integrated analysis of 131 urothelial carcinomas to provide a comprehensive landscape of molecular alterations. There were statistically significant recurrent mutations in 32 genes, including multiple genes involved in cell cycle regulation, chromatin regulation, and kinase signaling pathways, as well as 9 genes not previously reported as significantly mutated in any cancer. RNA sequencing revealed four expression subtypes, two of which (papillary-like and basal/squamous-like) were also evident in miRNA sequencing and protein data. Whole-genome and RNA sequencing identified recurrent in-frame activating FGFR3-TACC3 fusions and expression or integration of several viruses (including HPV16) that are associated with gene inactivation. Our analyses identified potential therapeutic targets in 69% of the tumours, including 42% with targets in the PI3K/AKT/mTOR pathway and 45% with targets (including ERBB2) in the RTK/MAPK pathway. Chromatin regulatory genes were more frequently mutated in urothelial carcinoma than in any common cancer studied to date, suggesting the future possibility of targeted therapy for chromatin abnormalities. PMID:24476821

CyanoEXpress: A web database for exploration and visualisation of the integrated transcriptome of cyanobacterium Synechocystis sp. PCC6803.

PubMed

Hernandez-Prieto, Miguel A; Futschik, Matthias E

2012-01-01

Synechocystis sp. PCC6803 is one of the best studied cyanobacteria and an important model organism for our understanding of photosynthesis. The early availability of its complete genome sequence initiated numerous transcriptome studies, which have generated a wealth of expression data. Analysis of the accumulated data can be a powerful tool to study transcription in a comprehensive manner and to reveal underlying regulatory mechanisms, as well as to annotate genes whose functions are yet unknown. However, use of divergent microarray platforms, as well as distributed data storage make meta-analyses of Synechocystis expression data highly challenging, especially for researchers with limited bioinformatic expertise and resources. To facilitate utilisation of the accumulated expression data for a wider research community, we have developed CyanoEXpress, a web database for interactive exploration and visualisation of transcriptional response patterns in Synechocystis. CyanoEXpress currently comprises expression data for 3073 genes and 178 environmental and genetic perturbations obtained in 31 independent studies. At present, CyanoEXpress constitutes the most comprehensive collection of expression data available for Synechocystis and can be freely accessed. The database is available for free at http://cyanoexpress.sysbiolab.eu.
Expression profiling of the mouse early embryo: Reflections and Perspectives

PubMed Central

Ko, Minoru S. H.

2008-01-01

The laboratory mouse plays an important role in our understanding of early mammalian development and provides an invaluable model for human early embryos, which are difficult to study for ethical and technical reasons. The comprehensive collection of cDNA clones, their sequences, and complete genome sequence information, accumulated over the last two decades, has provided even more advantages to mouse models. Here the progress in global gene expression profiling in early mouse embryos and, to some extent, stem cells is reviewed, and the future directions and challenges are discussed. The discussions include the restatement of global gene expression profiles as a snapshot of cellular status, and the subsequent distinction between the differentiation state and physiological state of the cells. The discussions then extend to the biological problems that can be addressed only through global expression profiling, which include: a bird’s-eye view of global gene expression changes, a molecular index for developmental potency, cell lineage trajectory, microarray-guided cell manipulation, and the possibility of delineating gene regulatory cascades and networks. PMID:16739220
Assembly and features of secondary metabolite biosynthetic gene clusters in Streptomyces ansochromogenes.

PubMed

Zhong, Xingyu; Tian, Yuqing; Niu, Guoqing; Tan, Huarong

2013-07-01

A draft genome sequence of Streptomyces ansochromogenes 7100 was generated using 454 sequencing technology. In combination with local BLAST searches and gap filling techniques, a comprehensive antiSMASH-based method was adopted to assemble the secondary metabolite biosynthetic gene clusters in the draft genome of S. ansochromogenes. A total of at least 35 putative gene clusters were identified and assembled. Transcriptional analysis showed that 20 of the 35 gene clusters were expressed in either or all of the three different media tested, whereas the other 15 gene clusters were silent in all three different media. This study provides a comprehensive method to identify and assemble secondary metabolite biosynthetic gene clusters in draft genomes of Streptomyces, and will significantly promote functional studies of these secondary metabolite biosynthetic gene clusters.
Comprehensive transcriptome profiling reveals long noncoding RNA expression and alternative splicing regulation during fruit development and ripening in kiwifruit (Actinidia chinensis)

USDA-ARS's Scientific Manuscript database

Genomic and transcriptomic data on kiwifruit (Actinidia chinensis) in public databases are very limited despite its nutritional and economic value. Previously, we have constructed and sequenced nine fruit RNA-Seq libraries of A. chinensis cv. 'Hongyang' at immature, mature, and postharvest ripening...
In vitro manipulation of gene expression in larval Schistosoma: a model for postgenomic approaches in Trematoda

PubMed Central

YOSHINO, TIMOTHY P.; DINGUIRARD, NATHALIE; DE MORAES MOURÃO, MARINA

2013-01-01

SUMMARY With rapid developments in DNA and protein sequencing technologies, combined with powerful bioinformatics tools, a continued acceleration of gene identification in parasitic helminths is predicted, potentially leading to discovery of new drug and vaccine targets, enhanced diagnostics and insights into the complex biology underlying host-parasite interactions. For the schistosome blood flukes, with the recent completion of genome sequencing and comprehensive transcriptomic datasets, there has accumulated massive amounts of gene sequence data, for which, in the vast majority of cases, little is known about actual functions within the intact organism. In this review we attempt to bring together traditional in vitro cultivation approaches and recent emergent technologies of molecular genomics, transcriptomics and genetic manipulation to illustrate the considerable progress made in our understanding of trematode gene expression and function during development of the intramolluscan larval stages. Using several prominent trematode families (Schistosomatidae, Fasciolidae, Echinostomatidae), we have focused on the current status of in vitro larval isolation/cultivation as a source of valuable raw material supporting gene discovery efforts in model digeneans that include whole genome sequencing, transcript and protein expression profiling during larval development, and progress made in the in vitro manipulation of genes and their expression in larval trematodes using transgenic and RNA interference (RNAi) approaches. PMID:19961646
Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks

PubMed Central

Trapnell, Cole; Roberts, Adam; Goff, Loyal; Pertea, Geo; Kim, Daehwan; Kelley, David R; Pimentel, Harold; Salzberg, Steven L; Rinn, John L; Pachter, Lior

2012-01-01

Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions. This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol's execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ~1 h of hands-on time. PMID:22383036
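The workflow described in this abstract (align reads, assemble transcripts, merge assemblies, test for differential expression) can be sketched as a list of command lines. This is a minimal illustration, not the protocol itself: the sample layout, file names (`genes.gtf`, `genome.fa`, `assemblies.txt`), the Bowtie index name `genome` and the helper `build_commands` are all assumptions made for the example, and the commands are only constructed, never executed.

```python
def build_commands(samples, index="genome", gtf="genes.gtf",
                   fasta="genome.fa", threads=8):
    """Build TopHat/Cufflinks command lines for paired-end samples.

    samples: dict mapping a condition label to a list of
             (reads_1, reads_2) FASTQ path pairs (hypothetical names).
    Returns a list of argument lists; nothing is run here.
    """
    commands = []
    bams = {}
    for condition, replicates in samples.items():
        bams[condition] = []
        for i, (reads_1, reads_2) in enumerate(replicates, start=1):
            name = f"{condition}_R{i}"
            # Step 1: spliced alignment of the reads with TopHat.
            commands.append(["tophat", "-p", str(threads), "-G", gtf,
                             "-o", f"{name}_thout", index, reads_1, reads_2])
            bam = f"{name}_thout/accepted_hits.bam"
            # Step 2: transcript assembly for this sample with Cufflinks.
            commands.append(["cufflinks", "-p", str(threads),
                             "-o", f"{name}_clout", bam])
            bams[condition].append(bam)
    # Step 3: merge the per-sample assemblies. assemblies.txt is assumed
    # to list each *_clout/transcripts.gtf path, one per line.
    commands.append(["cuffmerge", "-g", gtf, "-s", fasta,
                     "-p", str(threads), "assemblies.txt"])
    # Step 4: differential expression between conditions with Cuffdiff;
    # replicates of one condition are passed as a comma-separated list.
    commands.append(["cuffdiff", "-o", "diff_out", "-b", fasta,
                     "-p", str(threads), "-L", ",".join(bams),
                     "-u", "merged_asm/merged.gtf"]
                    + [",".join(paths) for paths in bams.values()])
    return commands


if __name__ == "__main__":
    demo = {"C1": [("C1_R1_1.fq", "C1_R1_2.fq")],
            "C2": [("C2_R1_1.fq", "C2_R1_2.fq")]}
    for cmd in build_commands(demo):
        print(" ".join(cmd))
```

Each argument list could be handed to `subprocess.run` once the tools and input files are in place; the resulting `diff_out` directory is what a visualization tool such as CummeRbund would read.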
Viral expression associated with gastrointestinal adenocarcinomas in TCGA high-throughput sequencing data

PubMed Central

2013-01-01

Background Up to 20% of cancers worldwide are thought to be associated with microbial pathogens, including bacteria and viruses. The widely used methods of viral infection detection are usually limited to a few a priori suspected viruses in one cancer type. To our knowledge, there have not been many broad screening approaches to address this problem more comprehensively. Methods In this study, we performed a comprehensive screening for viruses in nine common cancers using a multistep computational approach. Tumor transcriptome and genome sequencing data were available from The Cancer Genome Atlas (TCGA). Nine hundred fifty eight primary tumors in nine common cancers with poor prognosis were screened against a non-redundant database of virus sequences. DNA sequences from normal matched tissue specimens were used as controls to test whether each virus is associated with tumors. Results We identified human papilloma virus type 18 (HPV-18) and four human herpes viruses (HHV) types 4, 5, 6B, and 8, also known as EBV, CMV, roseola virus, and KSHV, in colon, rectal, and stomach adenocarcinomas. In total, 59% of screened gastrointestinal adenocarcinomas (GIA) were positive for at least one virus: 26% for EBV, 21% for CMV, 7% for HHV-6B, and 20% for HPV-18. Over 20% of tumors were co-infected with multiple viruses. Two viruses (EBV and CMV) were statistically significantly associated with colorectal cancers when compared to the matched healthy tissues from the same individuals (p = 0.02 and 0.03, respectively). HPV-18 was not detected in DNA, and thus, no association testing was possible. Nevertheless, HPV-18 expression patterns suggest viral integration in the host genome, consistent with the potentially oncogenic nature of HPV-18 in colorectal adenocarcinomas. The estimated counts of viral copies were below one per cell for all identified viruses and approached the detection limit. 
Conclusions Our comprehensive screening for viruses in multiple cancer types using next-generation sequencing data clearly demonstrates the presence of viral sequences in GIA. EBV, CMV, and HPV-18 are potentially causal for GIA, although their oncogenic role is yet to be established. PMID:24279398
A compact, in vivo screen of all 6-mers reveals drivers of tissue-specific expression and guides synthetic regulatory element design.

PubMed

Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav

2013-07-18

Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
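To make the phrase "reverse-complement aware" concrete, the sketch below enumerates all k-mers and groups each one with its reverse complement, since a double-stranded oligomer presents both at once. It only illustrates this equivalence; it does not reproduce the authors' de Bruijn graph decomposition or their library design, and the function names are chosen for the example.

```python
from itertools import product

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def kmer_classes(k):
    """Count all k-mers, the self-complementary ones, and the classes
    that remain once each k-mer is merged with its reverse complement."""
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    palindromes = [m for m in all_kmers if m == reverse_complement(m)]
    classes = {canonical(m) for m in all_kmers}
    return len(all_kmers), len(palindromes), len(classes)


if __name__ == "__main__":
    total, palindromic, classes = kmer_classes(6)
    print(total, palindromic, classes)  # 4096 64 2080
```

For k = 6 there are 4,096 sequences, 64 of which equal their own reverse complement, leaving 2,080 distinct classes once strand is ignored.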
A Comprehensive Approach to Sequence-oriented IsomiR annotation (CASMIR): demonstration with IsomiR profiling in colorectal neoplasia.

PubMed

Wu, Chung Wah; Evans, Jared M; Huang, Shengbing; Mahoney, Douglas W; Dukek, Brian A; Taylor, William R; Yab, Tracy C; Smyrk, Thomas C; Jen, Jin; Kisiel, John B; Ahlquist, David A

2018-05-25

MicroRNA (miRNA) profiling is an important step in studying biological associations and identifying marker candidates. miRNA exists in isoforms, called isomiRs, which may exhibit distinct properties. With conventional profiling methods, limitations in assay and analysis platforms may compromise isomiR interrogation. We introduce a comprehensive approach to sequence-oriented isomiR annotation (CASMIR) to allow unbiased identification of global isomiRs from small RNA sequencing data. In this approach, small RNA reads are maintained as independent sequences instead of being summarized under miRNA names. IsomiR features are identified through step-wise local alignment against canonical forms and precursor sequences. Through customizing the reference database, CASMIR is applicable to isomiR annotation across species. To demonstrate its application, we investigated isomiR profiles in normal and neoplastic human colorectal epithelia. We also ran miRDeep2, a popular miRNA analysis algorithm, to validate isomiRs annotated by CASMIR. With CASMIR, specific and biologically relevant isomiR patterns could be identified. We note that specific isomiRs are often more abundant than their canonical forms. We identify isomiRs that are commonly up-regulated in both colorectal cancer and advanced adenoma, and illustrate advantages in targeting isomiRs as potential biomarkers over canonical forms. Studying miRNAs at the isomiR level could reveal new insight into miRNA biology and inform assay design for specific isomiRs. CASMIR facilitates comprehensive annotation of isomiR features in small RNA sequencing data for isomiR profiling and differential expression analysis.
An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

PubMed Central

Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M

2004-01-01

Background The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source of molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051
De novo transcriptome sequencing of axolotl blastema for identification of differentially expressed genes during limb regeneration

PubMed Central

2013-01-01

Background Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration do not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10^-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
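The N50 statistic quoted in this abstract can be computed from a list of assembled sequence lengths. The sketch below uses the common definition (the length L such that sequences of length L or longer contain at least half of all assembled bases); the abstract does not state which exact variant or tool the authors used, and the toy lengths are invented for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length of the sequence at which the running total first reaches
    half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy example: total is 54, so half is 27; 10 + 9 + 8 reaches 27.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

Because a few long sequences can hold half of the bases, N50 is a length-weighted summary rather than a mean or median of the sequence lengths.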
Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling.

PubMed

Łabaj, Paweł P; Leparc, Germán G; Linggi, Bryan E; Markillie, Lye Meng; Wiley, H Steven; Kreil, David P

2011-07-01

Measurement precision determines the power of any analysis to reliably identify significant signals, such as in screens for differential expression, independent of whether the experimental design incorporates replicates or not. With the compilation of large-scale RNA-Seq datasets with technical replicate samples, however, we can now, for the first time, perform a systematic analysis of the precision of expression level estimates from massively parallel sequencing technology. This then allows considerations for its improvement by computational or experimental means. We report on a comprehensive study of target identification and measurement precision, including their dependence on transcript expression levels, read depth and other parameters. In particular, an impressive recall of 84% of the estimated true transcript population could be achieved with 331 million 50 bp reads, with diminishing returns from longer read lengths and even smaller gains from increased sequencing depths. Most of the measurement power (75%) is spent on only 7% of the known transcriptome, however, making less strongly expressed transcripts harder to measure. Consequently, <30% of all transcripts could be quantified reliably with a relative error <20%. Based on established tools, we then introduce a new approach for mapping and analysing sequencing reads that yields substantially improved performance in gene expression profiling, increasing the number of transcripts that can reliably be quantified to over 40%. Extrapolations to higher sequencing depths highlight the need for efficient complementary steps. In the discussion we outline possible experimental and computational strategies for further improvements in quantification precision.
Characterization of transcriptome dynamics during watermelon fruit development: sequencing, assembly, annotation and gene expression profiles

PubMed Central

2011-01-01

Background Cultivated watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus] is an important agriculture crop world-wide. The fruit of watermelon undergoes distinct stages of development with dramatic changes in its size, color, sweetness, texture and aroma. In order to better understand the genetic and molecular basis of these changes and significantly expand the watermelon transcript catalog, we have selected four critical stages of watermelon fruit development and used Roche/454 next-generation sequencing technology to generate a large expressed sequence tag (EST) dataset and a comprehensive transcriptome profile for watermelon fruit flesh tissues. Results We performed half Roche/454 GS-FLX run for each of the four watermelon fruit developmental stages (immature white, white-pink flesh, red flesh and over-ripe) and obtained 577,023 high quality ESTs with an average length of 302.8 bp. De novo assembly of these ESTs together with 11,786 watermelon ESTs collected from GenBank produced 75,068 unigenes with a total length of approximately 31.8 Mb. Overall 54.9% of the unigenes showed significant similarities to known sequences in GenBank non-redundant (nr) protein database and around two-thirds of them matched proteins of cucumber, the most closely-related species with a sequenced genome. The unigenes were further assigned with gene ontology (GO) terms and mapped to biochemical pathways. More than 5,000 SSRs were identified from the EST collection. Furthermore we carried out digital gene expression analysis of these ESTs and identified 3,023 genes that were differentially expressed during watermelon fruit development and ripening, which provided novel insights into watermelon fruit biology and a comprehensive resource of candidate genes for future functional analysis. We then generated profiles of several interesting metabolites that are important to fruit quality including pigmentation and sweetness. 
Integrative analysis of metabolite and digital gene expression profiles helped elucidate the molecular mechanisms governing these important quality-related traits during watermelon fruit development. Conclusion We have generated a large collection of watermelon ESTs, which represents a significant expansion of the current transcript catalog of watermelon and a valuable resource for future studies on the genomics of watermelon and other closely-related species. Digital expression analysis of this EST collection allowed us to identify a large set of genes that were differentially expressed during watermelon fruit development and ripening, which provide a rich source of candidates for future functional analysis and represent a valuable increase in our knowledge base of watermelon fruit biology. PMID:21936920
Characterization of transcriptome dynamics during watermelon fruit development: sequencing, assembly, annotation and gene expression profiles.

PubMed

Guo, Shaogui; Liu, Jingan; Zheng, Yi; Huang, Mingyun; Zhang, Haiying; Gong, Guoyi; He, Hongju; Ren, Yi; Zhong, Silin; Fei, Zhangjun; Xu, Yong

2011-09-21

Cultivated watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus] is an important agriculture crop world-wide. The fruit of watermelon undergoes distinct stages of development with dramatic changes in its size, color, sweetness, texture and aroma. In order to better understand the genetic and molecular basis of these changes and significantly expand the watermelon transcript catalog, we have selected four critical stages of watermelon fruit development and used Roche/454 next-generation sequencing technology to generate a large expressed sequence tag (EST) dataset and a comprehensive transcriptome profile for watermelon fruit flesh tissues. We performed half Roche/454 GS-FLX run for each of the four watermelon fruit developmental stages (immature white, white-pink flesh, red flesh and over-ripe) and obtained 577,023 high quality ESTs with an average length of 302.8 bp. De novo assembly of these ESTs together with 11,786 watermelon ESTs collected from GenBank produced 75,068 unigenes with a total length of approximately 31.8 Mb. Overall 54.9% of the unigenes showed significant similarities to known sequences in GenBank non-redundant (nr) protein database and around two-thirds of them matched proteins of cucumber, the most closely-related species with a sequenced genome. The unigenes were further assigned with gene ontology (GO) terms and mapped to biochemical pathways. More than 5,000 SSRs were identified from the EST collection. Furthermore we carried out digital gene expression analysis of these ESTs and identified 3,023 genes that were differentially expressed during watermelon fruit development and ripening, which provided novel insights into watermelon fruit biology and a comprehensive resource of candidate genes for future functional analysis. We then generated profiles of several interesting metabolites that are important to fruit quality including pigmentation and sweetness. 
Integrative analysis of metabolite and digital gene expression profiles helped elucidate the molecular mechanisms governing these important quality-related traits during watermelon fruit development. We have generated a large collection of watermelon ESTs, which represents a significant expansion of the current transcript catalog of watermelon and a valuable resource for future studies on the genomics of watermelon and other closely-related species. Digital expression analysis of this EST collection allowed us to identify a large set of genes that were differentially expressed during watermelon fruit development and ripening, which provide a rich source of candidates for future functional analysis and represent a valuable increase in our knowledge base of watermelon fruit biology.
The language of geometry: Fast comprehension of geometrical primitives and rules in human adults and preschoolers.

PubMed

Amalric, Marie; Wang, Liping; Pica, Pierre; Figueira, Santiago; Sigman, Mariano; Dehaene, Stanislas

2017-01-01

During language processing, humans form complex embedded representations from sequential inputs. Here, we ask whether a "geometrical language" with recursive embedding also underlies the human ability to encode sequences of spatial locations. We introduce a novel paradigm in which subjects are exposed to a sequence of spatial locations on an octagon, and are asked to predict future locations. The sequences vary in complexity according to a well-defined language comprising elementary primitives and recursive rules. A detailed analysis of error patterns indicates that primitives of symmetry and rotation are spontaneously detected and used by adults, preschoolers, and adult members of an indigene group in the Amazon, the Munduruku, who have a restricted numerical and geometrical lexicon and limited access to schooling. Furthermore, subjects readily combine these geometrical primitives into hierarchically organized expressions. By evaluating a large set of such combinations, we obtained a first view of the language needed to account for the representation of visuospatial sequences in humans, and conclude that they encode visuospatial sequences by minimizing the complexity of the structured expressions that capture them.
The language of geometry: Fast comprehension of geometrical primitives and rules in human adults and preschoolers

PubMed Central

Amalric, Marie; Wang, Liping; Figueira, Santiago; Sigman, Mariano; Dehaene, Stanislas

2017-01-01

During language processing, humans form complex embedded representations from sequential inputs. Here, we ask whether a “geometrical language” with recursive embedding also underlies the human ability to encode sequences of spatial locations. We introduce a novel paradigm in which subjects are exposed to a sequence of spatial locations on an octagon, and are asked to predict future locations. The sequences vary in complexity according to a well-defined language comprising elementary primitives and recursive rules. A detailed analysis of error patterns indicates that primitives of symmetry and rotation are spontaneously detected and used by adults, preschoolers, and adult members of an indigene group in the Amazon, the Munduruku, who have a restricted numerical and geometrical lexicon and limited access to schooling. Furthermore, subjects readily combine these geometrical primitives into hierarchically organized expressions. By evaluating a large set of such combinations, we obtained a first view of the language needed to account for the representation of visuospatial sequences in humans, and conclude that they encode visuospatial sequences by minimizing the complexity of the structured expressions that capture them. PMID:28125595
Comprehensive discovery of noncoding RNAs in acute myeloid leukemia cell transcriptomes.

PubMed

Zhang, Jin; Griffith, Malachi; Miller, Christopher A; Griffith, Obi L; Spencer, David H; Walker, Jason R; Magrini, Vincent; McGrath, Sean D; Ly, Amy; Helton, Nichole M; Trissal, Maria; Link, Daniel C; Dang, Ha X; Larson, David E; Kulkarni, Shashikant; Cordes, Matthew G; Fronick, Catrina C; Fulton, Robert S; Klco, Jeffery M; Mardis, Elaine R; Ley, Timothy J; Wilson, Richard K; Maher, Christopher A

2017-11-01

To detect diverse and novel RNA species comprehensively, we compared deep small RNA and RNA sequencing (RNA-seq) methods applied to a primary acute myeloid leukemia (AML) sample. We were able to discover previously unannotated small RNAs using deep sequencing of a library method using broader insert size selection. We analyzed the long noncoding RNA (lncRNA) landscape in AML by comparing deep sequencing from multiple RNA-seq library construction methods for the sample that we studied and then integrating RNA-seq data from 179 AML cases. This identified lncRNAs that are completely novel, differentially expressed, and associated with specific AML subtypes. Our study revealed the complexity of the noncoding RNA transcriptome through a combined strategy of strand-specific small RNA and total RNA-seq. This dataset will serve as an invaluable resource for future RNA-based analyses.
A large scale analysis of cDNA in Arabidopsis thaliana: generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries.

PubMed

Asamizu, E; Nakamura, Y; Sato, S; Tabata, S

2000-06-30

For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.
Toxicogenomics and Cancer Susceptibility: Advances with Next-Generation Sequencing

PubMed Central

Ning, Baitang; Su, Zhenqiang; Mei, Nan; Hong, Huixiao; Deng, Helen; Shi, Leming; Fuscoe, James C.; Tolleson, William H.

2017-01-01

The aim of this review is to comprehensively summarize the recent achievements in the field of toxicogenomics and cancer research regarding genetic-environmental interactions in carcinogenesis and detection of genetic aberrations in cancer genomes by next-generation sequencing technology. Cancer is primarily a genetic disease in which genetic factors and environmental stimuli interact to cause genetic and epigenetic aberrations in human cells. Mutations in the germline act as either high-penetrance alleles that strongly increase the risk of cancer development, or as low-penetrance alleles that mildly change an individual’s susceptibility to cancer. Somatic mutations, resulting from either DNA damage induced by exposure to environmental mutagens or from spontaneous errors in DNA replication or repair are involved in the development or progression of the cancer. Induced or spontaneous changes in the epigenome may also drive carcinogenesis. Advances in next-generation sequencing technology provide us opportunities to accurately, economically, and rapidly identify genetic variants, somatic mutations, gene expression profiles, and epigenetic alterations with single-base resolution. Whole genome sequencing, whole exome sequencing, and RNA sequencing of paired cancer and adjacent normal tissue present a comprehensive picture of the cancer genome. These new findings should benefit public health by providing insights in understanding cancer biology, and in improving cancer diagnosis and therapy. PMID:24875441
TranslatomeDB: a comprehensive database and cloud-based analysis platform for translatome sequencing data

PubMed Central

Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie

2018-01-01

Translation is a key regulatory step, linking transcriptome and proteome. Two major methods of translatome investigations are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database TranslatomeDB (http://www.translatomedb.net/) which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes the analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of the same species and type, both on transcriptome and translatome levels. The translation indices (translation ratios, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity. All datasets were analyzed using a unified, robust, accurate and experimentally-verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and utilize the identical unified pipeline to analyze their data. We believe that our TranslatomeDB is a comprehensive platform and knowledgebase on translatome and proteome research, freeing biologists from complex searching, analysis and comparison of huge sequencing datasets without the need for local computational power. PMID:29106630

Divergent evolution of arrested development in the dauer stage of Caenorhabditis elegans and the infective stage of Heterodera glycines

PubMed Central

Elling, Axel A; Mitreva, Makedonka; Recknor, Justin; Gai, Xiaowu; Martin, John; Maier, Thomas R; McDermott, Jeffrey P; Hewezi, Tarek; McK Bird, David; Davis, Eric L; Hussey, Richard S; Nettleton, Dan; McCarter, James P; Baum, Thomas J

2007-01-01

Background The soybean cyst nematode Heterodera glycines is the most important parasite in soybean production worldwide. A comprehensive analysis of large-scale gene expression changes throughout the development of plant-parasitic nematodes has been lacking to date. Results We report an extensive genomic analysis of H. glycines, beginning with the generation of 20,100 expressed sequence tags (ESTs). In-depth analysis of these ESTs plus approximately 1,900 previously published sequences predicted 6,860 unique H. glycines genes and allowed a classification by function using InterProScan. Expression profiling of all 6,860 genes throughout the H. glycines life cycle was undertaken using the Affymetrix Soybean Genome Array GeneChip. Our data sets and results represent a comprehensive resource for molecular studies of H. glycines. Demonstrating the power of this resource, we were able to address whether arrested development in the Caenorhabditis elegans dauer larva and the H. glycines infective second-stage juvenile (J2) exhibits shared gene expression profiles. We determined that the gene expression profiles associated with the C. elegans dauer pathway are not uniformly conserved in H. glycines and that the expression profiles of genes for metabolic enzymes of C. elegans dauer larvae and H. glycines infective J2 are dissimilar. Conclusion Our results indicate that hallmark gene expression patterns and metabolism features are not shared in the developmentally arrested life stages of C. elegans and H. glycines, suggesting that developmental arrest in these two nematode species has undergone more divergent evolution than previously thought and pointing to the need for detailed genomic analyses of individual parasite species. PMID:17919324
Genome-wide transcriptome and expression profile analysis of Phalaenopsis during explant browning.

PubMed

Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei

2015-01-01

Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at the genome-wide level. We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect secondary metabolism, such as the phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity were found to be altered at the transcriptional level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning.
Genome-Wide Transcriptome and Expression Profile Analysis of Phalaenopsis during Explant Browning

PubMed Central

Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei

2015-01-01

Background Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at the genome-wide level. Methodology/Principal Findings We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Conclusions/Significance Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect secondary metabolism, such as the phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity were found to be altered at the transcriptional level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning. PMID:25874455
A comprehensive transcript index of the human genome generated using microarrays and computational approaches

PubMed Central

Schadt, Eric E; Edwards, Stephen W; GuhaThakurta, Debraj; Holder, Dan; Ying, Lisa; Svetnik, Vladimir; Leonardson, Amy; Hart, Kyle W; Russell, Archie; Li, Guoya; Cavet, Guy; Castle, John; McDonagh, Paul; Kan, Zhengyan; Chen, Ronghua; Kasarskis, Andrew; Margarint, Mihai; Caceres, Ramon M; Johnson, Jason M; Armour, Christopher D; Garrett-Engele, Philip W; Tsinoremas, Nicholas F; Shoemaker, Daniel D

2004-01-01

Background Computational and microarray-based experimental approaches were used to generate a comprehensive transcript index for the human genome. Oligonucleotide probes designed from approximately 50,000 known and predicted transcript sequences from the human genome were used to survey transcription from a diverse set of 60 tissues and cell lines using ink-jet microarrays. Further, expression activity over at least six conditions was more generally assessed using genomic tiling arrays consisting of probes tiled through a repeat-masked version of the genomic sequence making up chromosomes 20 and 22. Results The combination of microarray data with extensive genome annotations resulted in a set of 28,456 experimentally supported transcripts. This set of high-confidence transcripts represents the first experimentally driven annotation of the human genome. In addition, the results from genomic tiling suggest that a large amount of transcription exists outside of annotated regions of the genome and serves as an example of how this activity could be measured on a genome-wide scale. Conclusions These data represent one of the most comprehensive assessments of transcriptional activity in the human genome and provide an atlas of human gene expression over a unique set of gene predictions. Before the annotation of the human genome is considered complete, however, the previously unannotated transcriptional activity throughout the genome must be fully characterized. PMID:15461792
Gene expression analysis of flax seed development

PubMed Central

2011-01-01

Background Flax, Linum usitatissimum L., is an important crop whose seed oil and stem fiber have multiple industrial applications. Flax seeds are also well-known for their nutritional attributes, viz., omega-3 fatty acids in the oil and lignans and mucilage from the seed coat. In spite of the importance of this crop, there are few molecular resources that can be utilized toward improving seed traits. Here, we describe flax embryo and seed development and generation of comprehensive genomic resources for the flax seed. Results We describe a large-scale generation and analysis of expressed sequences in various tissues. Collectively, the 13 libraries we have used provide a broad representation of genes active in developing embryos (globular, heart, torpedo, cotyledon and mature stages), seed coats (globular and torpedo stages) and endosperm (pooled globular to torpedo stages), and genes expressed in flowers, etiolated seedlings, leaves, and stem tissue. A total of 261,272 expressed sequence tags (ESTs) (GenBank accessions LIBEST_026995 to LIBEST_027011) were generated. These EST libraries included transcription factor genes that are typically expressed at low levels, indicating that the depth is adequate for in silico expression analysis. Assembly of the ESTs resulted in 30,640 unigenes and 82% of these could be identified on the basis of homology to known and hypothetical genes from other plants. When compared with fully sequenced plant genomes, the flax unigenes resembled poplar and castor bean more than grape, sorghum, rice or Arabidopsis. Nearly one-fifth of these (5,152) had no homologs in sequences reported for any organism, suggesting that this category represents genes that are likely unique to flax. Digital analyses revealed gene expression dynamics for the biosynthesis of a number of important seed constituents during seed development. Conclusions We have developed a foundational database of expressed sequences and collection of plasmid clones that comprise even low-expressed genes such as those encoding transcription factors. This has allowed us to delineate the spatio-temporal aspects of gene expression underlying the biosynthesis of a number of important seed constituents in flax. Flax belongs to a taxonomic group of diverse plants and the large sequence database will allow for evolutionary studies as well. PMID:21529361
Expression profiling of snoRNAs in normal hematopoiesis and AML

PubMed Central

Warner, Wayne A.; Spencer, David H.; Trissal, Maria; White, Brian S.; Helton, Nichole; Ley, Timothy J.

2018-01-01

Small nucleolar RNAs (snoRNAs) are noncoding RNAs that contribute to ribosome biogenesis and RNA splicing by modifying ribosomal RNA and spliceosome RNAs, respectively. We optimized a next-generation sequencing approach and a custom analysis pipeline to identify and quantify expression of snoRNAs in acute myeloid leukemia (AML) and normal hematopoietic cell populations. We show that snoRNAs are expressed in a lineage- and development-specific fashion during hematopoiesis. The most striking examples involve snoRNAs located in 2 imprinted loci, which are highly expressed in hematopoietic progenitors and downregulated during myeloid differentiation. Although most snoRNAs are expressed at similar levels in AML cells compared with CD34+, a subset of snoRNAs showed consistent differential expression, with the great majority of these being decreased in the AML samples. Analysis of host gene expression, splicing patterns, and whole-genome sequence data for mutational events did not identify transcriptional patterns or genetic alterations that account for these expression differences. These data provide a comprehensive analysis of the snoRNA transcriptome in normal and leukemic cells and should be helpful in the design of studies to define the contribution of snoRNAs to normal and malignant hematopoiesis. PMID:29365324
Whole-exome sequencing for RH genotyping and alloimmunization risk in children with sickle cell anemia

PubMed Central

Flanagan, Jonathan M.; Vege, Sunitha; Luban, Naomi L. C.; Brown, R. Clark; Ware, Russell E.; Westhoff, Connie M.

2017-01-01

RH genes are highly polymorphic and encode the most complex of the 35 human blood group systems. This genetic diversity contributes to Rh alloimmunization in patients with sickle cell anemia (SCA) and is not avoided by serologic Rh-matched red cell transfusions. Standard serologic testing does not distinguish variant Rh antigens. Single nucleotide polymorphism (SNP)–based DNA arrays detect many RHD and RHCE variants, but the number of alleles tested is limited. We explored a next-generation sequencing (NGS) approach using whole-exome sequencing (WES) in 27 Rh alloimmunized and 27 matched non-alloimmunized patients with SCA who received chronic red cell transfusions and were enrolled in a multicenter study. We demonstrate that WES provides a comprehensive RH genotype, identifies SNPs not interrogated by DNA array, and accurately determines RHD zygosity. Among this multicenter cohort, we demonstrate an association between an altered RH genotype and Rh alloimmunization: 52% of Rh immunized vs 19% of non-immunized patients expressed variant Rh without co-expression of the conventional protein. Our findings suggest that RH allele variation in patients with SCA is clinically relevant, and NGS technology can offer a comprehensive alternative to targeted SNP-based testing. This is particularly relevant as NGS data becomes more widely available and could provide the means for reducing Rh alloimmunization in children with SCA. PMID:29296782
Comprehensive Genetic Database of Expressed Sequence Tags for Coccolithophorids

NASA Astrophysics Data System (ADS)

Ranji, Mohammad; Hadaegh, Ahmad R.

Coccolithophorids are unicellular, marine, golden-brown algae (Haptophyta) commonly found in near-surface waters in patchy distributions. They belong to the phytoplankton, which are known to be responsible for much of the Earth's primary production. Phytoplankton, like plants, live on the energy obtained by photosynthesis, which produces oxygen. A substantial amount of the oxygen in the Earth's atmosphere is produced by phytoplankton through photosynthesis. The single-celled Emiliania huxleyi is the best-known species of Coccolithophorids and is known for extracting bicarbonate (HCO3) from its environment and producing calcium carbonate to form coccoliths. Coccolithophorids are one of the world's primary producers, contributing about 15% of the average oceanic phytoplankton biomass. They produce elaborate, minute calcite platelets (coccoliths), covering the cell to form a coccosphere and supplying up to 60% of the bulk pelagic calcite deposited on the sea floors. In order to understand the genetics of Coccolithophorids and the complexities of their biochemical reactions, we decided to build a database to store a complete profile of these organisms' genomes. Although a variety of such databases currently exist (http://www.geneservice.co.uk/home/), none have yet been developed to comprehensively address the sequencing efforts underway by the Coccolithophorid research community. This database is called CocooExpress and is available to the public (http://bioinfo.csusm.edu) for both data queries and sequence contribution.
Discovery of cashmere goat (Capra hircus) microRNAs in skin and hair follicles by Solexa sequencing.

PubMed

Yuan, Chao; Wang, Xiaolong; Geng, Rongqing; He, Xiaolin; Qu, Lei; Chen, Yulin

2013-07-28

MicroRNAs (miRNAs) are a large family of endogenous, non-coding RNAs, about 22 nucleotides long, which regulate gene expression through sequence-specific base pairing with target mRNAs. Extensive studies have shown that miRNA expression in the skin changes remarkably during distinct stages of the hair cycle in humans, mice, goats and sheep. In this study, the skin tissues were harvested from the three stages of hair follicle cycling (anagen, catagen and telogen) in a fibre-producing goat breed. In total, 63,109,004 raw reads were obtained by Solexa sequencing and 61,125,752 clean reads remained for the small RNA digitalisation analysis. This resulted in the identification of 399 conserved miRNAs; among these, 326 miRNAs were expressed in all three follicular cycling stages, whereas 3, 12 and 11 miRNAs were specifically expressed in anagen, catagen, and telogen, respectively. We also identified 172 potential novel miRNAs by Mireap; of these, 36 miRNAs were expressed in all three cycling stages, whereas 23, 29 and 44 miRNAs were specifically expressed in anagen, catagen, and telogen, respectively. The expression level of five arbitrarily selected miRNAs was analyzed by quantitative PCR, and the results indicated that the expression patterns were consistent with the Solexa sequencing results. Gene Ontology and KEGG pathway analyses indicated that five major biological pathways (Metabolic pathways, Pathways in cancer, MAPK signalling pathway, Endocytosis and Focal adhesion) accounted for 23.08% of target genes among 278 biological functions, indicating that these pathways are likely to play significant roles during hair cycling. During all hair cycle stages of cashmere goats, a large number of conserved and novel miRNAs were identified through a high-throughput sequencing approach. This study enriches the Capra hircus miRNA databases and provides a comprehensive miRNA transcriptome profile in the skin of goats during the hair follicle cycle.
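The read and miRNA counts in the abstract above imply two figures it does not state directly: the share of reads kept after cleaning, and how many miRNAs were shared by exactly two of the three stages. The sketch below derives both. It is illustrative only; the function name is hypothetical, and the two-stage figure assumes every identified miRNA was expressed in at least one stage.

```python
def shared_by_two_stages(total, in_all_three, stage_specific):
    """miRNAs left after removing those seen in all three stages
    and those seen in exactly one stage."""
    return total - in_all_three - sum(stage_specific)


# Read counts copied from the abstract.
raw_reads = 63_109_004
clean_reads = 61_125_752
clean_pct = round(100 * clean_reads / raw_reads, 2)

# Stage-specific counts are given as (anagen, catagen, telogen).
conserved_two_stage = shared_by_two_stages(399, 326, (3, 12, 11))
novel_two_stage = shared_by_two_stages(172, 36, (23, 29, 44))

if __name__ == "__main__":
    print(clean_pct, conserved_two_stage, novel_two_stage)
```

The novel miRNAs are far more stage-restricted than the conserved ones: 36 of 172 appear in all three stages, against 326 of 399 for the conserved set.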
The DNA Methylome of Human Peripheral Blood Mononuclear Cells

PubMed Central

Ye, Mingzhi; Zheng, Hancheng; Yu, Jian; Wu, Honglong; Sun, Jihua; Zhang, Hongyu; Chen, Quan; Luo, Ruibang; Chen, Minfeng; He, Yinghua; Jin, Xin; Zhang, Qinghui; Yu, Chang; Zhou, Guangyu; Sun, Jinfeng; Huang, Yebo; Zheng, Huisong; Cao, Hongzhi; Zhou, Xiaoyu; Guo, Shicheng; Hu, Xueda; Li, Xin; Kristiansen, Karsten; Bolund, Lars; Xu, Jiujin; Wang, Wen; Yang, Huanming; Wang, Jian; Li, Ruiqiang; Beck, Stephan; Wang, Jun; Zhang, Xiuqing

2010-01-01

DNA methylation plays an important role in biological processes in human health and disease. Recent technological advances allow unbiased whole-genome DNA methylation (methylome) analysis to be carried out on human cells. Using whole-genome bisulfite sequencing at 24.7-fold coverage (12.3-fold per strand), we report a comprehensive (92.62%) methylome and analysis of the unique sequences in human peripheral blood mononuclear cells (PBMC) from the same Asian individual whose genome was deciphered in the YH project. PBMC constitute an important source for clinical blood tests world-wide. We found that 68.4% of CpG sites and <0.2% of non-CpG sites were methylated, demonstrating that non-CpG cytosine methylation is minor in human PBMC. Analysis of the PBMC methylome revealed a rich epigenomic landscape for 20 distinct genomic features, including regulatory, protein-coding, non-coding, RNA-coding, and repeat sequences. Integration of our methylome data with the YH genome sequence enabled a first comprehensive assessment of allele-specific methylation (ASM) between the two haploid methylomes of any individual and allowed the identification of 599 haploid differentially methylated regions (hDMRs) covering 287 genes. Of these, 76 genes had hDMRs within 2 kb of their transcriptional start sites of which >80% displayed allele-specific expression (ASE). These data demonstrate that ASM is a recurrent phenomenon and is highly correlated with ASE in human PBMCs. Together with recently reported similar studies, our study provides a comprehensive resource for future epigenomic research and confirms new sequencing technology as a paradigm for large-scale epigenomics studies. PMID:21085693
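In bisulfite sequencing, unmethylated cytosines are read as T while methylated cytosines are still read as C, so a per-site methylation level is commonly summarised as the fraction of reads that retain C. The abstract above does not describe the authors' calling procedure; the following is a generic sketch of that common summary, with hypothetical function and parameter names.

```python
def methylation_level(c_reads, t_reads):
    """Fraction of reads supporting methylation at one cytosine.

    c_reads: reads showing C (cytosine protected, i.e. methylated)
    t_reads: reads showing T (cytosine converted, i.e. unmethylated)
    Returns None when the site has no coverage.
    """
    depth = c_reads + t_reads
    if depth == 0:
        return None
    return c_reads / depth


def mean_level(sites):
    """Average methylation level over (c_reads, t_reads) pairs,
    skipping sites without coverage."""
    levels = [methylation_level(c, t) for c, t in sites]
    covered = [lv for lv in levels if lv is not None]
    return sum(covered) / len(covered) if covered else None


if __name__ == "__main__":
    # Toy counts for three sites; not data from the study.
    print(mean_level([(8, 2), (0, 10), (0, 0)]))
```

Real pipelines also account for incomplete bisulfite conversion and sequencing error, which this sketch ignores.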
Recombinational Cloning Using Gateway and In-Fusion Cloning Schemes

PubMed Central

Throop, Andrea L.; LaBaer, Joshua

2015-01-01

The comprehensive study of protein structure and function, or proteomics, depends on the obtainability of full-length cDNAs in species-specific expression vectors and subsequent functional analysis of the expressed protein. Recombinational cloning is a universal cloning technique based on site-specific recombination that is independent of the insert DNA sequence of interest, which differentiates this method from the classical restriction enzyme-based cloning methods. Recombinational cloning enables rapid and efficient parallel transfer of DNA inserts into multiple expression systems. This unit summarizes strategies for generating expression-ready clones using the most popular recombinational cloning technologies, including the commercially available Gateway® (Life Technologies) and In-Fusion® (Clontech) cloning technologies. PMID:25827088
Comparative transcriptome analysis of microsclerotia development in Nomuraea rileyi

PubMed Central

2013-01-01

Background Nomuraea rileyi is used as an environmentally friendly biopesticide. However, mass production and commercialization of this organism are limited due to its fastidious growth and sporulation requirements. When cultured in amended medium, we found that N. rileyi could produce microsclerotia bodies, replacing conidiophores as the infectious agent. However, little is known about the genes involved in microsclerotia development. In the present study, the transcriptomes were analyzed using next-generation sequencing technology to find the genes involved in microsclerotia development. Results A total of 4.69 Gb of clean nucleotides comprising 32,061 sequences was obtained, and 20,919 sequences were annotated (about 65%). Among the annotated sequences, only 5928 were annotated with 34 gene ontology (GO) functional categories, and 12,778 sequences were mapped to 165 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) database. Furthermore, we assessed the transcriptomic differences between cultures grown in minimal and amended medium. In total, 4808 sequences were found to be differentially expressed; 719 differentially expressed unigenes were assigned to 25 GO classes and 1888 differentially expressed unigenes were assigned to 161 KEGG pathways, including 25 enriched pathways. Subsequently, we examined the genes that were up-regulated or uniquely expressed following amended medium treatment and that also fell on the enriched pathways, and found that most of them participated in mediating oxidative stress homeostasis. To elucidate the role of oxidative stress in microsclerotia development, we analyzed the diversification of unigenes using quantitative reverse transcription-PCR (RT-qPCR). Conclusion Our findings suggest that oxidative stress occurs during microsclerotia development, along with a broad metabolic activity change. Our data provide the most comprehensive sequence resource available for the study of N. rileyi. We believe that the transcriptome datasets will serve as an important public information platform to accelerate studies on N. rileyi microsclerotia. PMID:23777366
A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control consortium

PubMed Central

2014-01-01

We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the United States Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for these and for qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings. PMID:25150838
Comprehensive analysis of single molecule sequencing-derived complete genome and whole transcriptome of Hyposidra talaca nuclear polyhedrosis virus.

PubMed

Nguyen, Thong T; Suryamohan, Kushal; Kuriakose, Boney; Janakiraman, Vasantharajan; Reichelt, Mike; Chaudhuri, Subhra; Guillory, Joseph; Divakaran, Neethu; Rabins, P E; Goel, Ridhi; Deka, Bhabesh; Sarkar, Suman; Ekka, Preety; Tsai, Yu-Chih; Vargas, Derek; Santhosh, Sam; Mohan, Sangeetha; Chin, Chen-Shan; Korlach, Jonas; Thomas, George; Babu, Azariah; Seshagiri, Somasekar

2018-06-12

We sequenced the Hyposidra talaca NPV (HytaNPV) double-stranded circular DNA genome using PacBio single molecule sequencing technology. We found that the HytaNPV genome is 139,089 bp long with a GC content of 39.6%. It encodes 141 open reading frames (ORFs) including the 37 baculovirus core genes, 25 genes conserved among lepidopteran baculoviruses, 72 genes known in baculoviruses, and 7 genes unique to the HytaNPV genome. It is a group II alphabaculovirus that codes for the F protein and lacks the gp64 gene found in group I alphabaculoviruses. Using RNA-seq, we confirmed the expression of the ORFs identified in the HytaNPV genome. Phylogenetic analysis showed HytaNPV to be closest to BusuNPV, SujuNPV and EcobNPV that infect other tea pests, Buzura suppressaria, Sucra jujuba, and Ectropis obliqua, respectively. We identified repeat elements and a conserved non-coding baculovirus element in the genome. Analysis of the putative promoter sequences identified motifs consistent with the temporal expression of the genes observed in the RNA-seq data.
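Two of the figures in the abstract above lend themselves to a quick check: the four ORF categories should sum to the 141 ORFs reported, and GC content is simply the fraction of G and C bases in the sequence. The sketch below shows both. The `gc_content` helper and the dictionary keys are illustrative names, not taken from the study, and the example sequence is a toy string rather than the HytaNPV genome.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (case-insensitive).
    Returns 0.0 for an empty sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# ORF categories as listed in the abstract.
orf_categories = {
    "baculovirus core genes": 37,
    "conserved among lepidopteran baculoviruses": 25,
    "other genes known in baculoviruses": 72,
    "unique to HytaNPV": 7,
}
total_orfs = sum(orf_categories.values())

if __name__ == "__main__":
    print(total_orfs)
    print(gc_content("ATGCGCTA"))  # toy sequence
```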
SELMAP - SELEX affinity landscape MAPping of transcription factor binding sites using integrated microfluidics

PubMed Central

Chen, Dana; Orenstein, Yaron; Golodnitsky, Rada; Pellach, Michal; Avrahami, Dorit; Wachtel, Chaim; Ovadia-Shochat, Avital; Shir-Shapira, Hila; Kedmi, Adi; Juven-Gershon, Tamar; Shamir, Ron; Gerber, Doron

2016-01-01

Transcription factors (TFs) alter gene expression in response to changes in the environment through sequence-specific interactions with the DNA. These interactions are best portrayed as a landscape of TF binding affinities. Current methods to study sequence-specific binding preferences suffer from limited dynamic range, sequence bias, lack of specificity and limited throughput. We have developed a microfluidic-based device for SELEX Affinity Landscape MAPping (SELMAP) of TF binding, which allows high-throughput measurement of 16 proteins in parallel. We used it to measure the relative affinities of Pho4, AtERF2 and Btd full-length proteins to millions of different DNA binding sites, and detected both high and low-affinity interactions in equilibrium conditions, generating a comprehensive landscape of the relative TF affinities to all possible DNA 6-mers, and even DNA 10-mers with increased sequencing depth. Low quantities of both the TFs and DNA oligomers were sufficient for obtaining high-quality results, significantly reducing experimental costs. SELMAP allows in-depth screening of hundreds of TFs, and provides a means for better understanding of the regulatory processes that govern gene expression. PMID:27628341
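To give a sense of scale for "all possible DNA 6-mers" and 10-mers in the abstract above: there are 4^6 = 4,096 distinct 6-mers and 4^10 = 1,048,576 distinct 10-mers, which is why the longer sites need greater sequencing depth. The sketch below enumerates k-mers and counts their occurrences in a read. It is a generic illustration with hypothetical function names and is not the SELMAP analysis itself.

```python
from collections import Counter
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Every possible sequence of length k over the alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def count_kmers(seq, k):
    """Count each overlapping k-mer in seq (case-insensitive).
    Sequences shorter than k yield an empty Counter."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    print(len(all_kmers(6)))         # size of the 6-mer space
    print(4 ** 10)                   # size of the 10-mer space
    print(count_kmers("ACGTAC", 3))  # toy read
```

Comparing such counts between a bound pool and the input library is one common route to relative enrichment, although the abstract does not say how the authors derived their affinity landscape.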
TranslatomeDB: a comprehensive database and cloud-based analysis platform for translatome sequencing data.

PubMed

Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie; Zhang, Gong

2018-01-04

Translation is a key regulatory step, linking transcriptome and proteome. Two major methods of translatome investigations are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database TranslatomeDB (http://www.translatomedb.net/) which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes the analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of the same species and type, both on transcriptome and translatome levels. The translation indices (translation ratios, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity. All datasets were analyzed using a unified, robust, accurate and experimentally-verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and utilize the identical unified pipeline to analyze their data. We believe that our TranslatomeDB is a comprehensive platform and knowledgebase on translatome and proteome research, freeing biologists from complex searching, analysis and comparison of huge sequencing datasets without the need for local computational power. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
microRNA expression profiling in fetal single ventricle malformation identified by deep sequencing.

PubMed

Yu, Zhang-Bin; Han, Shu-Ping; Bai, Yun-Fei; Zhu, Chun; Pan, Ya; Guo, Xi-Rong

2012-01-01

microRNAs (miRNAs) have emerged as key regulators in many biological processes, particularly cardiac growth and development, although the specific miRNA expression profile associated with this process remains to be elucidated. This study aimed to characterize the cellular microRNA profile involved in the development of congenital heart malformation, through the investigation of single ventricle (SV) defects. Comprehensive miRNA profiling in human fetal SV cardiac tissue was performed by deep sequencing. Differential expression of 48 miRNAs was revealed by sequencing by oligonucleotide ligation and detection (SOLiD) analysis. Of these, 38 were down-regulated and 10 were up-regulated in differentiated SV cardiac tissue, compared to control cardiac tissue. This was confirmed by real-time quantitative reverse transcription-polymerase chain reaction (qRT-PCR) analysis. Predicted target genes of the 48 differentially expressed miRNAs were analyzed by gene ontology and categorized according to cellular process, regulation of biological process and metabolic process. Pathway-Express analysis identified the WNT and mTOR signaling pathways as the most significant processes putatively affected by the differential expression of these miRNAs. The candidate genes involved in cardiac development were identified as potential targets for these differentially expressed microRNAs, and a collaborative network of microRNAs and cardiac development-related mRNAs was constructed. These data provide the basis for future investigation of the mechanism of the occurrence and development of fetal SV malformations.
Genomic analysis of expressed sequence tags in American black bear Ursus americanus

PubMed Central

2010-01-01

Background Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065
Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

PubMed

Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

2010-03-26

Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.
Understanding development and stem cells using single cell-based analyses of gene expression

PubMed Central

Kumar, Pavithra; Tan, Yuqi

2017-01-01

In recent years, genome-wide profiling approaches have begun to uncover the molecular programs that drive developmental processes. In particular, technical advances that enable genome-wide profiling of thousands of individual cells have provided the tantalizing prospect of cataloging cell type diversity and developmental dynamics in a quantitative and comprehensive manner. Here, we review how single-cell RNA sequencing has provided key insights into mammalian developmental and stem cell biology, emphasizing the analytical approaches that are specific to studying gene expression in single cells. PMID:28049689

miRanalyzer: a microRNA detection and analysis tool for next-generation sequencing experiments.

PubMed

Hackenberg, Michael; Sturm, Martin; Langenberger, David; Falcón-Pérez, Juan Manuel; Aransay, Ana M

2009-07-01

Next-generation sequencing now allows the sequencing of small RNA molecules and the estimation of their expression levels. Consequently, there will be a high demand for bioinformatics tools to cope with the several gigabytes of sequence data generated in each single deep-sequencing experiment. Given this scene, we developed miRanalyzer, a web server tool for the analysis of deep-sequencing experiments for small RNAs. The web server tool requires a simple input file containing a list of unique reads and their copy numbers (expression levels). Using these data, miRanalyzer (i) detects all known microRNA sequences annotated in miRBase, (ii) finds all perfect matches against other libraries of transcribed sequences and (iii) predicts new microRNAs. The prediction of new microRNAs is an especially important point as there are many species with very few known microRNAs. Therefore, we implemented a highly accurate machine learning algorithm for the prediction of new microRNAs that reaches AUC values of 97.9% and recall values of up to 75% on unseen data. The web tool summarizes all the described steps in a single output page, which provides a comprehensive overview of the analysis, adding links to more detailed output pages for each analysis module. miRanalyzer is available at http://web.bioinformatics.cicbiogune.es/microRNA/.
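The kind of input described here, a list of unique reads with copy numbers, can be produced by collapsing raw reads. The sketch below is illustrative only: the tab-separated layout, the case normalization, and the function names are assumptions and are not taken from the miRanalyzer documentation.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse raw reads into (sequence, copy_number) pairs.

    Sequences are upper-cased and stripped of whitespace; empty lines are
    ignored. Results are sorted by descending count, then alphabetically.
    """
    counts = Counter(read.strip().upper() for read in reads if read.strip())
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def to_table(unique_reads):
    """Render the pairs as tab-separated 'sequence<TAB>count' lines (assumed layout)."""
    return "\n".join(f"{seq}\t{count}" for seq, count in unique_reads)
```

The copy number of each unique read then serves as its expression level in downstream steps.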
Novel small RNA (sRNA) landscape of the starvation-stress response transcriptome of Salmonella enterica serovar typhimurium.

PubMed

Amin, Shivam V; Roberts, Justin T; Patterson, Dillon G; Coley, Alexander B; Allred, Jonathan A; Denner, Jason M; Johnson, Justin P; Mullen, Genevieve E; O'Neal, Trenton K; Smith, Jason T; Cardin, Sara E; Carr, Hank T; Carr, Stacie L; Cowart, Holly E; DaCosta, David H; Herring, Brendon R; King, Valeria M; Polska, Caroline J; Ward, Erin E; Wise, Alice A; McAllister, Kathleen N; Chevalier, David; Spector, Michael P; Borchert, Glen M

2016-01-01

Small RNAs (sRNAs) are short (∼50-200 nucleotides) noncoding RNAs that regulate cellular activities across bacteria. Salmonella enterica starved of a carbon-energy (C) source experience a host of genetic and physiological changes broadly referred to as the starvation-stress response (SSR). In an attempt to identify novel sRNAs contributing to SSR control, we grew log-phase, 5-h C-starved and 24-h C-starved cultures of the virulent Salmonella enterica subspecies enterica serovar Typhimurium strain SL1344 and comprehensively sequenced their small RNA transcriptomes. Strikingly, after employing a novel strategy for sRNA discovery based on identifying dynamic transcripts arising from "gene-empty" regions, we identify 58 wholly undescribed Salmonella sRNA genes potentially regulating SSR averaging an ∼1,000-fold change in expression between log-phase and C-starved cells. Importantly, the expressions of individual sRNA loci were confirmed by both comprehensive transcriptome analyses and northern blotting of select candidates. Of note, we find 43 candidate sRNAs share significant sequence identity to characterized sRNAs in other bacteria, and ∼70% of our sRNAs likely assume characteristic sRNA structural conformations. In addition, we find 53 of our 58 candidate sRNAs either overlap neighboring mRNA loci or share significant sequence complementarity to mRNAs transcribed elsewhere in the SL1344 genome strongly suggesting they regulate the expression of transcripts via antisense base-pairing. Finally, in addition to this work resulting in the identification of 58 entirely novel Salmonella enterica genes likely participating in the SSR, we also find evidence suggesting that sRNAs are significantly more prevalent than currently appreciated and that Salmonella sRNAs may actually number in the thousands.
Novel small RNA (sRNA) landscape of the starvation-stress response transcriptome of Salmonella enterica serovar typhimurium

PubMed Central

Amin, Shivam V.; Roberts, Justin T.; Patterson, Dillon G.; Coley, Alexander B.; Allred, Jonathan A.; Denner, Jason M.; Johnson, Justin P.; Mullen, Genevieve E.; O'Neal, Trenton K.; Smith, Jason T.; Cardin, Sara E.; Carr, Hank T.; Carr, Stacie L.; Cowart, Holly E.; DaCosta, David H.; Herring, Brendon R.; King, Valeria M.; Polska, Caroline J.; Ward, Erin E.; Wise, Alice A.; McAllister, Kathleen N.; Chevalier, David; Spector, Michael P.; Borchert, Glen M.

2016-01-01

ABSTRACT Small RNAs (sRNAs) are short (∼50–200 nucleotides) noncoding RNAs that regulate cellular activities across bacteria. Salmonella enterica starved of a carbon-energy (C) source experience a host of genetic and physiological changes broadly referred to as the starvation-stress response (SSR). In an attempt to identify novel sRNAs contributing to SSR control, we grew log-phase, 5-h C-starved and 24-h C-starved cultures of the virulent Salmonella enterica subspecies enterica serovar Typhimurium strain SL1344 and comprehensively sequenced their small RNA transcriptomes. Strikingly, after employing a novel strategy for sRNA discovery based on identifying dynamic transcripts arising from “gene-empty” regions, we identify 58 wholly undescribed Salmonella sRNA genes potentially regulating SSR averaging an ∼1,000-fold change in expression between log-phase and C-starved cells. Importantly, the expressions of individual sRNA loci were confirmed by both comprehensive transcriptome analyses and northern blotting of select candidates. Of note, we find 43 candidate sRNAs share significant sequence identity to characterized sRNAs in other bacteria, and ∼70% of our sRNAs likely assume characteristic sRNA structural conformations. In addition, we find 53 of our 58 candidate sRNAs either overlap neighboring mRNA loci or share significant sequence complementarity to mRNAs transcribed elsewhere in the SL1344 genome strongly suggesting they regulate the expression of transcripts via antisense base-pairing. Finally, in addition to this work resulting in the identification of 58 entirely novel Salmonella enterica genes likely participating in the SSR, we also find evidence suggesting that sRNAs are significantly more prevalent than currently appreciated and that Salmonella sRNAs may actually number in the thousands. PMID:26853797
A comprehensive phylogeny of auxin homeostasis genes involved in adventitious root formation in carnation stem cuttings.

PubMed

Sánchez-García, Ana Belén; Ibáñez, Sergio; Cano, Antonio; Acosta, Manuel; Pérez-Pérez, José Manuel

2018-01-01

Understanding the functional basis of auxin homeostasis requires knowledge about auxin biosynthesis, auxin transport and auxin catabolism genes, which is not always directly available despite the recent whole-genome sequencing of many plant species. Through sequence homology searches and phylogenetic analyses on a selection of 11 plant species with high-quality genome annotation, we identified the putative gene homologs involved in auxin biosynthesis, auxin catabolism and auxin transport pathways in carnation (Dianthus caryophyllus L.). To deepen our knowledge of the regulatory events underlying auxin-mediated adventitious root formation in carnation stem cuttings, we used RNA-sequencing data to confirm the expression profiles of some auxin homeostasis genes during the rooting of two carnation cultivars with different rooting behaviors. We also confirmed the presence of several auxin-related metabolites in the stem cutting tissues. Our findings offer a comprehensive overview of auxin homeostasis genes in carnation and provide a solid foundation for further experiments investigating the role of auxin homeostasis in the regulation of adventitious root formation in carnation.
A comprehensive phylogeny of auxin homeostasis genes involved in adventitious root formation in carnation stem cuttings

PubMed Central

Cano, Antonio; Acosta, Manuel

2018-01-01

Understanding the functional basis of auxin homeostasis requires knowledge about auxin biosynthesis, auxin transport and auxin catabolism genes, which is not always directly available despite the recent whole-genome sequencing of many plant species. Through sequence homology searches and phylogenetic analyses on a selection of 11 plant species with high-quality genome annotation, we identified the putative gene homologs involved in auxin biosynthesis, auxin catabolism and auxin transport pathways in carnation (Dianthus caryophyllus L.). To deepen our knowledge of the regulatory events underlying auxin-mediated adventitious root formation in carnation stem cuttings, we used RNA-sequencing data to confirm the expression profiles of some auxin homeostasis genes during the rooting of two carnation cultivars with different rooting behaviors. We also confirmed the presence of several auxin-related metabolites in the stem cutting tissues. Our findings offer a comprehensive overview of auxin homeostasis genes in carnation and provide a solid foundation for further experiments investigating the role of auxin homeostasis in the regulation of adventitious root formation in carnation. PMID:29709027
Sequencing and Characterization of the Invasive Sycamore Lace Bug Corythucha ciliata (Hemiptera: Tingidae) Transcriptome

PubMed Central

Qu, Cheng; Fu, Ningning; Xu, Yihua

2016-01-01

The sycamore lace bug, Corythucha ciliata (Hemiptera: Tingidae), is an invasive forestry pest rapidly expanding in many countries. This pest poses a considerable threat to the urban forestry ecosystem, especially to Platanus spp. However, its molecular biology and biochemistry are poorly understood. This study reports the first C. ciliata transcriptome, encompassing three different life stages (nymphs, adult females (AF) and adult males (AM)). In total, 26.53 GB of clean data and 60,879 unigenes were obtained from three RNA-seq libraries. These unigenes were annotated and classified by Nr (NCBI non-redundant protein sequences), Nt (NCBI non-redundant nucleotide sequences), Pfam (Protein family), KOG/COG (Clusters of Orthologous Groups of proteins), Swiss-Prot (a manually annotated and reviewed protein sequence database), and KO (KEGG Ortholog database). After all pairwise comparisons between these three different samples, a large number of differentially expressed genes were revealed. Dramatic differences in global gene expression profiles were found between distinct life stages (nymphs and AF, nymphs and AM) and between sexes (AF and AM), with some of the significantly differentially expressed genes (DEGs) being related to metamorphosis, digestion, immunity and sex differences. The differential expression of unigenes was validated through quantitative Real-Time PCR (qRT-PCR) for 16 randomly selected unigenes. In addition, 17,462 potential simple sequence repeat molecular markers were identified in these transcriptome resources. This comprehensive C. ciliata transcriptomic information can be utilized to promote the development of environmentally friendly methodologies to disrupt the processes of metamorphosis, digestion, immunity and sex differences. PMID:27494615
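Simple sequence repeat (SSR) markers of the kind counted above are tandem repeats of a short motif, and they can be located with a regular expression. The abstract does not state which tool or thresholds were used, so the minimum repeat counts and function names below are hypothetical.

```python
import re


def is_primitive(motif):
    """True when the motif is not itself a repeat of a shorter unit ('ATAT' is not)."""
    n = len(motif)
    return not any(n % d == 0 and motif[:d] * (n // d) == motif for d in range(1, n))


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    `min_repeats` maps motif length to the minimum number of tandem copies;
    the defaults are illustrative values, not those used in the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)
```

Start positions are 0-based. Filtering with `is_primitive` avoids reporting a dinucleotide run a second time as a tetranucleotide or hexanucleotide repeat.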
Getting a cue before getting a clue: Event-related potentials to inference in visual narrative comprehension

PubMed Central

Cohn, Neil; Kutas, Marta

2015-01-01

Inference has long been emphasized in the comprehension of verbal and visual narratives. Here, we measured event-related brain potentials to visual sequences designed to elicit inferential processing. In Impoverished sequences, an expressionless “onlooker” watches an undepicted event (e.g., person throws a ball for a dog, then watches the dog chase it) just prior to a surprising finale (e.g., someone else returns the ball), which should lead to an inference (i.e., the different person retrieved the ball). Implied sequences alter this narrative structure by adding visual cues to the critical panel such as a surprised facial expression to the onlooker implying they saw an unexpected, albeit undepicted, event. In contrast, Expected sequences show a predictable, but then confounded, event (i.e., dog retrieves ball, then different person returns it), and Explicit sequences depict the unexpected event (i.e., different person retrieves then returns ball). At the critical penultimate panel, sequences representing depicted events (Explicit, Expected) elicited a larger posterior positivity (P600) than the relatively passive events of an onlooker (Impoverished, Implied), though Implied sequences were slightly more positive than Impoverished sequences. At the subsequent and final panel, a posterior positivity (P600) was greater to images in Impoverished sequences than those in Explicit and Implied sequences, which did not differ. In addition, both sequence types requiring inference (Implied, Impoverished) elicited a larger frontal negativity than those explicitly depicting events (Expected, Explicit). These results show that neural processing differs for visual narratives omitting events versus those depicting events, and that the presence of subtle visual cues can modulate such effects presumably by altering narrative structure. PMID:26320706
Rice Phospholipase A Superfamily: Organization, Phylogenetic and Expression Analysis during Abiotic Stresses and Development

PubMed Central

Singh, Amarjeet; Baranwal, Vinay; Shankar, Alka; Kanwar, Poonam; Ranjan, Rajeev; Yadav, Sandeep; Pandey, Amita; Kapoor, Sanjay; Pandey, Girdhar K.

2012-01-01

Background Phospholipase A (PLA) is an important group of enzymes responsible for phospholipid hydrolysis in lipid signaling. PLAs have been implicated in abiotic stress signaling and developmental events in various plant species. Genome-wide analysis of the PLA superfamily has been carried out in the dicot plant Arabidopsis. A comprehensive genome-wide analysis of PLAs has not yet been presented for the crop plant rice. Methodology/Principal Findings A comprehensive bioinformatics analysis identified a total of 31 PLA encoding genes in the rice genome, which are divided into three classes: phospholipase A1 (PLA1), patatin like phospholipases (pPLA) and low molecular weight secretory phospholipase A2 (sPLA2), based on their sequences and phylogeny. A subset of 10 rice PLAs exhibited chromosomal duplication, emphasizing the role of duplication in the expansion of this gene family in rice. Microarray expression profiling revealed a number of PLA members expressing differentially and significantly under abiotic stresses and reproductive development. Comparative expression analysis with Arabidopsis PLAs revealed a high degree of functional conservation between the orthologs in the two plant species, which also indicated the vital role of PLAs in stress signaling and plant development across different plant species. Moreover, sub-cellular localization of a few candidates suggests their differential localization and functional role in lipid signaling. Conclusion/Significance The comprehensive analysis and expression profiling would provide a critical platform for the functional characterization of the candidate PLA genes in crop plants. PMID:22363522
Deciphering the transcriptional cis-regulatory code.

PubMed

Yáñez-Cuna, J Omar; Kvon, Evgeny Z; Stark, Alexander

2013-01-01

Information about developmental gene expression resides in defined regulatory elements, called enhancers, in the non-coding part of the genome. Although cells reliably utilize enhancers to orchestrate gene expression, a cis-regulatory code that would allow their interpretation has remained one of the greatest challenges of modern biology. In this review, we summarize studies from the past three decades that describe progress towards revealing the properties of enhancers and discuss how recent approaches are providing unprecedented insights into regulatory elements in animal genomes. Over the next years, we believe that the functional characterization of regulatory sequences in entire genomes, combined with recent computational methods, will provide a comprehensive view of genomic regulatory elements and their building blocks and will enable researchers to begin to understand the sequence basis of the cis-regulatory code. Copyright © 2012 Elsevier Ltd. All rights reserved.
Sample sequencing of vascular plants demonstrates widespread conservation and divergence of microRNAs.

PubMed

Chávez Montes, Ricardo A; de Fátima Rosas-Cárdenas, Flor; De Paoli, Emanuele; Accerbi, Monica; Rymarquis, Linda A; Mahalingam, Gayathri; Marsch-Martínez, Nayelli; Meyers, Blake C; Green, Pamela J; de Folter, Stefan

2014-04-23

Small RNAs are pivotal regulators of gene expression that guide transcriptional and post-transcriptional silencing mechanisms in eukaryotes, including plants. Here we report a comprehensive atlas of sRNA and miRNA from 3 species of algae and 31 representative species across vascular plants, including non-model plants. We sequence and quantify sRNAs from 99 different tissues or treatments across species, resulting in a data set of over 132 million distinct sequences. Using miRBase mature sequences as a reference, we identify the miRNA sequences present in these libraries. We apply diverse profiling methods to examine critical sRNA and miRNA features, such as size distribution, tissue-specific regulation and sequence conservation between species, as well as to predict putative new miRNA sequences. We also develop database resources, computational analysis tools and a dedicated website, http://smallrna.udel.edu/. This study provides new insights on plant sRNAs and miRNAs, and a foundation for future studies.
Genome wide comprehensive analysis and web resource development on cell wall degrading enzymes from phyto-parasitic nematodes.

PubMed

Rai, Krishan Mohan; Balasubramanian, Vimal Kumar; Welker, Cassie Marie; Pang, Mingxiong; Hii, Mei Mei; Mendu, Venugopal

2015-08-01

The plant cell wall serves as a primary barrier against pathogen invasion. The success of a plant pathogen largely depends on its ability to overcome this barrier. During the infection process, plant parasitic nematodes secrete cell wall degrading enzymes (CWDEs) apart from piercing with their stylet, a sharp and hard mouthpart used for successful infection. CWDEs typically consist of cellulases, hemicellulases, and pectinases, which help the nematode to infect and establish the feeding structure or form a cyst. The study of nematode cell wall degrading enzymes not only enhances our understanding of the interaction between nematodes and their host, but also provides information on a novel source of enzymes for their potential use in biomass based biofuel/bioproduct industries. Although comprehensive information is available on genome wide analysis of CWDEs for bacteria, fungi, termites and plants, no comprehensive information is available for plant pathogenic nematodes. Herein we have performed a genome wide analysis of CWDEs from the genome sequenced phytopathogenic nematode species and developed a comprehensive publicly available database. In the present study, we have performed a genome wide analysis for the presence of CWDEs from five plant parasitic nematode species with fully sequenced genomes covering three genera viz. Bursaphelenchus, Globodera and Meloidogyne. Using the Hidden Markov Models (HMM) conserved domain profiles of the respective gene families, we have identified 530 genes encoding CWDEs that are distributed among 24 gene families of glycoside hydrolases (412) and polysaccharide lyases (118). Furthermore, expression profiles of these genes were analyzed across the life cycle of a potato cyst nematode. Most genes were found to have moderate to high expression from early to late infectious stages, while some clusters were invasion stage specific, indicating the role of these enzymes in the nematode's infection and establishment process. Additionally, we have developed a Nematode's Plant Cell Wall Degrading Enzyme (NCWDE) database as a platform to provide a comprehensive outcome of the present study. Our study provides collective information about different families of CWDEs from five different sequenced plant pathogenic nematode species. The outcomes of this study will help in developing better strategies to curtail the nematode infection, as well as help in identification of novel cell wall degrading enzymes for biofuel/bioproduct industries.
Next generation sequencing of extraskeletal myxoid chondrosarcoma.

PubMed

Davis, Elizabeth J; Wu, Yi-Mi; Robinson, Dan; Schuetze, Scott M; Baker, Laurence H; Athanikar, Jyoti; Cao, Xuhong; Kunju, Lakshmi P; Chinnaiyan, Arul M; Chugh, Rashmi

2017-03-28

Extraskeletal myxoid chondrosarcoma (EMC) is an indolent translocation-associated soft tissue sarcoma with a high propensity for metastases. Using a clinical sequencing approach, we genomically profiled patients with metastatic EMC to elucidate the molecular biology and identify potentially actionable mutations. We also evaluated potential predictive factors of benefit to sunitinib, a multi-targeted tyrosine kinase inhibitor with reported activity in a subset of EMC patients. Between January 31, 2012 and April 15, 2016, six patients with EMC participated in the clinical sequencing research study. High quality DNA and RNA were isolated, and matched normal samples underwent comprehensive next generation sequencing (whole or OncoSeq capture exome of tumor and normal, tumor PolyA+ and capture transcriptome). The expression levels of sunitinib-targeted kinases were measured by transcriptome sequencing for KDR, PDGFRA/B, KIT, RET, FLT1, and FLT4. The previously reported EWSR1-NR4A3 translocation was identified in all patient tumors; however, other recurring genomic abnormalities were not detected. RET expression was significantly greater in patients with EMC relative to other types of sarcomas except for liposarcoma (p<0.0002). The folate receptor was overexpressed in two patients. Our study demonstrated that, similar to other translocation-associated sarcomas, the mutational profile of metastatic EMC is limited beyond the pathognomonic translocation. The clinical significance of RET expression in EMC should be explored. Additional pre-clinical investigations of EMC may help elucidate molecular mechanisms contributing to EMC tumorigenesis that could be translated to the clinical setting.
MAGIC database and interfaces: an integrated package for gene discovery and expression.

PubMed

Cordonnier-Pratt, Marie-Michèle; Liang, Chun; Wang, Haiming; Kolychev, Dmitri S; Sun, Feng; Freeman, Robert; Sullivan, Robert; Pratt, Lee H

2004-01-01

The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC) Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs), and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.
Comprehensive red blood cell and platelet antigen prediction from whole genome sequencing: proof of principle

PubMed Central

Westhoff, Connie M.; Uy, Jon Michael; Aguad, Maria; Smeland‐Wagman, Robin; Kaufman, Richard M.; Rehm, Heidi L.; Green, Robert C.; Silberstein, Leslie E.

2015-01-01

BACKGROUND There are 346 serologically defined red blood cell (RBC) antigens and 33 serologically defined platelet (PLT) antigens, most of which have known genetic changes in 45 RBC or six PLT genes that correlate with antigen expression. Polymorphic sites associated with antigen expression in the primary literature and reference databases are annotated according to nucleotide positions in cDNA. This makes antigen prediction from next‐generation sequencing data challenging, since it uses genomic coordinates. STUDY DESIGN AND METHODS The conventional cDNA reference sequences for all known RBC and PLT genes that correlate with antigen expression were aligned to the human reference genome. The alignments allowed conversion of conventional cDNA nucleotide positions to the corresponding genomic coordinates. RBC and PLT antigen prediction was then performed using the human reference genome and whole genome sequencing (WGS) data with serologic confirmation. RESULTS Some major differences and alignment issues were found when attempting to convert the conventional cDNA to human reference genome sequences for the following genes: ABO, A4GALT, RHD, RHCE, FUT3, ACKR1 (previously DARC), ACHE, FUT2, CR1, GCNT2, and RHAG. However, it was possible to create usable alignments, which facilitated the prediction of all RBC and PLT antigens with a known molecular basis from WGS data. Traditional serologic typing for 18 RBC antigens was in agreement with the WGS‐based antigen predictions, providing proof of principle for this approach. CONCLUSION Detailed mapping of conventional cDNA annotated RBC and PLT alleles can enable accurate prediction of RBC and PLT antigens from whole genome sequencing data. PMID:26634332
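The conversion from cDNA nucleotide positions to genomic coordinates can be sketched as a walk over the exon blocks of an alignment. This is a simplified illustration, not the authors' procedure: it assumes gap-free exon alignments, counts positions from the first base of the aligned transcript rather than from the start codon, and uses hypothetical names throughout.

```python
def cdna_to_genomic(cdna_pos, exons, strand="+"):
    """Map a 1-based position in a spliced cDNA to a 1-based genomic coordinate.

    `exons` lists (start, end) genomic intervals, 1-based and inclusive, in
    transcript order (so for a minus-strand gene the exon with the highest
    coordinates comes first). Alignment gaps and mismatches are ignored.
    """
    if cdna_pos < 1:
        raise ValueError("cDNA positions are 1-based")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    remaining = cdna_pos
    for start, end in exons:
        length = end - start + 1
        if remaining <= length:
            # Plus strand counts up from the exon start; minus counts down from its end.
            return start + remaining - 1 if strand == "+" else end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the transcript")
```

With exons at 100-109 and 200-204 on the plus strand, cDNA position 11 falls on the first base of the second exon and maps to genomic coordinate 200. Genes with the alignment issues listed in the abstract would need per-gene handling that this sketch does not attempt.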
MicroRNA Expression-Based Model Indicates Event-Free Survival in Pediatric Acute Myeloid Leukemia

PubMed Central

Lim, Emilia L.; Trinh, Diane L.; Ries, Rhonda E.; Wang, Jim; Gerbing, Robert B.; Ma, Yussanne; Topham, James; Hughes, Maya; Pleasance, Erin; Mungall, Andrew J.; Moore, Richard; Zhao, Yongjun; Aplenc, Richard; Sung, Lillian; Kolb, E. Anders; Gamis, Alan; Smith, Malcolm; Gerhard, Daniela S.; Alonzo, Todd A.; Meshinchi, Soheil; Marra, Marco A.

2017-01-01

Purpose Children with acute myeloid leukemia (AML) whose disease is refractory to standard induction chemotherapy or who experience relapse after initial response have dismal outcomes. We sought to comprehensively profile pediatric AML microRNA (miRNA) samples to identify dysregulated genes and assess the utility of miRNAs for improved outcome prediction. Patients and Methods To identify miRNA biomarkers that are associated with treatment failure, we performed a comprehensive sequence-based characterization of the pediatric AML miRNA landscape. miRNA sequencing was performed on 1,362 samples: 1,303 primary, 22 refractory, and 37 relapse samples. One hundred sixty-four matched samples (127 primary and 37 relapse samples) were analyzed by using RNA sequencing. Results By using penalized lasso Cox proportional hazards regression, we identified 36 miRNAs whose expression levels at diagnosis were highly associated with event-free survival. Combined expression of the 36 miRNAs was used to create a novel miRNA-based risk classification scheme (AMLmiR36). This new miRNA-based risk classifier identifies those patients who are at high risk (hazard ratio, 2.830; P ≤ .001) or low risk (hazard ratio, 0.323; P ≤ .001) of experiencing treatment failure, independent of conventional karyotype or mutation status. The performance of AMLmiR36 was independently assessed by using 878 patients from two different clinical trials (AAML0531 and AAML1031). Our analysis also revealed that miR-106a-363 was abundantly expressed in relapse and refractory samples, and several candidate targets of miR-106a-5p were involved in oxidative phosphorylation, a process that is suppressed in treatment-resistant leukemic cells. Conclusion To assess the utility of miRNAs for outcome prediction in patients with pediatric AML, we designed and validated a miRNA-based risk classification scheme. 
We also hypothesized that the abundant expression of miR-106a could increase treatment resistance via modulation of genes that are involved in oxidative phosphorylation. PMID:29068783
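The abstract does not say how the 36 expression values are combined into AMLmiR36. One common way to turn a fitted Cox model into a risk classifier is to compute the linear predictor (a coefficient-weighted sum of expression values) and bin it with cutoffs. The sketch below shows that generic approach only; the miRNA names, coefficients and cutoffs are hypothetical placeholders, not values from the study.

```python
from typing import Dict


def cox_risk_score(expression: Dict[str, float], coefficients: Dict[str, float]) -> float:
    """Linear predictor of a Cox model: sum of coefficient * expression.

    Only miRNAs with a nonzero lasso coefficient need to be listed in
    `coefficients`; a miRNA missing from `expression` raises KeyError.
    """
    return sum(beta * expression[name] for name, beta in coefficients.items())


def risk_group(score: float, low_cutoff: float, high_cutoff: float) -> str:
    """Assign a risk label from two cutoffs (hypothetical thresholds)."""
    if score >= high_cutoff:
        return "high"
    if score <= low_cutoff:
        return "low"
    return "intermediate"


if __name__ == "__main__":
    # Placeholder coefficients for two made-up miRNAs, for illustration only.
    betas = {"mirA": 0.5, "mirB": -0.25}
    patient = {"mirA": 6.0, "mirB": 4.0}
    score = cox_risk_score(patient, betas)
    print(score, risk_group(score, low_cutoff=-1.0, high_cutoff=1.0))
```

In practice the coefficients would come from the penalized regression fit and the cutoffs from the training cohort, with performance checked on independent patients as the abstract describes.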
Identification of Transposable Elements Contributing to Tissue-Specific Expression of Long Non-Coding RNAs

PubMed Central

Chishima, Takafumi; Iwakiri, Junichi

2018-01-01

It has been recently suggested that transposable elements (TEs) are re-used as functional elements of long non-coding RNAs (lncRNAs). This is supported by some examples such as the human endogenous retrovirus subfamily H (HERVH) elements contained within lncRNAs and expressed specifically in human embryonic stem cells (hESCs), as required to maintain hESC identity. There are at least two unanswered questions about all lncRNAs. How many TEs are re-used within lncRNAs? Are there any other TEs that affect tissue specificity of lncRNA expression? To answer these questions, we comprehensively identify TEs that are significantly related to tissue-specific expression levels of lncRNAs. We downloaded lncRNA expression data corresponding to normal human tissue from the Expression Atlas and transformed the data into tissue specificity estimates. Then, Fisher’s exact tests were performed to verify whether the presence or absence of TE-derived sequences influences the tissue specificity of lncRNA expression. Many TE–tissue pairs associated with tissue-specific expression of lncRNAs were detected, indicating that multiple TE families can be re-used as functional domains or regulatory sequences of lncRNAs. In particular, we found that the antisense promoter region of L1PA2, a LINE-1 subfamily, appears to act as a promoter for lncRNAs with placenta-specific expression. PMID:29315213
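As a rough illustration of the Fisher's exact test step, the sketch below computes a two-sided p-value for a single TE–tissue pair from a 2×2 table using only the standard library. The table layout (lncRNAs split by presence of a TE-derived sequence and by whether they are specific to the tissue) is an assumption made for illustration; the abstract does not give the authors' exact contingency definition or specificity threshold.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout for this sketch:
        a = tissue-specific lncRNAs containing the TE
        b = tissue-specific lncRNAs without the TE
        c = other lncRNAs containing the TE
        d = other lncRNAs without the TE
    The p-value sums the hypergeometric probabilities of all tables with
    the same margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts for one TE family in one tissue.
    print(fisher_exact_two_sided(8, 2, 1, 5))
```

Because many TE–tissue pairs are tested, a multiple-testing correction would normally follow; the abstract does not state which one was used.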
Understanding development and stem cells using single cell-based analyses of gene expression.

PubMed

Kumar, Pavithra; Tan, Yuqi; Cahan, Patrick

2017-01-01

In recent years, genome-wide profiling approaches have begun to uncover the molecular programs that drive developmental processes. In particular, technical advances that enable genome-wide profiling of thousands of individual cells have provided the tantalizing prospect of cataloging cell type diversity and developmental dynamics in a quantitative and comprehensive manner. Here, we review how single-cell RNA sequencing has provided key insights into mammalian developmental and stem cell biology, emphasizing the analytical approaches that are specific to studying gene expression in single cells. © 2017. Published by The Company of Biologists Ltd.
The genome sequence of Sea-Island cotton (Gossypium barbadense) provides insights into the allopolyploidization and development of superior spinnable fibres

PubMed Central

Yuan, Daojun; Tang, Zhonghui; Wang, Maojun; Gao, Wenhui; Tu, Lili; Jin, Xin; Chen, Lingling; He, Yonghui; Zhang, Lin; Zhu, Longfu; Li, Yang; Liang, Qiqi; Lin, Zhongxu; Yang, Xiyan; Liu, Nian; Jin, Shuangxia; Lei, Yang; Ding, Yuanhao; Li, Guoliang; Ruan, Xiaoan; Ruan, Yijun; Zhang, Xianlong

2015-01-01

Gossypium hirsutum accounts for most cotton fibre production, but G. barbadense is valued for its better comprehensive resistance and superior fibre properties. However, the allotetraploid genome of G. barbadense has not been comprehensively analysed. Here we present a high-quality assembly of the 2.57 gigabase genome of G. barbadense, including 80,876 protein-coding genes. The nearly doubled size of the A (or At) subgenome (1.50 Gb) relative to the D (or Dt) subgenome (853 Mb) primarily resulted from the expansion of Gypsy elements, including Peabody and Retrosat2 subclades in the Del clade, and the Athila subclade in the Athila/Tat clade. Substantial gene expansion and contraction were observed and rich homoeologous gene pairs with biased expression patterns were identified, suggesting that abundant gene sub-functionalization occurred through allopolyploidization. More specifically, the CesA gene family has adopted differential temporal expression patterns, suggesting an integrated regulatory mechanism of CesA genes from At and Dt subgenomes for the primary and secondary cellulose biosynthesis of cotton fibre in a “relay race”-like fashion. We anticipate that the G. barbadense genome sequence will advance our understanding of the mechanism of genome polyploidization and underpin genome-wide comparison research in this genus. PMID:26634818
Genome-wide discovery and differential regulation of conserved and novel microRNAs in chickpea via deep sequencing.

PubMed

Jain, Mukesh; Chevala, V V S Narayana; Garg, Rohini

2014-11-01

MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Genome-wide discovery of novel and conserved microRNAs in white shrimp (Litopenaeus vannamei).

PubMed

Xi, Qian-Yun; Xiong, Yuan-Yan; Wang, Yuan-Mei; Cheng, Xiao; Qi, Qi-En; Shu, Gang; Wang, Song-Bo; Wang, Li-Na; Gao, Ping; Zhu, Xiao-Tong; Jiang, Qing-Yan; Zhang, Yong-Liang; Liu, Li

2015-01-01

In recent years, many conserved and species-specific microRNAs (miRNAs) have been identified in species that are economically important but lack a full genome sequence. In this study, Solexa deep sequencing and cross-species miRNA microarray were used to detect miRNAs in white shrimp. We identified 239 conserved miRNAs, 14 miRNA* sequences and 20 novel miRNAs by bioinformatics analysis from 7,561,406 high-quality reads representing 325,370 distinct sequences. All 20 novel miRNAs were specific to white shrimp, with no homologs in other species. Using the conserved miRNAs from the miRBase database as a query set to search for homologs among shrimp expressed sequence tags (ESTs), 32 conserved, computationally predicted miRNAs were discovered in shrimp. In addition, using microarray analysis in shrimp fed with Panax ginseng polysaccharide complex, 151 conserved miRNAs were identified, 18 of which were significantly up-regulated and 49 significantly down-regulated. qRT-PCR analysis was also performed for nine miRNAs in three shrimp tissues: muscle, gill and hepatopancreas. Results showed that the expression of these miRNAs is tissue specific. Combining the results of the three methods, we detected 20 novel and 394 conserved miRNAs. Verification with quantitative reverse transcription PCR (qRT-PCR) and Northern blot showed that the data were highly reliable. The study provides the first comprehensive specific miRNA profile of white shrimp, which includes useful information for future investigations into the function of miRNAs in regulation of shrimp development and immunology.

De novo sequencing and analysis of the cranberry fruit transcriptome to identify putative genes involved in flavonoid biosynthesis, transport and regulation.

PubMed

Sun, Haiyue; Liu, Yushan; Gai, Yuzhuo; Geng, Jinman; Chen, Li; Liu, Hongdi; Kang, Limin; Tian, Youwen; Li, Yadong

2015-09-02

Cranberries (Vaccinium macrocarpon Ait.), renowned for their excellent health benefits, are an important berry crop. Here, we performed transcriptome sequencing of one cranberry cultivar, from fruits at two different developmental stages, on the Illumina HiSeq 2000 platform. Our main goals were to identify putative genes for major metabolic pathways of bioactive compounds and compare the expression patterns between white fruit (W) and red fruit (R) in cranberry. In this study, two cDNA libraries of W and R were constructed. Approximately 119 million raw sequencing reads were generated and assembled de novo, yielding 57,331 high quality unigenes with an average length of 739 bp. Using BLASTx, 38,460 unigenes were identified as putative homologs of annotated sequences in public protein databases, including NCBI NR, NT, Swiss-Prot, KEGG, COG and GO. Of these, 21,898 unigenes mapped to 128 KEGG pathways, with the metabolic pathways, secondary metabolites, glycerophospholipid metabolism, ether lipid metabolism, starch and sucrose metabolism, purine metabolism, and pyrimidine metabolism being well represented. Among them, many candidate genes were involved in flavonoid biosynthesis, transport and regulation. Furthermore, digital gene expression (DEG) analysis identified 3,257 unigenes that were differentially expressed between the two fruit developmental stages. In addition, 14,473 simple sequence repeats (SSRs) were detected. Our results present comprehensive gene expression information about the cranberry fruit transcriptome that could facilitate our understanding of the molecular mechanisms of fruit development in cranberries. Although it will be necessary to validate the functions carried out by these genes, these results could be used to improve the quality of breeding programs for the cranberry and related species.
Multiple Solutions to the Same Problem: Utilization of Plausibility and Syntax in Sentence Comprehension by Older Adults with Impaired Hearing.

PubMed

Amichetti, Nicole M; White, Alison G; Wingfield, Arthur

2016-01-01

A fundamental question in psycholinguistic theory is whether equivalent success in sentence comprehension may come about by different underlying operations. Of special interest is whether adult aging, especially when accompanied by reduced hearing acuity, may shift the balance of reliance on formal syntax vs. plausibility in determining sentence meaning. In two experiments participants were asked to identify the thematic roles in grammatical sentences that contained either plausible or implausible semantic relations. Comprehension of sentence meanings was indexed by the ability to correctly name the agent or the recipient of an action represented in the sentence. In Experiment 1 young and older adults' comprehension was tested for plausible and implausible sentences with the meaning expressed with either an active-declarative or a passive syntactic form. In Experiment 2 comprehension performance was examined for young adults with age-normal hearing, older adults with good hearing acuity, and age-matched older adults with mild-to-moderate hearing loss for plausible or implausible sentences with meaning expressed with either a subject-relative (SR) or an object-relative (OR) syntactic structure. Experiment 1 showed that the likelihood of interpreting a sentence according to its literal meaning was reduced when that meaning expressed an implausible relationship. Experiment 2 showed that this likelihood was further decreased for OR as compared to SR sentences, and especially so for older adults whose hearing impairment added to the perceptual challenge. Experiment 2 also showed that working memory capacity as measured with a letter-number sequencing task contributed to the likelihood that listeners would base their comprehension responses on the literal syntax even when this processing scheme yielded an implausible meaning. 
Taken together, the results of both experiments support the postulate that listeners may use more than a single uniform processing strategy for successful sentence comprehension, with the existence of these alternative solutions only revealed when literal syntax and plausibility do not coincide.
Multiple Solutions to the Same Problem: Utilization of Plausibility and Syntax in Sentence Comprehension by Older Adults with Impaired Hearing

PubMed Central

Amichetti, Nicole M.; White, Alison G.; Wingfield, Arthur

2016-01-01

A fundamental question in psycholinguistic theory is whether equivalent success in sentence comprehension may come about by different underlying operations. Of special interest is whether adult aging, especially when accompanied by reduced hearing acuity, may shift the balance of reliance on formal syntax vs. plausibility in determining sentence meaning. In two experiments participants were asked to identify the thematic roles in grammatical sentences that contained either plausible or implausible semantic relations. Comprehension of sentence meanings was indexed by the ability to correctly name the agent or the recipient of an action represented in the sentence. In Experiment 1 young and older adults’ comprehension was tested for plausible and implausible sentences with the meaning expressed with either an active-declarative or a passive syntactic form. In Experiment 2 comprehension performance was examined for young adults with age-normal hearing, older adults with good hearing acuity, and age-matched older adults with mild-to-moderate hearing loss for plausible or implausible sentences with meaning expressed with either a subject-relative (SR) or an object-relative (OR) syntactic structure. Experiment 1 showed that the likelihood of interpreting a sentence according to its literal meaning was reduced when that meaning expressed an implausible relationship. Experiment 2 showed that this likelihood was further decreased for OR as compared to SR sentences, and especially so for older adults whose hearing impairment added to the perceptual challenge. Experiment 2 also showed that working memory capacity as measured with a letter-number sequencing task contributed to the likelihood that listeners would base their comprehension responses on the literal syntax even when this processing scheme yielded an implausible meaning. 
Taken together, the results of both experiments support the postulate that listeners may use more than a single uniform processing strategy for successful sentence comprehension, with the existence of these alternative solutions only revealed when literal syntax and plausibility do not coincide. PMID:27303346
RISC RNA sequencing for context-specific identification of in vivo miR targets

PubMed Central

Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W

2010-01-01

Rationale MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. Objective To develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). Methods and Results We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias, and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1,645 mRNAs consistently targeted to mouse cardiac RISCs. We employed this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing ‘seed’ sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. Conclusions RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context, and is applicable to any tissue and any disease state. Summary MicroRNAs (miRs) are key regulators of mRNA translation in health and disease. While bioinformatic predictions suggest that a single miR may target hundreds of mRNAs, the number of experimentally verified targets of miRs is low. 
To enable comprehensive, unbiased examination of miR targets, we have performed deep RNA sequencing of cardiac transcriptomes in parallel with cardiac RNA-induced silencing complex (RISC)-associated RNAs (the RISCome), called RISC sequencing. We developed methods that did not require cross-linking of RNAs to RISCs or amplification of mRNA prior to sequencing, making it possible to rapidly perform RISC sequencing from intact tissue while avoiding amplification bias. Comparison of RISCome with transcriptome expression defined the degree of RISC enrichment for each mRNA. The majority of the mRNAs enriched in wild-type cardiac RISComes compared to transcriptomes were bioinformatically predicted to be targets of at least 1 of 139 cardiac-expressed miRs. Programming cardiomyocyte RISCs via transgenic overexpression in adult hearts of miR-133a or miR-499, two miRs that contain entirely different ‘seed’ sequences, elicited differing profiles of RISC-targeted mRNAs. Thus, RISC sequencing represents a highly sensitive method for general RISC profiling and individual miR target identification in biological context. PMID:21030712
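The per-mRNA RISC enrichment mentioned in the summary can be pictured as a ratio of RISC-associated abundance to whole-transcriptome abundance. The sketch below expresses that idea as a log2 ratio. It assumes both inputs are already normalized to comparable units, and the pseudocount, gene names and function name are illustrative choices, not details taken from the paper.

```python
from math import log2
from typing import Dict


def risc_enrichment(
    riscome: Dict[str, float],
    transcriptome: Dict[str, float],
    pseudocount: float = 1.0,
) -> Dict[str, float]:
    """log2 ratio of RISC-associated to total abundance for each mRNA.

    Positive values mean the mRNA is over-represented in the RISC
    immunoprecipitate relative to its overall expression. The pseudocount
    (an assumption of this sketch) avoids division by zero.
    """
    return {
        gene: log2((riscome.get(gene, 0.0) + pseudocount) / (level + pseudocount))
        for gene, level in transcriptome.items()
    }


if __name__ == "__main__":
    # Hypothetical normalized abundances for three mRNAs.
    risc = {"geneA": 7.0, "geneB": 1.0}
    total = {"geneA": 1.0, "geneB": 3.0, "geneC": 0.0}
    for gene, value in sorted(risc_enrichment(risc, total).items()):
        print(gene, value)
```

Comparing such enrichment profiles between wild-type and miR-overexpressing hearts is the kind of contrast the abstract calls programmed RISC-Seq.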
Uncovering microRNA-mediated response to SO2 stress in Arabidopsis thaliana by deep sequencing.

PubMed

Li, Lihong; Xue, Meizhao; Yi, Huilan

2016-10-05

Sulfur dioxide (SO2) is a major air pollutant and has significant impacts on plants. MicroRNAs (miRNAs) are a class of gene expression regulators that play important roles in response to environmental stresses. In this study, deep sequencing was used for genome-wide identification of miRNAs and their expression profiles in response to SO2 stress in Arabidopsis thaliana shoots. A total of 27 conserved miRNAs and 5 novel miRNAs were found to be differentially expressed under SO2 stress. qRT-PCR analysis showed mostly negative correlation between miRNA accumulation and target gene mRNA abundance, suggesting regulatory roles of these miRNAs during SO2 exposure. The target genes of SO2-responsive miRNAs encode transcription factors and proteins that regulate auxin signaling and stress response, and the miRNA-mediated suppression of these genes could improve plant resistance to SO2 stress. Promoter sequence analysis of genes encoding SO2-responsive miRNAs showed that stress-responsive and phytohormone-related cis-regulatory elements occurred frequently, providing additional evidence of the involvement of miRNAs in adaptation to SO2 stress. This study represents a comprehensive expression profiling of SO2-responsive miRNAs in Arabidopsis and broadens our perspective on the ubiquitous regulatory roles of miRNAs under stress conditions. Copyright © 2016 Elsevier B.V. All rights reserved.
A comprehensive simulation study on classification of RNA-Seq data.

PubMed

Zararsız, Gökmen; Goksuluk, Dincer; Korkmaz, Selcuk; Eldem, Vahap; Zararsiz, Gozde Erturk; Duru, Izzet Parug; Ozturk, Ahmet

2017-01-01

RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis (PLDA) and negative binomial linear discriminant analysis (NBLDA). Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size and differential-expression rate and decreasing the dispersion parameter and number of groups lead to an increase in classification accuracy. As in differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. 
We conclude that, as a count-based classifier, the power transformed PLDA and, as a microarray-based classifier, vst or rlog transformed RF and SVM classifiers may be a good choice for classification. An R/BIOCONDUCTOR package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html.
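To make the idea of a count-based classifier concrete, here is a deliberately simplified sketch: each class gets a per-gene Poisson mean estimated from training counts, and a new sample is assigned to the class with the highest Poisson log-likelihood plus log prior. This captures the spirit of Poisson discriminant analysis but omits the size-factor normalization, shrinkage and power transformation discussed in the abstract, and it is not the MLSeq implementation. All function names and the smoothing constant are assumptions.

```python
from math import log
from typing import Dict, Hashable, List, Sequence, Tuple

Model = Dict[Hashable, Tuple[float, List[float]]]


def fit_poisson_classifier(
    samples: Sequence[Sequence[int]],
    labels: Sequence[Hashable],
    smoothing: float = 0.5,
) -> Model:
    """Estimate a log prior and per-gene Poisson means for each class.

    `smoothing` is added to each class total so that no mean is zero
    (an assumption of this sketch, needed to keep log(mean) finite).
    """
    n_genes = len(samples[0])
    model: Model = {}
    for k in set(labels):
        rows = [s for s, y in zip(samples, labels) if y == k]
        means = [
            (sum(row[j] for row in rows) + smoothing) / len(rows)
            for j in range(n_genes)
        ]
        model[k] = (log(len(rows) / len(samples)), means)
    return model


def predict(model: Model, counts: Sequence[int]) -> Hashable:
    """Return the class maximizing the Poisson log-likelihood plus log prior."""

    def score(k: Hashable) -> float:
        log_prior, means = model[k]
        # The log(x!) term is identical for every class, so it is dropped.
        return log_prior + sum(x * log(m) - m for x, m in zip(counts, means))

    return max(model, key=score)


if __name__ == "__main__":
    # Hypothetical two-gene counts for two classes.
    train = [[10, 1], [12, 2], [1, 10], [2, 12]]
    classes = ["tumor", "tumor", "normal", "normal"]
    fitted = fit_poisson_classifier(train, classes)
    print(predict(fitted, [11, 1]))
```

Overdispersed data, which the abstract singles out as needing care, violate the Poisson assumption used here; that is the motivation for the negative binomial and transformation-based alternatives the study compares.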
MEPD: a Medaka gene expression pattern database

PubMed Central

Henrich, Thorsten; Ramialison, Mirana; Quiring, Rebecca; Wittbrodt, Beate; Furutani-Seiki, Makoto; Wittbrodt, Joachim; Kondoh, Hisato

2003-01-01

The Medaka Expression Pattern Database (MEPD) stores and integrates information on gene expression during embryonic development of the small freshwater fish Medaka (Oryzias latipes). Expression patterns of genes identified by ESTs are documented by images and by descriptions through parameters such as staining intensity, category and comments and through a comprehensive, hierarchically organized dictionary of anatomical terms. Sequences of the ESTs are available and searchable through BLAST. ESTs in the database are clustered upon entry and have been searched with BLAST against public databases. The BLAST results are updated regularly, stored within the database and searchable. The MEPD is a project within the Medaka Genome Initiative (MGI) and entries will be interconnected to integrated genomic map databases. MEPD is accessible through the WWW at http://medaka.dsp.jst.go.jp/MEPD. PMID:12519950
Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity

NASA Astrophysics Data System (ADS)

Corcoran, Martin M.; Phad, Ganesh E.; Bernat, Néstor Vázquez; Stahl-Hennig, Christiane; Sumida, Noriyuki; Persson, Mats A. A.; Martin, Marcel; Hedestam, Gunilla B. Karlsson

2016-12-01

Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, result in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross library comparisons and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of the Indian and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between Balb/c and C57BL6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool.
Genome-wide compendium and functional assessment of in vivo heart enhancers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen

Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of > 35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of > 80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function.
Genetic and Functional Drivers of Diffuse Large B Cell Lymphoma.

PubMed

Reddy, Anupama; Zhang, Jenny; Davis, Nicholas S; Moffitt, Andrea B; Love, Cassandra L; Waldrop, Alexander; Leppa, Sirpa; Pasanen, Annika; Meriranta, Leo; Karjalainen-Lindsberg, Marja-Liisa; Nørgaard, Peter; Pedersen, Mette; Gang, Anne O; Høgdall, Estrid; Heavican, Tayla B; Lone, Waseem; Iqbal, Javeed; Qin, Qiu; Li, Guojie; Kim, So Young; Healy, Jane; Richards, Kristy L; Fedoriw, Yuri; Bernal-Mizrachi, Leon; Koff, Jean L; Staton, Ashley D; Flowers, Christopher R; Paltiel, Ora; Goldschmidt, Neta; Calaminici, Maria; Clear, Andrew; Gribben, John; Nguyen, Evelyn; Czader, Magdalena B; Ondrejka, Sarah L; Collie, Angela; Hsi, Eric D; Tse, Eric; Au-Yeung, Rex K H; Kwong, Yok-Lam; Srivastava, Gopesh; Choi, William W L; Evens, Andrew M; Pilichowska, Monika; Sengar, Manju; Reddy, Nishitha; Li, Shaoying; Chadburn, Amy; Gordon, Leo I; Jaffe, Elaine S; Levy, Shawn; Rempel, Rachel; Tzeng, Tiffany; Happ, Lanie E; Dave, Tushar; Rajagopalan, Deepthi; Datta, Jyotishka; Dunson, David B; Dave, Sandeep S

2017-10-05

Diffuse large B cell lymphoma (DLBCL) is the most common form of blood cancer and is characterized by a striking degree of genetic and clinical heterogeneity. This heterogeneity poses a major barrier to understanding the genetic basis of the disease and its response to therapy. Here, we performed an integrative analysis of whole-exome sequencing and transcriptome sequencing in a cohort of 1,001 DLBCL patients to comprehensively define the landscape of 150 genetic drivers of the disease. We characterized the functional impact of these genes using an unbiased CRISPR screen of DLBCL cell lines to define oncogenes that promote cell growth. A prognostic model comprising these genetic alterations outperformed current established methods: cell of origin, the International Prognostic Index comprising clinical variables, and dual MYC and BCL2 expression. These results comprehensively define the genetic drivers and their functional roles in DLBCL to identify new therapeutic opportunities in the disease. Copyright © 2017 Elsevier Inc. All rights reserved.
Genome-wide compendium and functional assessment of in vivo heart enhancers

DOE PAGES

Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen; ...

2016-10-05

Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of > 35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of > 80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function.
PRAPI: post-transcriptional regulation analysis pipeline for Iso-Seq.

PubMed

Gao, Yubang; Wang, Huiyuan; Zhang, Hangxiao; Wang, Yongsheng; Chen, Jinfeng; Gu, Lianfeng

2018-05-01

The single-molecule real-time (SMRT) isoform sequencing (Iso-Seq) based on the Pacific Biosciences (PacBio) platform has received increasing attention for its ability to explore full-length isoforms. Thus, comprehensive tools for Iso-Seq bioinformatics analysis are extremely useful. Here, we present a one-stop solution for Iso-Seq analysis, called PRAPI, to analyze alternative transcription initiation (ATI), alternative splicing (AS), alternative cleavage and polyadenylation (APA), natural antisense transcripts (NAT), and circular RNAs (circRNAs) comprehensively. PRAPI is capable of combining Iso-Seq full-length isoforms with short read data, such as RNA-Seq or polyadenylation site sequencing (PAS-seq), for differential expression analysis of NAT, AS, APA and circRNAs. Furthermore, PRAPI can annotate new genes and correct mis-annotated genes when gene annotation is available. Finally, PRAPI generates high-quality vector graphics to visualize and highlight the Iso-Seq results. The Dockerfile of PRAPI is available at http://www.bioinfor.org/tool/PRAPI.
Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity

PubMed Central

Corcoran, Martin M.; Phad, Ganesh E.; Bernat, Néstor Vázquez; Stahl-Hennig, Christiane; Sumida, Noriyuki; Persson, Mats A.A.; Martin, Marcel; Hedestam, Gunilla B. Karlsson

2016-01-01

Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, result in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross-library comparisons, and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of the Indian and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between Balb/c and C57BL6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool. PMID:27995928
Genome-wide compendium and functional assessment of in vivo heart enhancers

PubMed Central

Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen; Fukuda-Yuzawa, Yoko; Osterwalder, Marco; Mannion, Brandon J.; May, Dalit; Spurrell, Cailyn H.; Plajzer-Frick, Ingrid; Pickle, Catherine S.; Lee, Elizabeth; Garvin, Tyler H.; Kato, Momoe; Akiyama, Jennifer A.; Afzal, Veena; Lee, Ah Young; Gorkin, David U.; Ren, Bing; Rubin, Edward M.; Visel, Axel; Pennacchio, Len A.

2016-01-01

Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of >35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of >80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function. PMID:27703156
Gene Expression Profiles in Paired Gingival Biopsies from Periodontitis-Affected and Healthy Tissues Revealed by Massively Parallel Sequencing

PubMed Central

Båge, Tove; Lagervall, Maria; Jansson, Leif; Lundeberg, Joakim; Yucel-Lindberg, Tülay

2012-01-01

Periodontitis is a chronic inflammatory disease affecting the soft tissue and bone that surrounds the teeth. Despite extensive research, distinctive genes responsible for the disease have not been identified. The objective of this study was to elucidate transcriptome changes in periodontitis, by investigating gene expression profiles in gingival tissue obtained from periodontitis-affected and healthy gingiva from the same patient, using RNA-sequencing. Gingival biopsies were obtained from a disease-affected and a healthy site from each of 10 individuals diagnosed with periodontitis. Enrichment analysis performed among uniquely expressed genes for the periodontitis-affected and healthy tissues revealed several regulated pathways indicative of inflammation for the periodontitis-affected condition. Hierarchical clustering of the sequenced biopsies demonstrated clustering according to the degree of inflammation, as observed histologically in the biopsies, rather than clustering at the individual level. Among the top 50 upregulated genes in periodontitis-affected tissues, we investigated two genes which have not previously been demonstrated to be involved in periodontitis. These included interferon regulatory factor 4 and chemokine (C-C motif) ligand 18, which were also expressed at the protein level in gingival biopsies from patients with periodontitis. In conclusion, this study provides a first step towards a quantitative comprehensive insight into the transcriptome changes in periodontitis. We demonstrate for the first time site-specific local variation in gene expression profiles of periodontitis-affected and healthy tissues obtained from patients with periodontitis, using RNA-seq. Further, we have identified novel genes expressed in periodontitis tissues, which may constitute potential therapeutic targets for future treatment strategies of periodontitis. PMID:23029519
Identification and temporal expression of putative circadian clock transcripts in the amphipod crustacean Talitrus saltator

PubMed Central

O’Grady, Joseph F.; Hoelters, Laura S.; Swain, Martin T.

2016-01-01

Background: Talitrus saltator is an amphipod crustacean that inhabits the supralittoral zone on sandy beaches in the Northeast Atlantic and Mediterranean. T. saltator exhibits endogenous locomotor activity rhythms and time-compensated sun and moon orientation, both of which necessitate at least one chronometric mechanism. Whilst their behaviour is well studied, currently there are no descriptions of the underlying molecular components of a biological clock in this animal, and very few in other crustacean species. Methods: We harvested brain tissue from animals expressing robust circadian activity rhythms and used homology cloning and Illumina RNAseq approaches to sequence and identify the core circadian clock and clock-related genes in these samples. We assessed the temporal expression of these genes in time-course samples from rhythmic animals using RNAseq. Results: We identified a comprehensive suite of circadian clock gene homologues in T. saltator including the ‘core’ clock genes period (Talper), cryptochrome 2 (Talcry2), timeless (Taltim), clock (Talclk), and bmal1 (Talbmal1). In addition we describe the sequence and putative structures of 23 clock-associated genes including two unusual, extended isoforms of pigment dispersing hormone (Talpdh). We examined time-course RNAseq expression data, derived from tissues harvested from behaviourally rhythmic animals, to reveal rhythmic expression of these genes with approximately circadian period in Talper and Talbmal1. Of the clock-related genes, casein kinase IIβ (TalckIIβ), ebony (Talebony), jetlag (Taljetlag), pigment dispersing hormone (Talpdh), protein phosphatase 1 (Talpp1), shaggy (Talshaggy), sirt1 (Talsirt1), sirt7 (Talsirt7) and supernumerary limbs (Talslimb) show temporal changes in expression. Discussion: We report the sequences of principal genes that comprise the circadian clock of T. saltator and highlight the conserved structural and functional domains of their deduced cognate proteins. Our sequencing data contribute to the growing inventory of described comparative clocks. Expression profiling of the identified clock genes illuminates tantalising targets for experimental manipulation to elucidate the molecular and cellular control of clock-driven phenotypes in this crustacean. PMID:27761341
Comprehensive Genome-Wide Survey, Genomic Constitution and Expression Profiling of the NAC Transcription Factor Family in Foxtail Millet (Setaria italica L.)

PubMed Central

Puranik, Swati; Sahu, Pranav Pankaj; Mandal, Sambhu Nath; B., Venkata Suresh; Parida, Swarup Kumar; Prasad, Manoj

2013-01-01

The NAC proteins represent a major plant-specific transcription factor family that has established enormously diverse roles in various plant processes. Aided by the availability of complete genomes, several members of this family have been identified in Arabidopsis, rice, soybean and poplar. However, no comprehensive investigation has been presented for the recently sequenced, naturally stress-tolerant crop, Setaria italica (foxtail millet), which is famed as a model crop for bioenergy research. In this study, we identified 147 putative NAC domain-encoding genes from foxtail millet by systematic sequence analysis and physically mapped them onto nine chromosomes. Genomic organization suggested that inter-chromosomal duplications may have been responsible for expansion of this gene family in foxtail millet. Phylogenetically, they were arranged into 11 distinct sub-families (I-XI), with duplicated genes fitting into one cluster and possessing conserved motif compositions. Comparative mapping with other grass species revealed some orthologous relationships and chromosomal rearrangements including duplication, inversion and deletion of genes. The evolutionary significance of the duplication and divergence of NAC genes was assessed from their amino acid substitution rates. Expression profiling against various stresses and phytohormones provides novel insights into specific and/or overlapping expression patterns of SiNAC genes, which may be responsible for functional divergence among individual members in this crop. Further, we performed structure modeling and molecular simulation of a stress-responsive protein, SiNAC128, proffering an initial framework for understanding its molecular function. Taken together, this genome-wide identification and expression profiling unlocks new avenues for systematic functional analysis of novel NAC gene family candidates which may be applied for improving stress adaptation in plants. PMID:23691254
Comprehensive genome-wide survey, genomic constitution and expression profiling of the NAC transcription factor family in foxtail millet (Setaria italica L.).

PubMed

Puranik, Swati; Sahu, Pranav Pankaj; Mandal, Sambhu Nath; B, Venkata Suresh; Parida, Swarup Kumar; Prasad, Manoj

2013-01-01

The NAC proteins represent a major plant-specific transcription factor family that has established enormously diverse roles in various plant processes. Aided by the availability of complete genomes, several members of this family have been identified in Arabidopsis, rice, soybean and poplar. However, no comprehensive investigation has been presented for the recently sequenced, naturally stress-tolerant crop, Setaria italica (foxtail millet), which is famed as a model crop for bioenergy research. In this study, we identified 147 putative NAC domain-encoding genes from foxtail millet by systematic sequence analysis and physically mapped them onto nine chromosomes. Genomic organization suggested that inter-chromosomal duplications may have been responsible for expansion of this gene family in foxtail millet. Phylogenetically, they were arranged into 11 distinct sub-families (I-XI), with duplicated genes fitting into one cluster and possessing conserved motif compositions. Comparative mapping with other grass species revealed some orthologous relationships and chromosomal rearrangements including duplication, inversion and deletion of genes. The evolutionary significance of the duplication and divergence of NAC genes was assessed from their amino acid substitution rates. Expression profiling against various stresses and phytohormones provides novel insights into specific and/or overlapping expression patterns of SiNAC genes, which may be responsible for functional divergence among individual members in this crop. Further, we performed structure modeling and molecular simulation of a stress-responsive protein, SiNAC128, proffering an initial framework for understanding its molecular function. Taken together, this genome-wide identification and expression profiling unlocks new avenues for systematic functional analysis of novel NAC gene family candidates which may be applied for improving stress adaptation in plants.
Biosynthesis of the active compounds of Isatis indigotica based on transcriptome sequencing and metabolites profiling

PubMed Central

2013-01-01

Background: Isatis indigotica is a widely used herb for the clinical treatment of colds, fever, and influenza in Traditional Chinese Medicine (TCM). Various structural classes of compounds have been identified as effective ingredients. However, little is known at the genetic level about these active metabolites. In the present study, we performed de novo transcriptome sequencing for the first time to produce a comprehensive dataset of I. indigotica. Results: A database of 36,367 unigenes (average length = 1,115.67 bases) was generated by performing transcriptome sequencing. Based on the gene annotation of the transcriptome, 104 unigenes were identified covering most of the catalytic steps in the general biosynthetic pathways of indole, terpenoid, and phenylpropanoid. Subsequently, the organ-specific expression patterns of the genes involved in these pathways, and their responses to methyl jasmonate (MeJA) induction, were investigated. Metabolite profiling of the effective phenylpropanoids showed that the accumulation patterns of secondary metabolites were mostly correlated with the transcription of their biosynthetic genes. According to the analysis of the UDP-dependent glycosyltransferase (UGT) family, several flavonoids were indicated to exist in I. indigotica and were further identified by metabolic profiling using UPLC/Q-TOF. Moreover, applying transcriptome co-expression analysis, nine new, putative UGTs were suggested as flavonol glycosyltransferases and lignan glycosyltransferases. Conclusions: This database provides a pool of candidate genes involved in biosynthesis of effective metabolites in I. indigotica. Furthermore, the comprehensive analysis and characterization of the significant pathways are expected to give a better insight regarding the diversity of chemical composition, synthetic characteristics, and the regulatory mechanisms that operate in this medicinal herb. PMID:24308360
Molecular Cloning and Characterization of G Alpha Proteins from the Western Tarnished Plant Bug, Lygus hesperus

PubMed Central

Hull, J. Joe; Wang, Meixian

2014-01-01

The Gα subunits of heterotrimeric G proteins play critical roles in the activation of diverse signal transduction cascades. However, the role of these genes in chemosensation remains to be fully elucidated. To initiate a comprehensive survey of signal transduction genes, we used homology-based cloning methods and transcriptome data mining to identify Gα subunits in the western tarnished plant bug (Lygus hesperus Knight). Among the nine sequences identified were single variants of the Gαi, Gαo, Gαs, and Gα12 subfamilies and five alternative splice variants of the Gαq subfamily. Sequence alignment and phylogenetic analyses of the putative L. hesperus Gα subunits support initial classifications and are consistent with established evolutionary relationships. End-point PCR-based profiling of the transcripts indicated head-specific expression for LhGαq4, and largely ubiquitous expression, albeit at varying levels, for the other LhGα transcripts. All subfamilies were amplified from L. hesperus chemosensory tissues, suggesting potential roles in olfaction and/or gustation. Immunohistochemical staining of cultured insect cells transiently expressing recombinant His-tagged LhGαi, LhGαs, and LhGαq1 revealed plasma membrane targeting, suggesting the respective sequences encode functional G protein subunits. PMID:26463065

Construction of Pará rubber tree genome and multi-transcriptome database accelerates rubber researches.

PubMed

Makita, Yuko; Kawashima, Mika; Lau, Nyok Sean; Othman, Ahmad Sofiman; Matsui, Minami

2018-01-19

Natural rubber is an economically important material. Currently the Pará rubber tree, Hevea brasiliensis, is the main commercial source. Little is known about rubber biosynthesis at the molecular level. Next-generation sequencing (NGS) technologies brought draft genomes of three rubber cultivars and a variety of RNA sequencing (RNA-seq) data. However, no current genome or transcriptome databases (DB) are organized by gene. A gene-oriented database is a valuable support for rubber research. Based on our original draft genome sequence of H. brasiliensis RRIM600, we constructed a rubber tree genome and transcriptome DB. Our DB provides genome information including gene functional annotations and multi-transcriptome data of RNA-seq, full-length cDNAs including PacBio Isoform sequencing (Iso-Seq), ESTs and genome-wide transcription start sites (TSSs) derived from CAGE technology. Using our original and publicly available RNA-seq data, we calculated co-expressed genes for identifying functionally related gene sets and/or genes regulated by the same transcription factor (TF). Users can access multi-transcriptome data through both a gene-oriented web page and a genome browser. For the gene searching system, we provide keyword search, sequence homology search and gene expression search; users can also select their expression threshold easily. The rubber genome and transcriptome DB provides rubber tree genome sequence and multi-transcriptomics data. This DB is useful for comprehensive understanding of the rubber transcriptome. This will assist both industrial and academic researchers for rubber and economically important close relatives such as R. communis, M. esculenta and J. curcas. The Rubber Transcriptome DB release 2017.03 is accessible at http://matsui-lab.riken.jp/rubber/.
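The co-expression step mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which similarity measure or cutoff the database uses, so the Pearson correlation, the 0.9 threshold, the function names and the toy expression values below are all assumptions chosen for illustration only.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)

def coexpressed_pairs(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with r >= threshold.

    `expression` maps a gene ID to its expression values across samples.
    The threshold is a hypothetical, user-selected cutoff.
    """
    pairs = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if r >= threshold:
            pairs.append((g1, g2, r))
    return pairs

# Toy data (made up): geneA and geneB rise together, geneC falls.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
pairs = coexpressed_pairs(toy)
```

With real RNA-seq data the profiles would normally be normalized (and often log-transformed) first, and a rank-based measure may be preferred when expression values are strongly skewed.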
Genomic Heat Shock Element Sequences Drive Cooperative Human Heat Shock Factor 1 DNA Binding and Selectivity

PubMed Central

Jaeger, Alex M.; Makley, Leah N.; Gestwicki, Jason E.; Thiele, Dennis J.

2014-01-01

The heat shock transcription factor 1 (HSF1) activates expression of a variety of genes involved in cell survival, including protein chaperones, the protein degradation machinery, anti-apoptotic proteins, and transcription factors. Although HSF1 activation has been linked to amelioration of neurodegenerative disease, cancer cells exhibit a dependence on HSF1 for survival. Indeed, HSF1 drives a program of gene expression in cancer cells that is distinct from that activated in response to proteotoxic stress, and HSF1 DNA binding activity is elevated in cycling cells as compared with arrested cells. Active HSF1 homotrimerizes and binds to a DNA sequence consisting of inverted repeats of the pentameric sequence nGAAn, known as heat shock elements (HSEs). Recent comprehensive ChIP-seq experiments demonstrated that the architecture of HSEs is very diverse in the human genome, with deviations from the consensus sequence in the spacing, orientation, and extent of HSE repeats that could influence HSF1 DNA binding efficacy and the kinetics and magnitude of target gene expression. To understand the mechanisms that dictate binding specificity, HSF1 was purified as either a monomer or trimer and used to evaluate DNA-binding site preferences in vitro using fluorescence polarization and thermal denaturation profiling. These results were compared with quantitative chromatin immunoprecipitation assays in vivo. We demonstrate a role for specific orientations of extended HSE sequences in driving preferential HSF1 DNA binding to target loci in vivo. These studies provide a biochemical basis for understanding differential HSF1 target gene recognition and transcription in neurodegenerative disease and in cancer. PMID:25204655
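As a rough illustration of the heat shock element definition given in this abstract (inverted repeats of the pentamer nGAAn), the sketch below scans a DNA string for three contiguous, alternately oriented pentamers. Treating exactly three perfect repeats as the "canonical" site is an assumption made here for simplicity; the abstract itself stresses that genomic HSEs deviate from the consensus in spacing, orientation and extent. The function name and test sequence are hypothetical.

```python
import re

# Assumed canonical HSE: nGAAn nTTCn nGAAn, or the complementary
# arrangement nTTCn nGAAn nTTCn (15 bases in total).
# The lookahead wrapper lets overlapping sites be reported.
HSE_PATTERN = re.compile(
    r"(?=("
    r"[ACGT]GAA[ACGT]{2}TTC[ACGT]{2}GAA[ACGT]"
    r"|"
    r"[ACGT]TTC[ACGT]{2}GAA[ACGT]{2}TTC[ACGT]"
    r"))"
)

def find_canonical_hses(sequence):
    """Return (0-based start, matched 15-mer) for each canonical HSE."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in HSE_PATTERN.finditer(sequence)]

# Made-up example: one perfect three-pentamer site flanked by filler.
hits = find_canonical_hses("cccAGAACGTTCTAGAAGttt")
```

A scan of this kind only finds perfect matches; scoring degenerate or extended sites, which the study is about, would need a position weight matrix or a mismatch-tolerant search instead.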
Evolution of Synonymous Codon Usage in Neurospora tetrasperma and Neurospora discreta

PubMed Central

Whittle, C. A.; Sun, Y.; Johannesson, H.

2011-01-01

Neurospora comprises a primary model system for the study of fungal genetics and biology. In spite of this, little is known about genome evolution in Neurospora. For example, the evolution of synonymous codon usage is largely unknown in this genus. In the present investigation, we conducted a comprehensive analysis of synonymous codon usage and its relationship to gene expression and gene length (GL) in Neurospora tetrasperma and Neurospora discreta. For our analysis, we examined codon usage among 2,079 genes per organism and assessed gene expression using large-scale expressed sequence tag (EST) data sets (279,323 and 453,559 ESTs for N. tetrasperma and N. discreta, respectively). Data on relative synonymous codon usage revealed 24 codons (and two putative codons) that are more frequently used in genes with high than with low expression and thus were defined as optimal codons. Although codon-usage bias was highly correlated with gene expression, it was independent of selectively neutral base composition (introns), thus demonstrating that translational selection drives synonymous codon usage in these genomes. We also report that GL (coding sequences [CDS]) was inversely associated with optimal codon usage at each gene expression level, with highly expressed short genes having the greatest frequency of optimal codons. Optimal codon frequency was moderately higher in N. tetrasperma than in N. discreta, which might be due to variation in selective pressures and/or mating systems. PMID:21402862
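Relative synonymous codon usage (RSCU), the statistic this abstract relies on, is commonly computed as a codon's observed count divided by the mean count of all codons encoding the same amino acid. The following is a minimal sketch of that calculation, not the authors' pipeline: the three-family codon table is a deliberately small subset of the standard genetic code, and the function names and example sequence are hypothetical.

```python
from collections import Counter

# Illustrative subset of the standard genetic code (assumption: a real
# analysis would cover every amino acid with more than one codon).
SYNONYMOUS_FAMILIES = {
    "Lys": ["AAA", "AAG"],
    "Phe": ["TTT", "TTC"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}

def codon_counts(cds):
    """Count in-frame codons; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

def rscu(counts):
    """RSCU = observed count / mean count within the synonymous family."""
    values = {}
    for codons in SYNONYMOUS_FAMILIES.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined, so skip it
        expected = total / len(codons)
        for codon in codons:
            values[codon] = counts[codon] / expected
    return values

# Made-up coding fragment: Lys-Lys-Lys-Lys-Gly-Gly.
example = rscu(codon_counts("AAGAAGAAGAAAGGTGGT"))
```

An RSCU above 1 means a codon is used more often than expected under equal usage. Comparing RSCU between highly and lowly expressed gene sets, as the abstract describes, is one way to nominate candidate optimal codons.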
Transcriptome and proteomic analysis of mango (Mangifera indica Linn) fruits.

PubMed

Wu, Hong-xia; Jia, Hui-min; Ma, Xiao-wei; Wang, Song-biao; Yao, Quan-sheng; Xu, Wen-tian; Zhou, Yi-gang; Gao, Zhong-shan; Zhan, Ru-lin

2014-06-13

Here we used Illumina RNA-seq technology for transcriptome sequencing of a mixed fruit sample from 'Zill' mango (Mangifera indica Linn) fruit pericarp and pulp during the development and ripening stages. RNA-seq generated 68,419,722 sequence reads that were assembled into 54,207 transcripts with a mean length of 858 bp, including 26,413 clusters and 27,794 singletons. A total of 42,515 (78.43%) transcripts were annotated using public protein databases, with a cut-off E-value above 10^-5, of which 35,198 and 14,619 transcripts were assigned to gene ontology terms and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes database identified 23,741 (43.79%) transcripts, which were mapped to 128 pathways. These pathways revealed many previously unknown transcripts. We also applied mass spectrometry-based transcriptome data to characterize the proteome of ripe fruit. LC-MS/MS analysis of the mango fruit proteome was performed using tandem mass spectrometry (MS/MS) in an LTQ Orbitrap Velos (Thermo) coupled online to the HPLC. This approach enabled the identification of 7536 peptides that matched 2754 proteins. Our study provides a comprehensive sequence for a systemic view of transcriptome during mango fruit development and the most comprehensive fruit proteome to date, which are useful for further genomics research and proteomic studies. Our study provides a comprehensive sequence for a systemic view of both the transcriptome and proteome of mango fruit, and a valuable reference for further research on gene expression and protein identification. This article is part of a Special Issue entitled: Proteomics of non-model organisms. Copyright © 2014 Elsevier B.V. All rights reserved.
DeepBase: annotation and discovery of microRNAs and other noncoding RNAs from deep-sequencing data.

PubMed

Yang, Jian-Hua; Qu, Liang-Hu

2012-01-01

Recent advances in high-throughput deep-sequencing technology have produced large numbers of short and long RNA sequences and enabled the detection and profiling of known and novel microRNAs (miRNAs) and other noncoding RNAs (ncRNAs) at unprecedented sensitivity and depth. In this chapter, we describe the use of deepBase, a database that we have developed to integrate all public deep-sequencing data and to facilitate the comprehensive annotation and discovery of miRNAs and other ncRNAs from these data. deepBase provides an integrative, interactive, and versatile web graphical interface to evaluate miRBase-annotated miRNA genes and other known ncRNAs, explore the expression patterns of miRNAs and other ncRNAs, and discover novel miRNAs and other ncRNAs from deep-sequencing data. deepBase also provides a deepView genome browser to comparatively analyze these data at multiple levels. deepBase is available at http://deepbase.sysu.edu.cn/.
Comprehensive Assessments of RNA-seq by the SEQC Consortium: FDA-Led Efforts Advance Precision Medicine.

PubMed

Xu, Joshua; Gong, Binsheng; Wu, Leihong; Thakkar, Shraddha; Hong, Huixiao; Tong, Weida

2016-03-15

Studies on gene expression in response to therapy have led to the discovery of pharmacogenomics biomarkers and advances in precision medicine. Whole transcriptome sequencing (RNA-seq) is an emerging tool for profiling gene expression and has received wide adoption in the biomedical research community. However, its value in regulatory decision making requires rigorous assessment and consensus between various stakeholders, including the research community, regulatory agencies, and industry. The FDA-led SEquencing Quality Control (SEQC) consortium has made considerable progress in this direction, and is the subject of this review. Specifically, three RNA-seq platforms (Illumina HiSeq, Life Technologies SOLiD, and Roche 454) were extensively evaluated at multiple sites to assess cross-site and cross-platform reproducibility. The results demonstrated that relative gene expression measurements were consistently comparable across labs and platforms, but not so for the measurement of absolute expression levels. As part of the quality evaluation several studies were included to evaluate the utility of RNA-seq in clinical settings and safety assessment. The neuroblastoma study profiled tumor samples from 498 pediatric neuroblastoma patients by both microarray and RNA-seq. RNA-seq offers more utilities than microarray in determining the transcriptomic characteristics of cancer. However, RNA-seq and microarray-based models were comparable in clinical endpoint prediction, even when including additional features unique to RNA-seq beyond gene expression. The toxicogenomics study compared microarray and RNA-seq profiles of the liver samples from rats exposed to 27 different chemicals representing multiple toxicity modes of action. Cross-platform concordance was dependent on chemical treatment and transcript abundance. 
Though both RNA-seq and microarray are suitable for developing gene expression based predictive models with comparable prediction performance, RNA-seq offers advantages over microarray in profiling genes with low expression. The rat BodyMap study provided a comprehensive rat transcriptomic body map by performing RNA-Seq on 320 samples from 11 organs in either sex of juvenile, adolescent, adult and aged Fischer 344 rats. Lastly, the transferability study demonstrated that signature genes of predictive models are reciprocally transferable between microarray and RNA-seq data for model development using a comprehensive approach with two large clinical data sets. This result suggests continued usefulness of legacy microarray data in the coming RNA-seq era. In conclusion, the SEQC project enhances our understanding of RNA-seq and provides valuable guidelines for RNA-seq based clinical application and safety evaluation to advance precision medicine.
In Silico Prediction of Neuropeptides/Peptide Hormone Transcripts in the Cheilostome Bryozoan Bugula neritina

PubMed Central

Zhang, Gen; He, Li-Sheng; Qian, Pei-Yuan

2016-01-01

The bryozoan Bugula neritina has a biphasic life cycle that consists of a planktonic larval stage and a sessile juvenile/adult stage. The transition between these two stages is crucial for the development and recruitment of B. neritina. Metamorphosis in B. neritina is mediated by both the nervous system and the release of developmental signals. However, no research has been conducted to investigate the expression of neuropeptides (NP)/peptide hormones in B. neritina larvae. Here, we report a comprehensive study of the NP/peptide hormones in the marine bryozoan B. neritina based on in silico identification methods. We recovered 22 transcripts encompassing 11 NP/peptide hormone precursor transcript sequences. The transcript sequences of the 11 isolated NP precursors were validated by cDNA cloning using gene-specific primers. We also examined the expression of three peptide hormone precursor transcripts (BnFDSIG, BnILP1, BnGPB) in the coronate larvae of B. neritina, demonstrating their distinct expression patterns in the larvae. Overall, our findings serve as an important foundation for subsequent investigations of the peptidergic control of bryozoan larval behavior and settlement. PMID:27537380
Deciphering Transcriptional Programming during Pod and Seed Development Using RNA-Seq in Pigeonpea (Cajanus cajan).

PubMed

Pazhamala, Lekha T; Agarwal, Gaurav; Bajaj, Prasad; Kumar, Vinay; Kulshreshtha, Akanksha; Saxena, Rachit K; Varshney, Rajeev K

2016-01-01

Seed development is an important event in the plant life cycle that has interested humankind for ages, especially in crops of economic importance. Pigeonpea is an important grain legume of the semi-arid tropics, used mainly for its protein-rich seeds. In order to understand the transcriptional programming during pod and seed development, RNA-seq data were generated from embryo sac from the day of anthesis (0 DAA), seed and pod wall (5, 10, 20 and 30 DAA) of pigeonpea variety "Asha" (ICPL 87119) using Illumina HiSeq 2500. About 684 million sequencing reads were generated from nine samples, which resulted in the identification of 27,441 expressed genes after sequence analysis. These genes have been studied for their differential expression, co-expression, and temporal and spatial gene expression. We have also used the RNA-seq data to identify important seed-specific transcription factors, biological processes and associated pathways during the seed development process in pigeonpea. The comprehensive gene expression study from flowering to mature pod development in pigeonpea would be crucial in identifying candidate genes involved in seed traits directly or indirectly related to yield and quality. The dataset will serve as an important resource for gene discovery and deciphering the molecular mechanisms underlying various seed-related traits.
Deciphering Transcriptional Programming during Pod and Seed Development Using RNA-Seq in Pigeonpea (Cajanus cajan)

PubMed Central

Pazhamala, Lekha T.; Agarwal, Gaurav; Bajaj, Prasad; Kumar, Vinay; Kulshreshtha, Akanksha; Saxena, Rachit K.; Varshney, Rajeev K.

2016-01-01

Seed development is an important event in the plant life cycle that has interested humankind for ages, especially in crops of economic importance. Pigeonpea is an important grain legume of the semi-arid tropics, used mainly for its protein-rich seeds. In order to understand the transcriptional programming during pod and seed development, RNA-seq data were generated from embryo sac from the day of anthesis (0 DAA), seed and pod wall (5, 10, 20 and 30 DAA) of pigeonpea variety “Asha” (ICPL 87119) using Illumina HiSeq 2500. About 684 million sequencing reads were generated from nine samples, which resulted in the identification of 27,441 expressed genes after sequence analysis. These genes have been studied for their differential expression, co-expression, and temporal and spatial gene expression. We have also used the RNA-seq data to identify important seed-specific transcription factors, biological processes and associated pathways during the seed development process in pigeonpea. The comprehensive gene expression study from flowering to mature pod development in pigeonpea would be crucial in identifying candidate genes involved in seed traits directly or indirectly related to yield and quality. The dataset will serve as an important resource for gene discovery and deciphering the molecular mechanisms underlying various seed-related traits. PMID:27760186
pseudoMap: an innovative and comprehensive resource for identification of siRNA-mediated mechanisms in human transcribed pseudogenes.

PubMed

Chan, Wen-Ling; Yang, Wen-Kuang; Huang, Hsien-Da; Chang, Jan-Gowth

2013-01-01

RNA interference (RNAi) is a gene silencing process within living cells, which is controlled by the RNA-induced silencing complex in a sequence-specific manner. In flies and mice, pseudogene transcripts can be processed into short interfering RNAs (siRNAs) that regulate protein-coding genes through the RNAi pathway. Following these findings, we constructed an innovative and comprehensive database to elucidate siRNA-mediated mechanisms in human transcribed pseudogenes (TPGs). To investigate TPGs producing siRNAs that regulate protein-coding genes, we mapped the TPGs to small RNAs (sRNAs) supported by publicly available deep sequencing data from various sRNA libraries and constructed the TPG-derived siRNA-target interactions. In addition, we showed that TPGs can act as targets for miRNAs that regulate the parental gene. To enable the systematic compilation and updating of these results and additional information, we have developed a database, pseudoMap, capturing various types of information, including sequence data, TPG and cognate annotation, deep sequencing data, RNA-folding structure, gene expression profiles, miRNA annotation and target prediction. To our knowledge, pseudoMap is the first database to demonstrate two mechanisms of human TPGs: encoding siRNAs and decoying miRNAs that target the parental gene. pseudoMap is freely accessible at http://pseudomap.mbc.nctu.edu.tw/. Database URL: http://pseudomap.mbc.nctu.edu.tw/
Recently Patented Viral Nucleotide Sequences and Generation of Virus-Derived Vaccines.

PubMed

Venkataraman, Srividhya; Ahmad, Tauqeer; Haidar, Mounir A; Hefferon, Kathleen L

2017-01-01

With an increase in comprehension of the molecular biology of viruses, there has been a recent surge in the application of virus sequences and viral gene expression strategies towards the diagnosis and treatment of diseases. The scope of the patenting landscape has widened as a result, and the current review discusses patents pertaining to live/attenuated viral vaccines. The vaccines addressed here have been developed both by conventional means and by state-of-the-art genetic engineering techniques. This review also addresses the applications of these patents for clinical and biotechnological purposes. Copyright© Bentham Science Publishers.
RNA sequencing-based cell proliferation analysis across 19 cancers identifies a subset of proliferation-informative cancers with a common survival signature.

PubMed

Ramaker, Ryne C; Lasseigne, Brittany N; Hardigan, Andrew A; Palacio, Laura; Gunther, David S; Myers, Richard M; Cooper, Sara J

2017-06-13

Despite advances in cancer diagnosis and treatment strategies, robust prognostic signatures remain elusive in most cancers. Cell proliferation has long been recognized as a prognostic marker in cancer, but the generation of comprehensive, publicly available datasets allows examination of the links between cell proliferation and cancer characteristics such as mutation rate, stage, and patient outcomes. Here we explore the role of cell proliferation across 19 cancers (n = 6,581 patients) by using tissue-based RNA sequencing data from The Cancer Genome Atlas Project and calculating a 'proliferative index' derived from gene expression associated with Proliferating Cell Nuclear Antigen (PCNA) levels. This proliferative index is significantly associated with patient survival (Cox, p-value < 0.05) in 7 of 19 cancers, which we have defined as "proliferation-informative cancers" (PICs). In PICs, the proliferative index is strongly correlated with tumor stage and nodal invasion. PICs demonstrate reduced baseline expression of proliferation machinery relative to non-PICs. Additionally, we find the proliferative index is significantly associated with gross somatic mutation burden (Spearman, p = 1.76 × 10^-23) as well as with mutations in individual driver genes. This analysis provides a comprehensive characterization of tumor proliferation indices and their association with disease progression and prognosis in multiple cancer types and highlights specific cancers that may be particularly susceptible to improved targeting of this classic cancer hallmark.
Translation efficiency of heterologous proteins is significantly affected by the genetic context of RBS sequences in engineered cyanobacterium Synechocystis sp. PCC 6803.

PubMed

Thiel, Kati; Mulaku, Edita; Dandapani, Hariharan; Nagy, Csaba; Aro, Eva-Mari; Kallio, Pauli

2018-03-02

Photosynthetic cyanobacteria have been studied as potential host organisms for direct solar-driven production of different carbon-based chemicals from CO2 and water, as part of the development of sustainable future biotechnological applications. The engineering approaches, however, are still limited by the lack of comprehensive information on optimal expression strategies and validated species-specific genetic elements, which are essential for increasing the intricacy, predictability and efficiency of the systems. This study focused on the systematic evaluation of the key translational control elements, ribosome binding sites (RBS), in the cyanobacterial host Synechocystis sp. PCC 6803, with the objective of expanding the palette of tools for more rigorous engineering approaches. An expression system was established for the comparison of 13 selected RBS sequences in Synechocystis, using several alternative reporter proteins (sYFP2, codon-optimized GFPmut3 and ethylene forming enzyme) as quantitative indicators of the relative translation efficiencies. The set-up was shown to yield highly reproducible expression patterns in independent analytical series with low variation between biological replicates, thus allowing statistical comparison of the activities of the different RBSs in vivo. While the RBSs covered a relatively broad overall expression level range, the downstream gene sequence was demonstrated in a rigorous manner to have a clear impact on the resulting translational profiles. This was expected to reflect interfering sequence-specific mRNA-level interaction between the RBS and the coding region, yet correlation between potential secondary structure formation and observed translation levels could not be resolved with existing in silico prediction tools. The study expands our current understanding of the potential and limitations associated with the regulation of protein expression at the translational level in engineered cyanobacteria. The acquired information can be used for selecting appropriate RBSs for optimizing over-expression constructs or multicistronic pathways in Synechocystis, while underlining the complications in predicting the activity due to gene-specific interactions which may reduce the translational efficiency for a given RBS-gene combination. Ultimately, the findings emphasize the need for additional characterized insulator sequence elements to decouple the interaction between the RBS and the coding region for future engineering approaches.
GarlicESTdb: an online database and mining tool for garlic EST sequences.

PubMed

Kim, Dae-Won; Jung, Tae-Sung; Nam, Seong-Hyeuk; Kwon, Hyuk-Ryul; Kim, Aeri; Chae, Sung-Hwa; Choi, Sang-Haeng; Kim, Dong-Wook; Kim, Ryong Nam; Park, Hong-Seog

2009-05-18

Allium sativum, commonly known as garlic, is a species in the onion genus (Allium), a large and diverse genus containing over 1,250 species. Its close relatives include chives, onion, leek and shallot. Garlic has been used throughout recorded history for culinary and medicinal purposes and for its health benefits. Currently, interest in garlic is increasing rapidly because of its nutritional and pharmaceutical value in relation to high blood pressure and cholesterol, atherosclerosis and cancer. Nevertheless, there are no comprehensive databases of garlic Expressed Sequence Tags (ESTs) available for gene discovery and future genome annotation efforts. We therefore developed a new garlic database and applications to enable comprehensive analysis of garlic gene expression. GarlicESTdb is an integrated database and mining tool for large-scale garlic (Allium sativum) EST sequencing. A total of 21,595 ESTs collected from an in-house cDNA library were used to construct the database. The analysis pipeline is an automated system written in Java and consists of the following components: automatic preprocessing of EST reads, assembly of raw sequences, annotation of the assembled sequences, storage of the analyzed information in MySQL databases, and graphic display of all processed data. A web application was implemented with the latest J2EE (Java 2 Platform Enterprise Edition) software technology (JSP/EJB/JavaServlet) for browsing and querying the database; the AJAX framework was also partially used for creating dynamic web pages on the client side and for mapping annotated enzymes to KEGG pathways. The online resources, such as putative annotation, single nucleotide polymorphism (SNP) and tandem repeat data sets, can be searched by text, explored on the website, searched using BLAST, and downloaded. To archive more significant BLAST results, a curation system was introduced with which biologists can easily edit best-hit annotation information for others to view. The GarlicESTdb web application is freely available at http://garlicdb.kribb.re.kr. GarlicESTdb is the first integrated online database of EST sequences isolated from garlic that can be freely accessed and downloaded. It has many useful features for interactive mining of EST contigs and datasets from each library, including curation of annotated information, expression profiling, information retrieval, and summary statistics of functional annotation. Consequently, the development of GarlicESTdb will provide a crucial contribution to biologists for data mining and more efficient experimental studies.
Systems biology of embryonic development: Prospects for a complete understanding of the Caenorhabditis elegans embryo.

PubMed

Murray, John Isaac

2018-05-01

The convergence of developmental biology and modern genomics tools brings the potential for a comprehensive understanding of developmental systems. This is especially true for the Caenorhabditis elegans embryo because its small size, invariant developmental lineage, and powerful genetic and genomic tools provide the prospect of a cellular resolution understanding of messenger RNA (mRNA) expression and regulation across the organism. We describe here how a systems biology framework might allow large-scale determination of the embryonic regulatory relationships encoded in the C. elegans genome. This framework consists of two broad steps: (a) defining the "parts list", that is, all genes expressed in all cells at each time during development, and (b) iterative steps of computational modeling and refinement of these models by experimental perturbation. Substantial progress has been made towards defining the parts list through imaging methods such as large-scale green fluorescent protein (GFP) reporter analysis. Imaging results are now being augmented by high-resolution transcriptome methods such as single-cell RNA sequencing, and it is likely the complete expression patterns of all genes across the embryo will be known within the next few years. In contrast, the modeling and perturbation experiments performed so far have focused largely on individual cell types or genes, and improved methods will be needed to expand them to the full genome and organism. This emerging comprehensive map of embryonic expression and regulatory function will provide a powerful resource for developmental biologists, and would also allow scientists to ask questions not accessible without a comprehensive picture. This article is categorized under: Invertebrate Organogenesis > Worms; Technologies > Analysis of the Transcriptome; Gene Expression and Transcriptional Hierarchies > Gene Networks and Genomics. © 2018 Wiley Periodicals, Inc.
Reproducibility of the dynamics of facial expressions in unilateral facial palsy.

PubMed

Alagha, M A; Ju, X; Morley, S; Ayoub, A

2018-02-01

The aim of this study was to assess the reproducibility of non-verbal facial expressions in unilateral facial paralysis using dynamic four-dimensional (4D) imaging. The Di4D system was used to record five facial expressions of 20 adult patients. The system captured 60 three-dimensional (3D) images per second; each facial expression took 3-4 seconds and was recorded in real time. Thus a set of 180 3D facial images was generated for each expression. The procedure was repeated after 30 min to assess the reproducibility of the expressions. A mathematical facial mesh consisting of thousands of quasi-point 'vertices' was conformed to the face in order to determine the morphological characteristics in a comprehensive manner. The vertices were tracked throughout the sequence of the 180 images. Five key 3D facial frames from each sequence of images were analyzed. Comparisons were made between the first and second capture of each facial expression to assess the reproducibility of facial movements. Corresponding images were aligned using partial Procrustes analysis, and the root mean square distance between them was calculated and analyzed statistically (paired Student t-test, P<0.05). Facial expressions of lip purse, cheek puff, and raising of eyebrows were reproducible. Facial expressions of maximum smile and forceful eye closure were not reproducible. The limited coordination of various groups of facial muscles contributed to the lack of reproducibility of these facial expressions. 4D imaging is a useful clinical tool for the assessment of facial expressions. Copyright © 2017 International Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.
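As an illustration of the comparison step described in the abstract above, the following minimal sketch computes a root mean square (RMS) distance between two captures of the same facial mesh. It assumes the two captures have already been aligned (the abstract uses partial Procrustes analysis, which is not implemented here) and that vertices correspond one to one. The function name `rms_distance` and the tuple-based mesh representation are hypothetical and are not taken from the study.

```python
import math


def rms_distance(mesh_a, mesh_b):
    """Root mean square distance between corresponding 3D vertices.

    Assumption: both meshes are sequences of (x, y, z) tuples that are
    already aligned and listed in the same vertex order.
    """
    if len(mesh_a) != len(mesh_b) or len(mesh_a) == 0:
        raise ValueError("meshes must be non-empty and of equal size")
    squared_sum = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(mesh_a, mesh_b):
        # Squared Euclidean distance between one pair of vertices
        squared_sum += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared_sum / len(mesh_a))


# Toy example: two vertices, the second displaced by 2 units along z
first_capture = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
second_capture = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rms_distance(first_capture, second_capture))
```

A smaller value indicates that the two captures of an expression are closer to each other; any threshold for calling an expression reproducible would have to come from the statistical analysis in the study itself.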
miRNEST database: an integrative approach in microRNA search and annotation

PubMed Central

Szcześniak, Michał Wojciech; Deorowicz, Sebastian; Gapski, Jakub; Kaczyński, Łukasz; Makałowska, Izabela

2012-01-01

Despite accumulating data on animal and plant microRNAs and their functions, existing public miRNA resources usually collect miRNAs from a very limited number of species. Many microRNAs, including those from model organisms, remain undiscovered. As a result there is a continuous need to search for new microRNAs. We present miRNEST (http://mirnest.amu.edu.pl), a comprehensive database of animal, plant and virus microRNAs. The core part of the database is built from our miRNA predictions conducted on Expressed Sequence Tags of 225 animal and 202 plant species. The miRNA search was performed based on sequence similarity, and as many as 10 004 miRNA candidates in 221 animal and 199 plant species were discovered. Of these, only 299 have already been deposited in miRBase. Additionally, miRNEST has been integrated with external miRNA data from the literature and 13 databases, including miRNA sequences, small RNA sequencing data, expression, polymorphism and target data as well as links to external miRNA resources, whenever applicable. All this makes miRNEST a considerable miRNA resource in terms of the number of species covered (544), integrating scattered miRNA data into a uniform format with a user-friendly web interface. PMID:22135287
The Eimeria Transcript DB: an integrated resource for annotated transcripts of protozoan parasites of the genus Eimeria

PubMed Central

Rangel, Luiz Thibério; Novaes, Jeniffer; Durham, Alan M.; Madeira, Alda Maria B. N.; Gruber, Arthur

2013-01-01

Parasites of the genus Eimeria infect a wide range of vertebrate hosts, including chickens. We have recently reported a comparative analysis of the transcriptomes of Eimeria acervulina, Eimeria maxima and Eimeria tenella, integrating ORESTES data produced by our group and publicly available Expressed Sequence Tags (ESTs). All cDNA reads have been assembled, and the reconstructed transcripts have been submitted to a comprehensive functional annotation pipeline. Additional studies included orthology assignment across apicomplexan parasites and clustering analyses of gene expression profiles among different developmental stages of the parasites. To make all this body of information publicly available, we constructed the Eimeria Transcript Database (EimeriaTDB), a web repository that provides access to sequence data, annotation and comparative analyses. Here, we describe the web interface, available sequence data sets and query tools implemented on the site. The main goal of this work is to offer a public repository of sequence and functional annotation data of reconstructed transcripts of parasites of the genus Eimeria. We believe that EimeriaTDB will represent a valuable and complementary resource for the Eimeria scientific community and for those researchers interested in comparative genomics of apicomplexan parasites. Database URL: http://www.coccidia.icb.usp.br/eimeriatdb/ PMID:23411718
Comprehensive RNA-Seq Expression Analysis of Sensory Ganglia with a Focus on Ion Channels and GPCRs in Trigeminal Ganglia

PubMed Central

Manteniotis, Stavros; Lehmann, Ramona; Flegel, Caroline; Vogel, Felix; Hofreuter, Adrian; Schreiner, Benjamin S. P.; Altmüller, Janine; Becker, Christian; Schöbel, Nicole; Hatt, Hanns; Gisselmann, Günter

2013-01-01

The specific functions of sensory systems depend on the tissue-specific expression of genes that code for molecular sensor proteins that are necessary for stimulus detection and membrane signaling. Using the Next Generation Sequencing technique (RNA-Seq), we analyzed the complete transcriptome of the trigeminal ganglia (TG) and dorsal root ganglia (DRG) of adult mice. Focusing on genes with an expression level higher than 1 FPKM (fragments per kilobase of transcript per million mapped reads), we detected the expression of 12984 genes in the TG and 13195 in the DRG. To analyze the specific gene expression patterns of the peripheral neuronal tissues, we compared their gene expression profiles with that of the liver, brain, olfactory epithelium, and skeletal muscle. The transcriptome data of the TG and DRG were scanned for virtually all known G-protein-coupled receptors (GPCRs) as well as for ion channels. The expression profile was ranked with regard to the level and specificity for the TG. In total, we detected 106 non-olfactory GPCRs and 33 ion channels that had not been previously described as expressed in the TG. To validate the RNA-Seq data, in situ hybridization experiments were performed for several of the newly detected transcripts. To identify differences in expression profiles between the sensory ganglia, the RNA-Seq data of the TG and DRG were compared. Among the differentially expressed genes (> 1 FPKM), 65 and 117 were expressed at least 10-fold higher in the TG and DRG, respectively. Our transcriptome analysis allows a comprehensive overview of all ion channels and G protein-coupled receptors that are expressed in trigeminal ganglia and provides additional approaches for the investigation of trigeminal sensing as well as for the physiological and pathophysiological mechanisms of pain. PMID:24260241
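The expression threshold used in the abstract above (more than 1 FPKM, fragments per kilobase of transcript per million mapped reads) can be illustrated with a short sketch. The gene names, counts and lengths below are invented for illustration, and the total number of mapped fragments is assumed to be the sum of the listed counts; none of this reproduces the study's actual pipeline.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    # count / (length in kb) / (total in millions) == count * 1e9 / (length * total)
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def expressed_genes(counts, lengths, threshold=1.0):
    """Return the sorted names of genes whose FPKM exceeds the threshold.

    Assumption: the library size is the sum of the counts passed in.
    """
    total = sum(counts.values())
    return sorted(
        gene for gene, count in counts.items()
        if fpkm(count, lengths[gene], total) > threshold
    )


# Hypothetical toy data (not from the study)
counts = {"geneA": 5000, "geneB": 1, "geneC": 4_994_999}
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 1500}
print(expressed_genes(counts, lengths))
```

Here `geneB` falls below the threshold (0.1 FPKM) and is excluded, which mirrors how a cut-off of 1 FPKM filters out weakly detected transcripts before genes are counted as expressed.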
Resources and Recommendations for Using Transcriptomics to Address Grand Challenges in Comparative Biology

PubMed Central

Mykles, Donald L.; Burnett, Karen G.; Durica, David S.; Joyce, Blake L.; McCarthy, Fiona M.; Schmidt, Carl J.; Stillman, Jonathon H.

2016-01-01

High-throughput RNA sequencing (RNA-seq) technology has become an important tool for studying physiological responses of organisms to changes in their environment. De novo assembly of RNA-seq data has allowed researchers to create a comprehensive catalog of genes expressed in a tissue and to quantify their expression without a complete genome sequence. The contributions from the “Tapping the Power of Crustacean Transcriptomics to Address Grand Challenges in Comparative Biology” symposium in this issue show the successes and limitations of using RNA-seq in the study of crustaceans. In conjunction with the symposium, the Animal Genome to Phenome Research Coordination Network collated comments from participants at the meeting regarding the challenges encountered when using transcriptomics in their research. Input came from novices and experts ranging from graduate students to principal investigators. Many were unaware of the bioinformatics analysis resources currently available on the CyVerse platform. Our analysis of community responses led to three recommendations for advancing the field: (1) integration of genomic and RNA-seq sequence assemblies for crustacean gene annotation and comparative expression; (2) development of methodologies for the functional analysis of genes; and (3) information and training exchange among laboratories for transmission of best practices. The field lacks the methods for manipulating tissue-specific gene expression. The decapod crustacean research community should consider the cherry shrimp, Neocaridina denticulata, as a decapod model for the application of transgenic tools for functional genomics. This would require a multi-investigator effort. PMID:27639274

High-throughput sequencing identification and characterization of potentially adhesion-related small RNAs in Streptococcus mutans.

PubMed

Zhu, Wenhui; Liu, Shanshan; Liu, Jia; Zhou, Yan; Lin, Huancai

2018-05-01

Adherence capacity is one of the principal virulence factors of Streptococcus mutans, and adhesion virulence factors are controlled by small RNAs (sRNAs) at the post-transcriptional level in various bacteria. Here, we aimed to identify and decipher putative adhesion-related sRNAs in clinical strains of S. mutans. RNA deep-sequencing was performed to identify potential sRNAs under different adhesion conditions. The expression of sRNAs was analysed by quantitative real-time PCR (qRT-PCR), and bioinformatic methods were used to predict the functional characteristics of sRNAs. A total of 736 differentially expressed candidate sRNAs were predicted, and these included 352 sRNAs located on the antisense to mRNA (AM) and 384 sRNAs in intergenic regions (IGRs). The top 7 differentially expressed sRNAs were successfully validated by qRT-PCR in UA159, and 2 of these were further confirmed in 100 clinical isolates. Moreover, the sequences of two sRNAs were conserved in other Streptococcus species, indicating a conserved role in such closely related species. A good correlation between the expression of sRNAs and the adhesion of 100 clinical strains was observed, which, combined with GO and KEGG, provides a perspective for the comprehension of sRNA function annotation. This study revealed a multitude of novel putative adhesion-related sRNAs in S. mutans and contributed to a better understanding of information concerning the transcriptional regulation of adhesion in S. mutans.
RNA-sequencing analysis reveals abundant developmental stage-specific and immunity-related genes in the pollen beetle Meligethes aeneus.

PubMed

Vogel, H; Badapanda, C; Knorr, E; Vilcinskas, A

2014-02-01

The pollen beetle (Meligethes aeneus) is a major pest of oilseed rape (Brassica napus) and other cruciferous crops in Europe. Pesticide-resistant pollen beetle populations are emerging, increasing the economic impact of this species. We isolated total RNA from the larval and adult stages, the latter either naïve or immunized by injection with bacteria and yeast. High-throughput RNA sequencing (RNA-Seq) was carried out to establish a comprehensive transcriptome catalogue and to screen for developmental stage-specific and immunity-related transcripts. We assembled the transcriptome de novo by combining sequence tags from all developmental stages and treatments. Gene expression data based on normalized read counts revealed several functional gene categories that were differentially expressed between larvae and adults, particularly genes associated with digestion and detoxification that were induced in larvae, and genes associated with reproduction and environmental signalling that were induced in adults. We also identified many genes associated with microbe recognition, immunity-related signalling and defence effectors, such as antimicrobial peptides (AMPs) and lysozymes. Digital gene expression analysis revealed significant differences in the profile of AMPs expressed in larvae, naïve adults and immune-challenged adults, providing insight into the steady-state differences between developmental stages and the complex transcriptional remodelling that occurs following the induction of immunity. Our data provide insight into the adaptive mechanisms used by phytophagous insects and could lead to the development of more effective control strategies for insect pests. © 2013 The Royal Entomological Society.
Recommendations for Accurate Resolution of Gene and Isoform Allele-Specific Expression in RNA-Seq Data

PubMed Central

Wood, David L. A.; Nones, Katia; Steptoe, Anita; Christ, Angelika; Harliwong, Ivon; Newell, Felicity; Bruxner, Timothy J. C.; Miller, David; Cloonan, Nicole; Grimmond, Sean M.

2015-01-01

Genetic variation modulates gene expression transcriptionally or post-transcriptionally, and can profoundly alter an individual’s phenotype. Measuring allelic differential expression at heterozygous loci within an individual, a phenomenon called allele-specific expression (ASE), can assist in identifying such factors. Massively parallel DNA and RNA sequencing and advances in bioinformatic methodologies provide an outstanding opportunity to measure ASE genome-wide. In this study, matched DNA and RNA sequencing, genotyping arrays and computationally phased haplotypes were integrated to comprehensively and conservatively quantify ASE in a single human brain and liver tissue sample. We describe a methodological evaluation and assessment of common bioinformatic steps for ASE quantification, and recommend a robust approach to accurately measure SNP, gene and isoform ASE through the use of personalized haplotype genome alignment, strict alignment quality control and intragenic SNP aggregation. Our results indicate that accurate ASE quantification requires careful bioinformatic analyses and is adversely affected by sample specific alignment confounders and random sampling even at moderate sequence depths. We identified multiple known and several novel ASE genes in liver, including WDR72, DSP and UBD, as well as genes that contained ASE SNPs with imbalance direction discordant with haplotype phase, explainable by annotated transcript structure, suggesting isoform derived ASE. The methods evaluated in this study will be of use to researchers performing highly conservative quantification of ASE, and the genes and isoforms identified as ASE of interest to researchers studying those loci. PMID:25965996
Global characterization of Artemisia annua glandular trichome transcriptome using 454 pyrosequencing

PubMed Central

Wang, Wei; Wang, Yejun; Zhang, Qing; Qi, Yan; Guo, Dianjing

2009-01-01

Background: Glandular trichomes produce a wide variety of commercially important secondary metabolites in many plant species. The most prominent anti-malarial drug artemisinin, a sesquiterpene lactone, is produced in glandular trichomes of Artemisia annua. However, only limited genomic information is currently available in this non-model plant species. Results: We present a global characterization of the A. annua glandular trichome transcriptome using 454 pyrosequencing. Sequencing runs using two normalized cDNA collections from glandular trichomes yielded 406,044 expressed sequence tags (average length = 210 nucleotides), which assembled into 42,678 contigs and 147,699 singletons. Performing a second sequencing run only increased the number of genes identified by ~30%, indicating that massively parallel pyrosequencing provides deep coverage of the A. annua trichome transcriptome. By BLAST search against the NCBI non-redundant protein database, putative functions were assigned to over 28,573 unigenes, including previously undescribed enzymes likely involved in sesquiterpene biosynthesis. Comparison with ESTs derived from trichome collections of other plant species revealed expressed genes in common functional categories across different plant species. RT-PCR analysis confirmed the expression of selected unigenes and novel transcripts in A. annua glandular trichomes. Conclusion: The presence of contigs corresponding to enzymes for terpenoid and flavonoid biosynthesis suggests important metabolic activity in A. annua glandular trichomes. Our comprehensive survey of genes expressed in glandular trichomes will facilitate new gene discovery and shed light on the regulatory mechanism of artemisinin metabolism and trichome function in A. annua. PMID:19818120
Single-cell mRNA sequencing identifies subclonal heterogeneity in anti-cancer drug responses of lung adenocarcinoma cells.

PubMed

Kim, Kyu-Tae; Lee, Hye Won; Lee, Hae-Ock; Kim, Sang Cheol; Seo, Yun Jee; Chung, Woosung; Eum, Hye Hyeon; Nam, Do-Hyun; Kim, Junhyong; Joo, Kyeung Min; Park, Woong-Yang

2015-06-19

Intra-tumoral genetic and functional heterogeneity correlates with cancer clinical prognoses. However, the mechanisms by which intra-tumoral heterogeneity impacts therapeutic outcome remain poorly understood. RNA sequencing (RNA-seq) of single tumor cells can provide comprehensive information about gene expression and single-nucleotide variations in individual tumor cells, which may allow for the translation of heterogeneous tumor cell functional responses into customized anti-cancer treatments. We isolated 34 patient-derived xenograft (PDX) tumor cells from a lung adenocarcinoma patient tumor xenograft. Individual tumor cells were subjected to single cell RNA-seq for gene expression profiling and expressed mutation profiling. Fifty tumor-specific single-nucleotide variations, including KRAS(G12D), were observed to be heterogeneous in individual PDX cells. Semi-supervised clustering, based on KRAS(G12D) mutant expression and a risk score representing expression of 69 lung adenocarcinoma-prognostic genes, classified PDX cells into four groups. PDX cells that survived in vitro anti-cancer drug treatment displayed transcriptome signatures consistent with the group characterized by KRAS(G12D) and low risk score. Single-cell RNA-seq on viable PDX cells identified a candidate tumor cell subgroup associated with anti-cancer drug resistance. Thus, single-cell RNA-seq is a powerful approach for identifying unique tumor cell-specific gene expression profiles which could facilitate the development of optimized clinical anti-cancer strategies.
Seasonal differences in the testicular transcriptome profile of free-living European beavers (Castor fiber L.) determined by the RNA-Seq method

PubMed Central

Paukszto, Łukasz; Jastrzębski, Jan P.; Czerwińska, Joanna; Chojnowska, Katarzyna; Kamińska, Barbara; Kurzyńska, Aleksandra; Smolińska, Nina; Giżejewski, Zygmunt; Kamiński, Tadeusz

2017-01-01

The European beaver (Castor fiber L.) is an important free-living rodent that inhabits Eurasian temperate forests. Beavers are often referred to as ecosystem engineers because they create or change existing habitats, enhance biodiversity and prepare the environment for diverse plant and animal species. Beavers are protected in most European Union countries, but their genomic background remains unknown. In this study, gene expression patterns in beaver testes and the variations in gene expression between breeding and non-breeding seasons were determined by high-throughput transcriptome sequencing. Paired-end sequencing on the Illumina HiSeq 2000 sequencer produced a total of 373.06 million high-quality reads. De novo assembly of contigs yielded 130,741 unigenes with an average length of 1,369.3 nt, an N50 value of 1,734, and an average GC content of 46.51%. A comprehensive analysis of the testicular transcriptome revealed more than 26,000 highly expressed unigenes which exhibited the highest homology with Rattus norvegicus and Ictidomys tridecemlineatus genomes. More than 8,000 highly expressed genes were found to be involved in fundamental biological processes, cellular components or molecular pathways. The study also revealed 42 genes whose regulation differed between breeding and non-breeding seasons. During the non-breeding period, the expression of 37 genes was up-regulated, and the expression of 5 genes was down-regulated relative to the breeding season. The identified genes encode molecules which are involved in signal transduction, DNA repair, stress responses, inflammatory processes, metabolism and steroidogenesis. Our results pave the way for further research into season-dependent variations in beaver testes. PMID:28678806
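The abstract above reports an N50 value for the assembled unigenes. As a hedged illustration of what that statistic measures, the sketch below uses the common definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The function name `n50` and the toy lengths are hypothetical, and the assembler used in the study may compute the statistic slightly differently.

```python
def n50(lengths):
    """N50 of a collection of sequence lengths (common definition)."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half the total
        if running >= half_total:
            return length


# Toy example: total length 54, half is 27; 10 + 9 + 8 reaches 27
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

An N50 larger than the mean length, as reported in the abstract (1,734 versus 1,369.3 nt), is typical when a minority of long sequences accounts for much of the assembled length.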
cGRNB: a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets.

PubMed

Xu, Huayong; Yu, Hui; Tu, Kang; Shi, Qianqian; Wei, Chaochun; Li, Yuan-Yuan; Li, Yi-Xue

2013-01-01

We are witnessing rapid progress in the development of methodologies for building combinatorial gene regulatory networks involving both TFs (Transcription Factors) and miRNAs (microRNAs). A few tools are available for this purpose, but most of them are not easy to use and not accessible online. A web server is especially needed to allow users to upload experimental expression datasets and build combinatorial regulatory networks corresponding to their particular contexts. In this work, we compiled putative TF-gene, miRNA-gene and TF-miRNA regulatory relationships from forward-engineering pipelines and curated them as built-in data libraries. We streamlined the R code of our two separate forward-and-reverse engineering algorithms for combinatorial gene regulatory network construction and formalized them as two major functional modules. As a result, we released the cGRNB (combinatorial Gene Regulatory Networks Builder): a web server for constructing combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. The cGRNB provides two major network-building modules, one for MPGE (miRNA-perturbed gene expression) datasets and the other for parallel miRNA/mRNA expression datasets. A miRNA-centered two-layer combinatorial regulatory cascade is the output of the first module, and a comprehensive genome-wide network involving all three types of combinatorial regulations (TF-gene, TF-miRNA, and miRNA-gene) is the output of the second module. In this article we propose cGRNB, a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. Since parallel miRNA/mRNA expression datasets are rapidly accumulating with the advance of next-generation sequencing techniques, cGRNB will be a very useful tool for researchers to build combinatorial gene regulatory networks based on expression datasets. The cGRNB web server is free and available online at http://www.scbit.org/cgrnb.
Eukaryotic genomes may exhibit up to 10 generic classes of gene promoters.

PubMed

Gagniuc, Paul; Ionescu-Tirgoviste, Constantin

2012-09-28

The main function of gene promoters appears to be the integration of different gene products in their biological pathways in order to maintain homeostasis. Generally, promoters have been classified in two major classes, namely TATA and CpG. Nevertheless, many genes using the same combinatorial formation of transcription factors have different gene expression patterns. Accordingly, we asked ourselves some fundamental questions: Why do certain genes have an overall predisposition for higher gene expression levels than others? What causes such a predisposition? Is there a structural relationship of these sequences in different tissues? Is there a strong phylogenetic relationship between promoters of closely related species? In order to gain valuable insights into different promoter regions, we obtained a series of image-based patterns which allowed us to identify 10 generic classes of promoters. A comprehensive analysis was undertaken for promoter sequences from Arabidopsis thaliana, Drosophila melanogaster, Homo sapiens and Oryza sativa, and a more extensive analysis of tissue-specific promoters in humans. We observed a clear preference for these species to use certain classes of promoters for specific biological processes. Moreover, in humans, we found that different tissues use distinct classes of promoters, reflecting an emerging promoter network. Depending on the tissue type, comparisons made between these classes of promoters reveal a complementarity between their patterns, whereas some other classes of promoters have been observed to occur in competition. Furthermore, we also noticed the existence of some transitional states between these classes of promoters that may explain certain evolutionary mechanisms, which suggest a possible predisposition for specific levels of gene expression and perhaps for a different number of factors responsible for triggering gene expression. Our conclusions are based on comprehensive data from three different databases and a new computer model whose core uses the Kappa index of coincidence. To fully understand the connections between gene promoters and gene expression, we analyzed thousands of promoter sequences using our Kappa Index of Coincidence method and a specialized Optical Character Recognition (OCR) neural network. Under our criteria, 10 classes of promoters were detected. In addition, the existence of "transitional" promoters suggests that there is an evolutionary weighted continuum between classes, depending perhaps upon changes in their gene products.
An optimized protocol for generation and analysis of Ion Proton sequencing reads for RNA-Seq.

PubMed

Yuan, Yongxian; Xu, Huaiqian; Leung, Ross Ka-Kit

2016-05-26

Previous studies compared running cost, time and other performance measures of popular sequencing platforms. However, a comprehensive assessment of library construction and analysis protocols for the Proton sequencing platform has not been carried out. Unlike Illumina sequencing platforms, Proton reads are heterogeneous in length and quality. When sequencing data from different platforms are combined, this can result in reads of varying length. Whether the commonly used software handles such data satisfactorily is unknown. Using universal human reference RNA as the starting material, RNaseIII and chemical fragmentation methods in library construction gave similar results in the number of genes and junctions discovered and in the accuracy of expression-level estimates. In contrast, sequencing quality, read length and the choice of software affected mapping rate to a much larger extent. The unspliced aligner TMAP attained the highest mapping rate (97.27 % to genome, 86.46 % to transcriptome), though 47.83 % of mapped reads were clipped. Long reads could paradoxically reduce mapping in junctions. With a reference annotation as a guide, the mapping rate of TopHat2 increased significantly from 75.79 to 92.09 %, especially for long (>150 bp) reads. Sailfish, a k-mer-based gene expression quantifier, attained results highly consistent with those of the TaqMan array and the highest sensitivity. We provide, for the first time, reference statistics of library preparation methods, gene detection and quantification, and junction discovery for RNA-Seq by the Ion Proton platform. Chemical fragmentation performed as well as the enzyme-based method. The optimal Ion Proton sequencing options and analysis software have been evaluated.
De novo sequencing and analysis of the transcriptome during the browning of fresh-cut Luffa cylindrica 'Fusi-3' fruits.

PubMed

Zhu, Haisheng; Liu, Jianting; Wen, Qingfang; Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng

2017-01-01

Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar 'Fusi-3'. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1-6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism.
Transcriptome analysis of Cymbidium sinense and its application to the identification of genes associated with floral development

PubMed Central

2013-01-01

Background Cymbidium sinense belongs to the Orchidaceae, which is one of the most abundant angiosperm families. C. sinense, a high-grade traditional potted flower, is most prevalent in China and some Southeast Asian countries. The control of flowering time is a major bottleneck in the industrialized development of C. sinense. Little is known about the mechanisms responsible for floral development in this orchid. Moreover, genome references for entire transcriptome sequences do not currently exist for C. sinense. Thus, transcriptome and expression profiling data for this species are needed as an important resource to identify genes and to better understand the biological mechanisms of floral development in C. sinense. Results In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptome analysis assembles gene-related information related to vegetative and reproductive growth of C. sinense. Illumina sequencing generated 54,248,006 high quality reads that were assembled into 83,580 unigenes with an average sequence length of 612 base pairs, including 13,315 clusters and 70,265 singletons. A total of 41,687 (49.88%) unique sequences were annotated, 23,092 of which were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Furthermore, 120 flowering-associated unigenes, 73 MADS-box unigenes and 28 CONSTANS-LIKE (COL) unigenes were identified from our collection. In addition, three digital gene expression (DGE) libraries were constructed for the vegetative phase (VP), floral differentiation phase (FDP) and reproductive phase (RP). The specific expression of many genes in the three development phases was also identified. 
32 genes among three sub-libraries with high differential expression were selected as candidates connected with flower development. Conclusion RNA-seq and DGE profiling data provided comprehensive gene expression information at the transcriptional level that could facilitate our understanding of the molecular mechanisms of floral development at three development phases of C. sinense. This data could be used as an important resource for investigating the genetics of the flowering pathway and various biological mechanisms in this orchid. PMID:23617896
Transcriptome analysis of Cymbidium sinense and its application to the identification of genes associated with floral development.

PubMed

Zhang, Jianxia; Wu, Kunlin; Zeng, Songjun; Teixeira da Silva, Jaime A; Zhao, Xiaolan; Tian, Chang-En; Xia, Haoqiang; Duan, Jun

2013-04-24

Cymbidium sinense belongs to the Orchidaceae, which is one of the most abundant angiosperm families. C. sinense, a high-grade traditional potted flower, is most prevalent in China and some Southeast Asian countries. The control of flowering time is a major bottleneck in the industrialized development of C. sinense. Little is known about the mechanisms responsible for floral development in this orchid. Moreover, genome references for entire transcriptome sequences do not currently exist for C. sinense. Thus, transcriptome and expression profiling data for this species are needed as an important resource to identify genes and to better understand the biological mechanisms of floral development in C. sinense. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptome analysis assembles gene-related information related to vegetative and reproductive growth of C. sinense. Illumina sequencing generated 54,248,006 high quality reads that were assembled into 83,580 unigenes with an average sequence length of 612 base pairs, including 13,315 clusters and 70,265 singletons. A total of 41,687 (49.88%) unique sequences were annotated, 23,092 of which were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Furthermore, 120 flowering-associated unigenes, 73 MADS-box unigenes and 28 CONSTANS-LIKE (COL) unigenes were identified from our collection. In addition, three digital gene expression (DGE) libraries were constructed for the vegetative phase (VP), floral differentiation phase (FDP) and reproductive phase (RP). The specific expression of many genes in the three development phases was also identified. 
32 genes among three sub-libraries with high differential expression were selected as candidates connected with flower development. RNA-seq and DGE profiling data provided comprehensive gene expression information at the transcriptional level that could facilitate our understanding of the molecular mechanisms of floral development at three development phases of C. sinense. This data could be used as an important resource for investigating the genetics of the flowering pathway and various biological mechanisms in this orchid.
Transcriptome analysis of the honey bee fungal pathogen, Ascosphaera apis: implications for host pathogenesis

PubMed Central

2012-01-01

Background We present a comprehensive transcriptome analysis of the fungus Ascosphaera apis, an economically important pathogen of the Western honey bee (Apis mellifera) that causes chalkbrood disease. Our goals were to further annotate the A. apis reference genome and to identify genes that are candidates for being differentially expressed during host infection versus axenic culture. Results We compared A. apis transcriptome sequence from mycelia grown on liquid or solid media with that dissected from host-infected tissue. 454 pyrosequencing provided 252 Mb of filtered sequence reads from both culture types that were assembled into 10,087 contigs. Transcript contigs, protein sequences from multiple fungal species, and ab initio gene predictions were included as evidence sources in the Maker gene prediction pipeline, resulting in 6,992 consensus gene models. A phylogeny based on 12 of these protein-coding loci further supported the taxonomic placement of Ascosphaera as sister to the core Onygenales. Several common protein domains were less abundant in A. apis compared with related ascomycete genomes, particularly cytochrome p450 and protein kinase domains. A novel gene family was identified that has expanded in some ascomycete lineages, but not others. We manually annotated genes with homologs in other fungal genomes that have known relevance to fungal virulence and life history. Functional categories of interest included genes involved in mating-type specification, intracellular signal transduction, and stress response. Computational and manual annotations have been made publicly available on the Bee Pests and Pathogens website. Conclusions This comprehensive transcriptome analysis substantially enhances our understanding of the A. apis genome and its expression during infection of honey bee larvae. It also provides resources for future molecular studies of chalkbrood disease and ultimately improved disease management. PMID:22747707
An anatomically comprehensive atlas of the adult human brain transcriptome

PubMed Central

Guillozet-Bongaarts, Angela L.; Shen, Elaine H.; Ng, Lydia; Miller, Jeremy A.; van de Lagemaat, Louie N.; Smith, Kimberly A.; Ebbert, Amanda; Riley, Zackery L.; Abajian, Chris; Beckmann, Christian F.; Bernard, Amy; Bertagnolli, Darren; Boe, Andrew F.; Cartagena, Preston M.; Chakravarty, M. Mallar; Chapin, Mike; Chong, Jimmy; Dalley, Rachel A.; David Daly, Barry; Dang, Chinh; Datta, Suvro; Dee, Nick; Dolbeare, Tim A.; Faber, Vance; Feng, David; Fowler, David R.; Goldy, Jeff; Gregor, Benjamin W.; Haradon, Zeb; Haynor, David R.; Hohmann, John G.; Horvath, Steve; Howard, Robert E.; Jeromin, Andreas; Jochim, Jayson M.; Kinnunen, Marty; Lau, Christopher; Lazarz, Evan T.; Lee, Changkyu; Lemon, Tracy A.; Li, Ling; Li, Yang; Morris, John A.; Overly, Caroline C.; Parker, Patrick D.; Parry, Sheana E.; Reding, Melissa; Royall, Joshua J.; Schulkin, Jay; Sequeira, Pedro Adolfo; Slaughterbeck, Clifford R.; Smith, Simon C.; Sodt, Andy J.; Sunkin, Susan M.; Swanson, Beryl E.; Vawter, Marquis P.; Williams, Derric; Wohnoutka, Paul; Zielke, H. Ronald; Geschwind, Daniel H.; Hof, Patrick R.; Smith, Stephen M.; Koch, Christof; Grant, Seth G. N.; Jones, Allan R.

2014-01-01

Neuroanatomically precise, genome-wide maps of transcript distributions are critical resources to complement genomic sequence data and to correlate functional and genetic brain architecture. Here we describe the generation and analysis of a transcriptional atlas of the adult human brain, comprising extensive histological analysis and comprehensive microarray profiling of ~900 neuroanatomically precise subdivisions in two individuals. Transcriptional regulation varies enormously by anatomical location, with different regions and their constituent cell types displaying robust molecular signatures that are highly conserved between individuals. Analysis of differential gene expression and gene co-expression relationships demonstrates that brain-wide variation strongly reflects the distributions of major cell classes such as neurons, oligodendrocytes, astrocytes and microglia. Local neighbourhood relationships between fine anatomical subdivisions are associated with discrete neuronal subtypes and genes involved with synaptic transmission. The neocortex displays a relatively homogeneous transcriptional pattern, but with distinct features associated selectively with primary sensorimotor cortices and with enriched frontal lobe expression. Notably, the spatial topography of the neocortex is strongly reflected in its molecular topography— the closer two cortical regions, the more similar their transcriptomes. This freely accessible online data resource forms a high-resolution transcriptional baseline for neurogenetic studies of normal and abnormal human brain function. PMID:22996553
Comprehensive analysis of the dynamic structure of nuclear localization signals.

PubMed

Yamagishi, Ryosuke; Okuyama, Takahide; Oba, Shuntaro; Shimada, Jiro; Chaen, Shigeru; Kaneko, Hiroki

2015-12-01

Most transcription and epigenetic factors in eukaryotic cells have nuclear localization signals (NLSs) and are transported to the nucleus by nuclear transport proteins. Understanding the features of NLSs and the mechanisms of nuclear transport might help understand gene expression regulation, somatic cell reprogramming, thus leading to the treatment of diseases associated with abnormal gene expression. Although many studies analyzed the amino acid sequence of NLSs, few studies investigated their three-dimensional structure. Therefore, we conducted a statistical investigation of the dynamic structure of NLSs by extracting the conformation of these sequences from proteins examined by X-ray crystallography and using a quantity defined as conformational determination rate (a ratio between the number of amino acids determining the conformation and the number of all amino acids included in a certain region). We found that determining the conformation of NLSs is more difficult than determining the conformation of other regions and that NLSs may tend to form more heteropolymers than monomers. Therefore, these findings strongly suggest that NLSs are intrinsically disordered regions.
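The conformational determination rate defined in this abstract is a simple ratio: the number of residues in a region whose conformation was determined, divided by the number of all residues in that region. The sketch below shows the arithmetic only; the function name, the representation of resolved residues as a set of sequence positions, and the example coordinates are assumptions for illustration and do not reproduce the authors' pipeline.

```python
def conformational_determination_rate(resolved_positions, start, end):
    """Fraction of residues in the inclusive region [start, end] that are
    present in `resolved_positions`, i.e. whose conformation was determined.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    region = range(start, end + 1)
    determined = sum(1 for pos in region if pos in resolved_positions)
    return determined / len(region)


# Hypothetical protein: residues 6 and 7 are missing from the crystal structure.
resolved = {1, 2, 3, 4, 5, 8, 9, 10}
nls_rate = conformational_determination_rate(resolved, 4, 9)     # region overlapping the gap
other_rate = conformational_determination_rate(resolved, 1, 5)   # fully resolved region
print(nls_rate, other_rate)
```

Under this definition, a region that is often missing from crystal structures scores a lower rate, which is the kind of signal the abstract interprets as evidence of intrinsic disorder.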
No more non-model species: the promise of next generation sequencing for comparative immunology.

PubMed

Dheilly, Nolwenn M; Adema, Coen; Raftos, David A; Gourbal, Benjamin; Grunau, Christoph; Du Pasquier, Louis

2014-07-01

Next generation sequencing (NGS) allows for the rapid, comprehensive and cost effective analysis of entire genomes and transcriptomes. NGS provides approaches for immune response gene discovery, profiling gene expression over the course of parasitosis, studying mechanisms of diversification of immune receptors and investigating the role of epigenetic mechanisms in regulating immune gene expression and/or diversification. NGS will allow meaningful comparisons to be made between organisms from different taxa in an effort to understand the selection of diverse strategies for host defence under different environmental pathogen pressures. At the same time, it will reveal the shared and unique components of the immunological toolkit and basic functional aspects that are essential for immune defence throughout the living world. In this review, we argue that NGS will revolutionize our understanding of immune responses throughout the animal kingdom because the depth of information it provides will circumvent the need to concentrate on a few "model" species.
Molecular profiling of multiple myeloma: from gene expression analysis to next-generation sequencing.

PubMed

Agnelli, Luca; Tassone, Pierfrancesco; Neri, Antonino

2013-06-01

Multiple myeloma is a fatal malignant proliferation of clonal bone marrow Ig-secreting plasma cells, characterized by wide clinical, biological, and molecular heterogeneity. Herein, global gene and microRNA expression, genome-wide DNA profiling, and next-generation sequencing technology used to investigate the genomic alterations underlying the bio-clinical heterogeneity in multiple myeloma are discussed. High-throughput technologies have undoubtedly allowed a better comprehension of the molecular basis of the disease, a fine stratification, and early identification of high-risk patients, and have provided insights toward targeted therapy studies. However, such technologies are at risk of being affected by laboratory- or cohort-specific biases, and are moreover influenced by a high number of expected false positives. This aspect has a major weight in myeloma, which is characterized by large molecular heterogeneity. Therefore, meta-analysis as well as multiple approaches are desirable, if not mandatory, to validate the results obtained, in line with commonly accepted recommendations for tumor diagnostic/prognostic biomarker studies.
Sensitivity of BRCA1/2 testing in high-risk breast/ovarian/male breast cancer families: little contribution of comprehensive RNA/NGS panel testing.

PubMed

Byers, Helen; Wallis, Yvonne; van Veen, Elke M; Lalloo, Fiona; Reay, Kim; Smith, Philip; Wallace, Andrew J; Bowers, Naomi; Newman, William G; Evans, D Gareth

2016-11-01

The sensitivity of testing BRCA1 and BRCA2 remains unresolved as the frequency of deep intronic splicing variants has not been defined in high-risk familial breast/ovarian cancer families. This variant category is reported at significant frequency in other tumour predisposition genes, including NF1 and MSH2. We carried out comprehensive whole gene RNA analysis on 45 high-risk breast/ovary and male breast cancer families with no identified pathogenic variant on exonic sequencing and copy number analysis of BRCA1/2. In addition, we undertook variant screening of a 10-gene high/moderate risk breast/ovarian cancer panel by next-generation sequencing. DNA testing identified the causative variant in 50/56 (89%) breast/ovarian/male breast cancer families with Manchester scores of ≥50 with two variants being confirmed to affect splicing on RNA analysis. RNA sequencing of BRCA1/BRCA2 on 45 individuals from high-risk families identified no deep intronic variants and did not suggest loss of RNA expression as a cause of lost sensitivity. Panel testing in 42 samples identified a known RAD51D variant, a high-risk ATM variant in another breast ovary family and a truncating CHEK2 mutation. Current exonic sequencing and copy number analysis variant detection methods of BRCA1/2 have high sensitivity in high-risk breast/ovarian cancer families. Sequence analysis of RNA does not identify any variants undetected by current analysis of BRCA1/2. However, RNA analysis clarified the pathogenicity of variants of unknown significance detected by current methods. The low diagnostic uplift achieved through sequence analysis of the other known breast/ovarian cancer susceptibility genes indicates that further high-risk genes remain to be identified.
Genomic identification of regulatory elements by evolutionary sequence comparison and functional analysis.

PubMed

Loots, Gabriela G

2008-01-01

Despite remarkable recent advances in genomics that have enabled us to identify most of the genes in the human genome, comparable efforts to define transcriptional cis-regulatory elements that control gene expression are lagging behind. The difficulty of this task stems from two equally important problems: our knowledge of how regulatory elements are encoded in genomes remains elementary, and there is a vast genomic search space for regulatory elements, since most of each mammalian genome is noncoding. Comparative genomic approaches are having a remarkable impact on the study of transcriptional regulation in eukaryotes and currently represent the most efficient and reliable methods of predicting noncoding sequences likely to control the patterns of gene expression. By subjecting eukaryotic genomic sequences to computational comparisons and subsequent experimentation, we are inching our way toward a more comprehensive catalog of common regulatory motifs that lie behind fundamental biological processes. We are still far from comprehending how the transcriptional regulatory code is encrypted in the human genome and providing an initial global view of regulatory gene networks, but collectively, the continued development of comparative and experimental approaches will rapidly expand our knowledge of the transcriptional regulome.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.

PubMed

Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H

2013-12-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.

Analysis of functional redundancies within the Arabidopsis TCP transcription factor family

PubMed Central

Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.

2013-01-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704
Multi-tissue transcriptomics for construction of a comprehensive gene resource for the terrestrial snail Theba pisana.

PubMed

Zhao, M; Wang, T; Adamson, K J; Storey, K B; Cummins, S F

2016-02-08

The land snail Theba pisana is native to the Mediterranean region but has become one of the most abundant invasive species worldwide. Here, we present three transcriptomes of this agricultural pest derived from three tissues: the central nervous system, hepatopancreas (digestive gland), and foot muscle. Sequencing of the three tissues produced 339,479,092 high quality reads and a global de novo assembly generated a total of 250,848 unique transcripts (unigenes). BLAST analysis mapped 52,590 unigenes to NCBI non-redundant protein databases and further functional analysis annotated 21,849 unigenes with gene ontology. We report that T. pisana transcripts have representatives in all functional classes and a comparison of differentially expressed transcripts amongst all three tissues demonstrates enormous differences in their potential metabolic activities. The differentially expressed genes include those with sequence similarity to genes associated with multiple bacterial diseases and neurological diseases. To provide a valuable resource that will assist functional genomics studies, we have implemented a user-friendly web interface, ThebaDB (http://thebadb.bioinfo-minzhao.org/). This online database allows for complex text queries, sequence searches, and data browsing by enriched functional terms and KEGG mapping.
Genomic approaches for the elucidation of genes and gene networks underlying cardiovascular traits.

PubMed

Adriaens, M E; Bezzina, C R

2018-06-22

Genome-wide association studies have shed light on the association between natural genetic variation and cardiovascular traits. However, linking a cardiovascular trait-associated locus to a candidate gene or set of candidate genes for prioritization for follow-up mechanistic studies is anything but straightforward. Genomic technologies based on next-generation sequencing now offer multiple opportunities to dissect gene regulatory networks underlying genetic cardiovascular trait associations, thereby aiding in the identification of candidate genes at unprecedented scale. RNA sequencing in particular becomes a powerful tool when combined with genotyping to identify loci that modulate transcript abundance, known as expression quantitative trait loci (eQTL), or loci modulating transcript splicing, known as splicing quantitative trait loci (sQTL). Additionally, the allele-specific resolution of RNA-sequencing technology enables estimation of allelic imbalance, a state where the two alleles of a gene are expressed at a ratio differing from the expected 1:1 ratio. When multiple high-throughput approaches are combined with deep phenotyping in a single study, a comprehensive elucidation of the relationship between genotype and phenotype comes into view, an approach known as systems genetics. In this review, we cover key applications of systems genetics in the broad cardiovascular field.
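Allelic imbalance, as described in this abstract, is a departure of the two alleles' expression from the expected 1:1 ratio. One simple way to check for it at a single heterozygous site is an exact two-sided binomial test of the reference-allele read count against a probability of 0.5. The sketch below is an assumption-laden illustration, not a method attributed to the review: the function names and read counts are hypothetical, and real analyses usually also correct for reference mapping bias and overdispersion.

```python
from math import comb


def reference_allele_fraction(ref_reads, alt_reads):
    """Share of reads carrying the reference allele (0.5 means perfect balance)."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


def allelic_imbalance_pvalue(ref_reads, alt_reads):
    """Exact two-sided binomial p-value for the null of a 1:1 allelic ratio.

    Sums the probabilities of all outcomes that are no more likely than the
    observed reference count under Binomial(n, 0.5). Intended for modest
    read depths, since it enumerates every possible count.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    pmf = [comb(n, k) / 2 ** n for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so ties are not lost to floating-point rounding.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical read counts at two heterozygous sites.
print(reference_allele_fraction(50, 50), allelic_imbalance_pvalue(50, 50))
print(reference_allele_fraction(90, 10), allelic_imbalance_pvalue(90, 10))
```

A balanced site gives a p-value near 1, whereas a strongly skewed site such as 90 versus 10 reads gives a very small p-value, flagging a candidate for allele-specific expression.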
Mapping cis- and trans-regulatory effects across multiple tissues in twins

PubMed Central

Grundberg, Elin; Small, Kerrin S.; Hedman, Åsa K.; Nica, Alexandra C.; Buil, Alfonso; Keildson, Sarah; Bell, Jordana T.; Yang, Tsun-Po; Meduri, Eshwar; Barrett, Amy; Nisbett, James; Sekowska, Magdalena; Wilk, Alicja; Shin, So-Youn; Glass, Daniel; Travers, Mary; Min, Josine L.; Ring, Sue; Ho, Karen; Thorleifsson, Gudmar; Kong, Augustine; Thorsteindottir, Unnur; Ainali, Chrysanthi; Dimas, Antigone S.; Hassanali, Neelam; Ingle, Catherine; Knowles, David; Krestyaninova, Maria; Lowe, Christopher E.; Di Meglio, Paola; Montgomery, Stephen B.; Parts, Leopold; Potter, Simon; Surdulescu, Gabriela; Tsaprouni, Loukia; Tsoka, Sophia; Bataille, Veronique; Durbin, Richard; Nestle, Frank O.; O’Rahilly, Stephen; Soranzo, Nicole; Lindgren, Cecilia M.; Zondervan, Krina T.; Ahmadi, Kourosh R.; Schadt, Eric E.; Stefansson, Kari; Smith, George Davey; McCarthy, Mark I.; Deloukas, Panos; Dermitzakis, Emmanouil T.; Spector, Tim D.

2013-01-01

Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many eQTL studies typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis-effect on expression cannot be accounted for by common cis-variants, a finding which exposes the contribution of low frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene and identify several replicating trans-variants which act predominantly in a tissue-restricted manner and may regulate the transcription of many genes. PMID:22941192
Divergent and nonuniform gene expression patterns in mouse brain

PubMed Central

Morris, John A.; Royall, Joshua J.; Bertagnolli, Darren; Boe, Andrew F.; Burnell, Josh J.; Byrnes, Emi J.; Copeland, Cathy; Desta, Tsega; Fischer, Shanna R.; Goldy, Jeff; Glattfelder, Katie J.; Kidney, Jolene M.; Lemon, Tracy; Orta, Geralyn J.; Parry, Sheana E.; Pathak, Sayan D.; Pearson, Owen C.; Reding, Melissa; Shapouri, Sheila; Smith, Kimberly A.; Soden, Chad; Solan, Beth M.; Weller, John; Takahashi, Joseph S.; Overly, Caroline C.; Lein, Ed S.; Hawrylycz, Michael J.; Hohmann, John G.; Jones, Allan R.

2010-01-01

Considerable progress has been made in understanding variations in gene sequence and expression level associated with phenotype, yet how genetic diversity translates into complex phenotypic differences remains poorly understood. Here, we examine the relationship between genetic background and spatial patterns of gene expression across seven strains of mice, providing the most extensive cellular-resolution comparative analysis of gene expression in the mammalian brain to date. Using comprehensive brainwide anatomic coverage (more than 200 brain regions), we applied in situ hybridization to analyze the spatial expression patterns of 49 genes encoding well-known pharmaceutical drug targets. Remarkably, over 50% of the genes examined showed interstrain expression variation. In addition, the variability was nonuniformly distributed across strain and neuroanatomic region, suggesting certain organizing principles. First, the degree of expression variance among strains mirrors genealogic relationships. Second, expression pattern differences were concentrated in higher-order brain regions such as the cortex and hippocampus. Divergence in gene expression patterns across the brain could contribute significantly to variations in behavior and responses to neuroactive drugs in laboratory mouse strains and may help to explain individual differences in human responsiveness to neuroactive drugs. PMID:20956311
Transcriptome and Gene Expression Analysis of the Rice Leaf Folder, Cnaphalocrocis medinalis

PubMed Central

Li, Shang-Wei; Yang, Hong; Liu, Yue-Feng; Liao, Qi-Rong; Du, Juan; Jin, Dao-Chao

2012-01-01

Background: The rice leaf folder (RLF), Cnaphalocrocis medinalis (Guenee) (Lepidoptera: Pyralidae), is one of the most destructive pests affecting rice in Asia. Although several studies have been performed on the ecological and physiological aspects of this species, the molecular mechanisms underlying its developmental regulation, behavior, and insecticide resistance remain largely unknown. Presently, there is a lack of genomic information for RLF; therefore, studies aimed at profiling the RLF transcriptome expression would provide a better understanding of its biological function at the molecular level. Principal Findings: De novo assembly of the RLF transcriptome was performed via the short read sequencing technology (Illumina). In a single run, we produced more than 23 million sequencing reads that were assembled into 44,941 unigenes (mean size = 474 bp) by Trinity. Through a similarity search, 25,281 (56.82%) unigenes matched known proteins in the NCBI Nr protein database. The transcriptome sequences were annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). Additionally, we profiled gene expression during RLF development using a tag-based digital gene expression (DGE) system. Five DGE libraries were constructed, and variations in gene expression were compared between collected samples: eggs vs. 3rd instar larvae, 3rd instar larvae vs. pupae, pupae vs. adults. The results demonstrated that thousands of genes were significantly differentially expressed during various developmental stages. A number of the differentially expressed genes were confirmed by quantitative real-time PCR (qRT-PCR).
Conclusions: The RLF transcriptome and DGE data provide a comprehensive and global gene expression profile that would further promote our understanding of the molecular mechanisms underlying various biological characteristics, including development, elevated fecundity, flight, sex differentiation, olfactory behavior, and insecticide resistance in RLF. Therefore, these findings could help elucidate the intrinsic factors involved in the RLF-mediated destruction of rice and offer sustainable insect pest management. PMID:23185238
Writing DNA with GenoCAD.

PubMed

Czar, Michael J; Cai, Yizhi; Peccoud, Jean

2009-07-01

Chemical synthesis of custom DNA made to order calls for software streamlining the design of synthetic DNA sequences. GenoCAD (www.genocad.org) is a free web-based application to design protein expression vectors, artificial gene networks and other genetic constructs composed of multiple functional blocks called genetic parts. By capturing design strategies in grammatical models of DNA sequences, GenoCAD guides the user through the design process. By successively clicking on icons representing structural features or actual genetic parts, complex constructs composed of dozens of functional blocks can be designed in a matter of minutes. GenoCAD automatically derives the construct sequence from its comprehensive libraries of genetic parts. Upon completion of the design process, users can download the sequence for synthesis or further analysis. Users who elect to create a personal account on the system can customize their workspace by creating their own parts libraries, adding new parts to the libraries, or reusing designs to quickly generate sets of related constructs.
Effects of 4-chlorophenol wastewater treatment on sludge acute toxicity, microbial diversity and functional genes expression in an activated sludge process.

PubMed

Zhao, Jianguo; Li, Yahe; Li, Yu; Yu, Zeya; Chen, Xiurong

2018-05-31

In this study, the effects of 4-chlorophenol (4-CP) wastewater treatment on sludge acute toxicity to luminescent bacteria, microbial diversity and functional gene expression of Pseudomonas were explored. Results showed that throughout the operational process, the acute toxicity of sludge acclimated to 4-CP in a sequencing batch bioreactor (SBR) was significantly higher than that in the control SBR without 4-CP. The dominant phyla in the acclimated SBR were Proteobacteria and Firmicutes, which also existed in the control SBR. Some identified genera in the acclimated SBR were responsible for 4-CP degradation. At the stable operational stages, functional gene expression of Pseudomonas in the acclimated SBR was down-regulated at the end of the SBR cycle, and the underlying expression mechanisms require further research. This study provides theoretical support for a comprehensive understanding of sludge performance in industrial wastewater treatment. Copyright © 2018 Elsevier Ltd. All rights reserved.
ISOL@: an Italian SOLAnaceae genomics resource.

PubMed

Chiusano, Maria Luisa; D'Agostino, Nunzio; Traini, Alessandra; Licciardello, Concetta; Raimondo, Enrico; Aversano, Mario; Frusciante, Luigi; Monti, Luigi

2008-03-26

Present-day '-omics' technologies produce overwhelming amounts of data which include genome sequences, information on gene expression (transcripts and proteins) and on cell metabolic status. These data represent multiple aspects of a biological system and need to be investigated as a whole to shed light on the mechanisms which underpin the system functionality. The gathering and convergence of data generated by high-throughput technologies, the effective integration of different data-sources and the analysis of the information content based on comparative approaches are key methods for meaningful biological interpretations. In the frame of the International Solanaceae Genome Project, we propose here ISOLA, an Italian SOLAnaceae genomics resource. ISOLA (available at http://biosrv.cab.unina.it/isola) represents a trial platform and it is conceived as a multi-level computational environment. ISOLA currently consists of two main levels: the genome and the expression level. The cornerstone of the genome level is represented by the Solanum lycopersicum genome draft sequences generated by the International Tomato Genome Sequencing Consortium. Instead, the basic element of the expression level is the transcriptome information from different Solanaceae species, mainly in the form of species-specific comprehensive collections of Expressed Sequence Tags (ESTs). The cross-talk between the genome and the expression levels is based on data source sharing and on tools that enhance data quality, that extract information content from the levels' under parts and produce value-added biological knowledge. ISOLA is the result of a bioinformatics effort that addresses the challenges of the post-genomics era. It is designed to exploit '-omics' data based on effective integration to acquire biological knowledge and to approach a systems biology view.
Beyond providing experimental biologists with a preliminary annotation of the tomato genome, this effort aims to produce a trial computational environment where different aspects and details are maintained as they are relevant for the analysis of the organization, the functionality and the evolution of the Solanaceae family.
Sequence and Expression Analyses of Ethylene Response Factors Highly Expressed in Latex Cells from Hevea brasiliensis

PubMed Central

Piyatrakul, Piyanuch; Yang, Meng; Putranto, Riza-Arief; Pirrello, Julien; Dessailly, Florence; Hu, Songnian; Summo, Marilyne; Theeravatanasuk, Kannikar; Leclercq, Julie; Kuswanhadi; Montoro, Pascal

2014-01-01

The AP2/ERF superfamily encodes transcription factors that play a key role in plant development and responses to abiotic and biotic stress. In Hevea brasiliensis, ERF genes have been identified by RNA sequencing. This study set out to validate the number of HbERF genes, and identify ERF genes involved in the regulation of latex cell metabolism. A comprehensive Hevea transcriptome was improved using additional RNA reads from reproductive tissues. Newly assembled contigs were annotated in the Gene Ontology database and were assigned to 3 main categories. The AP2/ERF superfamily is the third most represented compared with other transcription factor families. A comparison with genomic scaffolds led to an estimation of 114 AP2/ERF genes and 1 soloist in Hevea brasiliensis. Based on a phylogenetic analysis, functions were predicted for 26 HbERF genes. A relative transcript abundance analysis was performed by real-time RT-PCR in various tissues. Transcripts of ERFs from group I and VIII were very abundant in all tissues while those of group VII were highly accumulated in latex cells. Seven of the thirty-five ERF expression marker genes were highly expressed in latex. Subcellular localization and transactivation analyses suggested that HbERF-VII candidate genes encoded functional transcription factors. PMID:24971876
Methods, Tools and Current Perspectives in Proteogenomics

PubMed Central

Ruggles, Kelly V.; Krug, Karsten; Wang, Xiaojing; Clauser, Karl R.; Wang, Jing; Payne, Samuel H.; Fenyö, David; Zhang, Bing; Mani, D. R.

2017-01-01

With combined technological advancements in high-throughput next-generation sequencing and deep mass spectrometry-based proteomics, proteogenomics, i.e. the integrative analysis of proteomic and genomic data, has emerged as a new research field. Early efforts in the field were focused on improving protein identification using sample-specific genomic and transcriptomic sequencing data. More recently, integrative analysis of quantitative measurements from genomic and proteomic studies has yielded novel insights into gene expression regulation, cell signaling, and disease. Many methods and tools have been developed or adapted to enable an array of integrative proteogenomic approaches, and in this article we systematically classify published methods and tools into four major categories: (1) Sequence-centric proteogenomics; (2) Analysis of proteogenomic relationships; (3) Integrative modeling of proteogenomic data; and (4) Data sharing and visualization. We provide a comprehensive review of methods and available tools in each category and highlight their typical applications. PMID:28456751
Expressed MHC class II genes in sea otters (Enhydra lutris) from geographically disparate populations

USGS Publications Warehouse

Bowen, Lizabeth; Aldridge, B.M.; Miles, A. Keith; Stott, J.L.

2006-01-01

The major histocompatibility complex (MHC) is central to maintaining the immunologic vigor of individuals and populations. Classical MHC class II genes were targeted for partial sequencing in sea otters (Enhydra lutris) from populations in California, Washington, and Alaska. Sequences derived from sea otter peripheral blood leukocyte mRNAs were similar to those classified as DQA, DQB, DRA, and DRB in other species. Comparisons of the derived amino acid compositions supported the classification of these as functional molecules from at least one DQA, DQB, and DRA locus and at least two DRB loci. While limited in scope, phylogenetic analysis of the DRB peptide‐binding region suggested the possible existence of distinct clades demarcated by geographic region. These preliminary findings support the need for additional MHC gene sequencing and expansion to a comprehensive study targeting additional otters.
Comprehensive Interrogation of Natural TALE DNA Binding Modules and Transcriptional Repressor Domains

PubMed Central

Cong, Le; Zhou, Ruhong; Kuo, Yu-chi; Cunniff, Margaret; Zhang, Feng

2012-01-01

Transcription activator-like effectors (TALE) are sequence-specific DNA binding proteins that harbor modular, repetitive DNA binding domains. TALEs have enabled the creation of customizable designer transcriptional factors and sequence-specific nucleases for genome engineering. Here we report two improvements of the TALE toolbox for achieving efficient activation and repression of endogenous gene expression in mammalian cells. We show that the naturally occurring repeat variable diresidue (RVD) Asn-His (NH) has high biological activity and specificity for guanine, a highly prevalent base in mammalian genomes. We also report an effective TALE transcriptional repressor architecture for targeted inhibition of transcription in mammalian cells. These findings will improve the precision and effectiveness of genome engineering that can be achieved using TALEs. PMID:22828628
Genome-wide analysis of the Solanum tuberosum (potato) trehalose-6-phosphate synthase (TPS) gene family: evolution and differential expression during development and stress.

PubMed

Xu, Yingchun; Wang, Yanjie; Mattson, Neil; Yang, Liu; Jin, Qijiang

2017-12-01

Trehalose-6-phosphate synthase (TPS) serves important functions in plant desiccation tolerance and response to environmental stimuli. At present, a comprehensive analysis of this gene family, i.e. its functional classification, molecular evolution, and expression patterns, is still lacking in Solanum tuberosum (potato). In this study, a comprehensive analysis of the TPS gene family was conducted in potato. A total of eight putative potato TPS genes (StTPSs) were identified by searching the latest potato genome sequence. The amino acid identity among the eight StTPSs varied from 59.91 to 89.54%. Analysis of dN/dS ratios suggested that regions in the TPP (trehalose-6-phosphate phosphatase) domains evolved faster than the TPS domains. Although the sequence of the eight StTPSs showed high similarity (2571-2796 bp), their gene length is highly differentiated (3189-8406 bp). Many regulatory elements possibly related to phytohormones, abiotic stress and development were identified in different TPS genes. Based on the phylogenetic tree constructed using TPS genes of potato and four other Solanaceae plants, TPS genes could be categorized into 6 distinct groups. Analysis revealed that purifying selection most likely played a major role during the evolution of this family. Amino acid changes detected in specific branches of the phylogenetic tree suggest that relaxed constraints might have contributed to functional divergence among groups. Moreover, StTPSs were found to exhibit tissue- and treatment-specific expression patterns upon analysis of transcriptome data and qRT-PCR. This study provides a reference for genome-wide identification of the potato TPS gene family and sets a framework for further functional studies of this important gene family in development and stress response.
An expanded maize gene expression atlas based on RNA sequencing and its use to explore root development

DOE PAGES

Stelpflug, Scott C.; Sekhon, Rajandeep S.; Vaillancourt, Brieanne; ...

2015-12-30

Comprehensive and systematic transcriptome profiling provides valuable insight into biological and developmental processes that occur throughout the life cycle of a plant. We have enhanced our previously published microarray-based gene atlas of maize (Zea mays L.) inbred B73 to now include 79 distinct replicated samples that have been interrogated using RNA sequencing (RNA-seq). The current version of the atlas includes 50 original array-based gene atlas samples, a time-course of 12 stalk and leaf samples postflowering, and an additional set of 17 samples from the maize seedling and adult root system. The entire dataset contains 4.6 billion mapped reads, with an average of 20.5 million mapped reads per biological replicate, allowing for detection of genes with lower transcript abundance. As the new root samples represent key additions to the previously examined tissues, we highlight insights into the root transcriptome, which is represented by 28,894 (73.2%) annotated genes in maize. Additionally, we observed remarkable expression differences across both the longitudinal (four zones) and radial gradients (cortical parenchyma and stele) of the primary root supported by fourfold differential expression of 9353 and 4728 genes, respectively. Among the latter were 1110 genes that encode transcription factors, some of which are orthologs of previously characterized transcription factors known to regulate root development in Arabidopsis thaliana (L.) Heynh., while most are novel, and represent attractive targets for reverse genetics approaches to determine their roles in this important organ. As a result, this comprehensive transcriptome dataset is a powerful tool toward understanding maize development, physiology, and phenotypic diversity.
iFeature: a python package and web server for features extraction and selection from protein and peptide sequences.

PubMed

Chen, Zhen; Zhao, Pei; Li, Fuyi; Leier, André; Marquez-Lago, Tatiana T; Wang, Yanan; Webb, Geoffrey I; Smith, A Ian; Daly, Roger J; Chou, Kuo-Chen; Song, Jiangning

2018-03-08

Structural and physicochemical descriptors extracted from sequence data have been widely used to represent sequences and predict structural, functional, expression and interaction profiles of proteins and peptides as well as DNAs/RNAs. Here, we present iFeature, a versatile Python-based toolkit for generating various numerical feature representation schemes for both protein and peptide sequences. iFeature is capable of calculating and extracting a comprehensive spectrum of 18 major sequence encoding schemes that encompass 53 different types of feature descriptors. It also allows users to extract specific amino acid properties from the AAindex database. Furthermore, iFeature integrates 12 different types of commonly used feature clustering, selection, and dimensionality reduction algorithms, greatly facilitating training, analysis, and benchmarking of machine-learning models. The functionality of iFeature is made freely available via an online web server and a stand-alone toolkit. Availability: http://iFeature.erc.monash.edu/; https://github.com/Superzchen/iFeature/. Contact: jiangning.song@monash.edu; kcchou@gordonlifescience.org; roger.daly@monash.edu. Supplementary data are available at Bioinformatics online.
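To make the notion of a numerical feature descriptor concrete, the sketch below computes amino acid composition, one of the simplest and most widely used sequence encodings: the fraction of each of the 20 standard residues in a protein or peptide. It is an independent, minimal illustration and does not use or reproduce the iFeature API; the function name and the example peptide are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every sequence
# maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the 20-dimensional amino acid composition of a sequence.

    Characters outside the 20 standard residues (gaps, X, etc.) are
    ignored when computing the fractions.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical peptide used only to show the output layout.
vector = amino_acid_composition("ACCA")
print(dict(zip(AMINO_ACIDS, vector)))
```

Because every sequence is mapped to a fixed-length vector regardless of its length, the output can be fed directly to standard clustering, feature selection, or machine-learning routines.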
Investigation of Experimental Factors That Underlie BRCA1/2 mRNA Isoform Expression Variation: Recommendations for Utilizing Targeted RNA Sequencing to Evaluate Potential Spliceogenic Variants

PubMed Central

Lattimore, Vanessa L.; Pearson, John F.; Currie, Margaret J.; Spurdle, Amanda B.; Robinson, Bridget A.; Walker, Logan C.

2018-01-01

PCR-based RNA splicing assays are commonly used in diagnostic and research settings to assess the potential effects of variants of uncertain clinical significance in BRCA1 and BRCA2. The Evidence-based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium completed a multicentre investigation to evaluate differences in assay design and the integrity of published data, raising a number of methodological questions associated with cell culture conditions and PCR-based protocols. We utilized targeted RNA-seq to re-assess BRCA1 and BRCA2 mRNA isoform expression patterns in lymphoblastoid cell lines (LCLs) previously used in the multicentre ENIGMA study. Capture of the targeted cDNA sequences was carried out using 34 BRCA1 and 28 BRCA2 oligonucleotides from the Illumina TruSeq Targeted RNA Expression platform. Our results show that targeted RNA-seq analysis of LCLs overcomes many of the methodological limitations associated with PCR-based assays, leading us to make the following observations and recommendations: (1) technical replicates (n > 2) of variant carriers should be used to capture methodology-induced variability associated with RNA-seq assays, (2) LCLs can undergo multiple freeze/thaw cycles and can be cultured up to 2 weeks without noticeably influencing isoform expression levels, (3) nonsense-mediated decay inhibitors are essential prior to splicing assays for comprehensive mRNA isoform detection, (4) quantitative assessment of exon:exon junction levels across BRCA1 and BRCA2 can help distinguish between normal and aberrant isoform expression patterns. Experimentally derived recommendations from this study will facilitate the application of targeted RNA-seq platforms for the quantitation of BRCA1 and BRCA2 mRNA aberrations associated with sequence variants of uncertain clinical significance. PMID:29774201
Transcriptome sequencing of the Antarctic vascular plant Deschampsia antarctica Desv. under abiotic stress.

PubMed

Lee, Jungeun; Noh, Eun Kyeung; Choi, Hyung-Seok; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok

2013-03-01

Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been studied as an extremophile that has successfully adapted to marginal land with the harshest environment for terrestrial plants. However, limited genetic research has focused on this species due to the lack of genomic resources. Here, we present the first de novo assembly of its transcriptome by massive parallel sequencing and its expression profile using D. antarctica grown under various stress conditions. Total sequence reads generated by pyrosequencing were assembled into 60,765 unigenes (28,177 contigs and 32,588 singletons). A total of 29,173 unique protein-coding genes were identified based on sequence similarities to known proteins. The combined results from all three stress conditions indicated differential expression of 3,110 genes. Quantitative reverse transcription polymerase chain reaction showed that several well-known stress-responsive genes encoding late embryogenesis abundant protein, dehydrin 1, and ice recrystallization inhibition protein were induced dramatically and that genes encoding U-box-domain-containing protein, electron transfer flavoprotein-ubiquinone, and F-box-containing protein were induced by abiotic stressors in a manner conserved with other plant species. We identified more than 2,000 simple sequence repeats that can be developed as functional molecular markers. This dataset is the most comprehensive transcriptome resource currently available for D. antarctica and is therefore expected to be an important foundation for future genetic studies of grasses and extremophiles.
ARMOUR - A Rice miRNA: mRNA Interaction Resource.

PubMed

Sanan-Mishra, Neeti; Tripathi, Anita; Goswami, Kavita; Shukla, Rohit N; Vasudevan, Madavan; Goswami, Hitesh

2018-01-01

ARMOUR was developed as A Rice miRNA:mRNA interaction resource. This informative and interactive database includes the experimentally validated expression profiles of miRNAs under different developmental and abiotic stress conditions across seven Indian rice cultivars. This comprehensive database covers 689 known and 1664 predicted novel miRNAs and their expression profiles in more than 38 different tissues or conditions along with their predicted/known target transcripts. The understanding of miRNA:mRNA interactome in regulation of functional cellular machinery is supported by the sequence information of the mature and hairpin structures. ARMOUR provides flexibility to users in querying the database using multiple ways like known gene identifiers, gene ontology identifiers, KEGG identifiers and also allows on the fly fold change analysis and sequence search query with inbuilt BLAST algorithm. ARMOUR database provides a cohesive platform for novel and mature miRNAs and their expression in different experimental conditions and allows searching for their interacting mRNA targets, GO annotation and their involvement in various biological pathways. The ARMOUR database includes a provision for adding more experimental data from users, with an aim to develop it as a platform for sharing and comparing experimental data contributed by research groups working on rice.
WHOLE-GENOME SEQUENCING OF SALIVARY GLAND ADENOID CYSTIC CARCINOMA

PubMed Central

Rettig, Eleni M; Talbot, C Conover; Sausen, Mark; Jones, Sian; Bishop, Justin A; Wood, Laura D; Tokheim, Collin; Niknafs, Noushin; Karchin, Rachel; Fertig, Elana J; Wheelan, Sarah J; Marchionni, Luigi; Considine, Michael; Ling, Shizhang; Fakhry, Carole; Papadopoulos, Nickolas; Kinzler, Kenneth W; Vogelstein, Bert; Ha, Patrick K; Agrawal, Nishant

2016-01-01

Adenoid cystic carcinomas (ACCs) of the salivary glands are challenging to understand, treat, and cure. To better understand the genetic alterations underlying the pathogenesis of these tumors, we performed comprehensive genome analyses of 25 fresh-frozen tumors, including whole genome sequencing, expression and pathway analyses. In addition to the well-described MYB-NFIB fusion which was found in 11 tumors (44%), we observed five different rearrangements involving the NFIB transcription factor gene in seven tumors (28%). Taken together, NFIB translocations occurred in 15 of 25 samples (60%, 95%CI=41–77%). In addition, mRNA expression analysis of 17 tumors revealed overexpression of NFIB in ACC tumors compared with normal tissues (p=0.002). There was no difference in NFIB mRNA expression in tumors with NFIB fusions compared to those without. We also report somatic mutations of genes involved in the axonal guidance and Rho family signaling pathways. Finally, we confirm previously described alterations in genes related to chromatin regulation and Notch signaling. Our findings suggest a separate role for NFIB in ACC oncogenesis and highlight important signaling pathways for future functional characterization and potential therapeutic targeting. PMID:26862087
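As a worked check on the reported proportion (15 of 25 samples, 60%, 95% CI 41–77%), the following minimal sketch computes a Wilson score interval. The abstract does not state which interval method the authors used; the Wilson interval is an assumption chosen here because it gives bounds close to the reported ones, and the function name `wilson_interval` is illustrative.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width

# 15 of 25 tumors carried an NFIB translocation (figures from the abstract).
low, high = wilson_interval(15, 25)
print(f"{15 / 25:.0%} (95% CI {low:.0%}-{high:.0%})")
```

With only 25 samples the interval is wide, which is why the abstract reports the range alongside the point estimate.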
TP53 Mutation Status of Tubo-ovarian and Peritoneal High-grade Serous Carcinoma with a Wild-type p53 Immunostaining Pattern.

PubMed

Na, Kiyong; Sung, Ji-Youn; Kim, Hyun-Soo

2017-12-01

Diffuse and strong nuclear p53 immunoreactivity and a complete lack of p53 expression are regarded as indicative of missense and nonsense mutations, respectively, of the TP53 gene. Tubo-ovarian and peritoneal high-grade serous carcinoma (HGSC) is characterized by aberrant p53 expression induced by a TP53 mutation. However, our experience with some HGSC cases with a wild-type p53 immunostaining pattern led us to comprehensively review previous cases and investigate the TP53 mutational status of the exceptional cases. We analyzed the immunophenotype of 153 cases of HGSC and performed TP53 gene sequencing analysis in those with a wild-type p53 immunostaining pattern. Immunostaining revealed that 109 (71.3%) cases displayed diffuse and strong p53 expression (missense mutation pattern), while 39 (25.5%) had no p53 expression (nonsense mutation pattern). The remaining five cases of HGSC showed a wild-type p53 immunostaining pattern. Direct sequencing analysis revealed that three of these cases harbored nonsense TP53 mutations and two had novel splice site deletions. TP53 mutation is almost invariably present in HGSC, and p53 immunostaining can be used as a surrogate marker of TP53 mutation. In cases with a wild-type p53 immunostaining pattern, direct sequencing for TP53 mutational status can be helpful to confirm the presence of a TP53 mutation. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
Deep insight into the Ganoderma lucidum by comprehensive analysis of its transcriptome.

PubMed

Yu, Guo-Jun; Wang, Man; Huang, Jie; Yin, Ya-Lin; Chen, Yi-Jie; Jiang, Shuai; Jin, Yan-Xia; Lan, Xian-Qing; Wong, Barry Hon Cheung; Liang, Yi; Sun, Hui

2012-01-01

Ganoderma lucidum is a basidiomycete white rot fungus and is of medicinal importance in China, Japan and other countries in the Asiatic region. To date, much research has been performed in identifying the medicinal ingredients in Ganoderma lucidum. Despite its important therapeutic effects in disease, little is known about Ganoderma lucidum at the genomic level. In order to gain a molecular understanding of this fungus, we utilized Illumina high-throughput technology to sequence and analyze the transcriptome of Ganoderma lucidum. We obtained 6,439,690 and 6,416,670 high-quality reads from the mycelium and fruiting body of Ganoderma lucidum, and these were assembled to form 18,892 and 27,408 unigenes, respectively. A similarity search was performed against the NCBI non-redundant nucleotide database and a customized database composed of five fungal genomes. 11,098 and 8,775 unigenes were matched to the NCBI non-redundant nucleotide database and our customized database, respectively. All unigenes were subjected to annotation by Gene Ontology, Eukaryotic Orthologous Group terms and Kyoto Encyclopedia of Genes and Genomes. Differentially expressed genes from the Ganoderma lucidum mycelium and fruiting body stage were analyzed, resulting in the identification of 13 unigenes which are involved in the terpenoid backbone biosynthesis pathway. Quantitative real-time PCR was used to confirm the expression levels of these unigenes. Ganoderma lucidum was also studied for wood degrading activity and a total of 22 putative FOLymes (fungal oxidative lignin enzymes) and 120 CAZymes (carbohydrate-active enzymes) were predicted from our Ganoderma lucidum transcriptome. Our study provides comprehensive gene expression information on Ganoderma lucidum at the transcriptional level, which will form the foundation for functional genomics studies in this fungus. The use of Illumina sequencing technology has made de novo transcriptome assembly and gene expression analysis possible in species that lack full genome information.
Deep Insight into the Ganoderma lucidum by Comprehensive Analysis of Its Transcriptome

PubMed Central

Yu, Guo-Jun; Wang, Man; Huang, Jie; Yin, Ya-Lin; Chen, Yi-Jie; Jiang, Shuai; Jin, Yan-Xia; Lan, Xian-Qing; Wong, Barry Hon Cheung; Liang, Yi; Sun, Hui

2012-01-01

Background Ganoderma lucidum is a basidiomycete white rot fungus and is of medicinal importance in China, Japan and other countries in the Asiatic region. To date, much research has been performed in identifying the medicinal ingredients in Ganoderma lucidum. Despite its important therapeutic effects in disease, little is known about Ganoderma lucidum at the genomic level. In order to gain a molecular understanding of this fungus, we utilized Illumina high-throughput technology to sequence and analyze the transcriptome of Ganoderma lucidum. Methodology/Principal Findings We obtained 6,439,690 and 6,416,670 high-quality reads from the mycelium and fruiting body of Ganoderma lucidum, and these were assembled to form 18,892 and 27,408 unigenes, respectively. A similarity search was performed against the NCBI non-redundant nucleotide database and a customized database composed of five fungal genomes. 11,098 and 8,775 unigenes were matched to the NCBI non-redundant nucleotide database and our customized database, respectively. All unigenes were subjected to annotation by Gene Ontology, Eukaryotic Orthologous Group terms and Kyoto Encyclopedia of Genes and Genomes. Differentially expressed genes from the Ganoderma lucidum mycelium and fruiting body stage were analyzed, resulting in the identification of 13 unigenes which are involved in the terpenoid backbone biosynthesis pathway. Quantitative real-time PCR was used to confirm the expression levels of these unigenes. Ganoderma lucidum was also studied for wood degrading activity and a total of 22 putative FOLymes (fungal oxidative lignin enzymes) and 120 CAZymes (carbohydrate-active enzymes) were predicted from our Ganoderma lucidum transcriptome. Conclusions Our study provides comprehensive gene expression information on Ganoderma lucidum at the transcriptional level, which will form the foundation for functional genomics studies in this fungus. The use of Illumina sequencing technology has made de novo transcriptome assembly and gene expression analysis possible in species that lack full genome information. PMID:22952861
Screening differentially expressed genes in an amphipod (Hyalella azteca) exposed to fungicide vinclozolin by suppression subtractive hybridization.

PubMed

Wu, Yun H; Wu, Tsung M; Hong, Chwan Y; Wang, Yei S; Yen, Jui H

2014-01-01

Vinclozolin, a dicarboximide fungicide, is an endocrine disrupting chemical that competes with an androgenic endocrine disruptor compound. Most research has focused on the epigenetic effect of vinclozolin in humans. In terms of ecotoxicology, understanding the effect of vinclozolin on non-target organisms is important. The expression profile of a comprehensive set of genes in the amphipod Hyalella azteca exposed to vinclozolin was examined. The expressed sequence tags in low-dose vinclozolin-treated and -untreated amphipods were isolated and identified by suppression subtractive hybridization. DNA dot blotting was used to confirm the results and establish a subtracted cDNA library for comparing all differentially expressed sequences with and without vinclozolin treatment. In total, 494 differentially expressed genes, including hemocyanin, heatshock protein, cytochrome, cytochrome oxidase and NADH dehydrogenase were detected. Hemocyanin was the most abundant gene. DNA dot blotting revealed 55 genes with significant differential expression. These genes included larval serum protein 1 alpha, E3 ubiquitin-protein ligase, mitochondrial cytochrome c oxidase, mitochondrial protein, proteasome inhibitor, hemocyanin, zinc-finger-containing protein, mitochondrial NADH-ubiquinone oxidoreductase and epididymal sperm-binding protein. Vinclozolin appears to upregulate stress-related genes and hemocyanin, related to immunity. Moreover, vinclozolin downregulated NADH dehydrogenase, related to respiration. Thus, even a non-lethal concentration of vinclozolin still has an effect at the genetic level in H. azteca and presents a potential risk, especially as it would affect non-target organism hormone metabolism.
Identification and Characterization of 40 Isolated Rehmannia glutinosa MYB Family Genes and Their Expression Profiles in Response to Shading and Continuous Cropping

PubMed Central

Wang, Fengqing; Suo, Yanfei; Wei, He; Li, Mingjie; Xie, Caixia; Wang, Lina; Chen, Xinjian; Zhang, Zhongyi

2015-01-01

The v-myb avian myeloblastosis viral oncogene homolog (MYB) superfamily constitutes one of the most abundant groups of transcription factors (TFs) described in plants. To date, little is known about the MYB genes in Rehmannia glutinosa. Forty unique MYB genes with full-length cDNA sequences were isolated. These 40 genes were grouped into five categories, one R1R2R3-MYB, four TRFL MYBs, four SMH MYBs, 25 R2R3-MYBs, and six MYB-related members. The MYB DNA-binding domain (DBD) sequence composition was conserved among proteins of the same subgroup. As expected, most of the closely related members in the phylogenetic tree exhibited common motifs. Additionally, the gene structure and motifs of the R. glutinosa MYB genes were analyzed. MYB gene expression was analyzed in the leaf and the tuberous root under two abiotic stress conditions. Expression profiles showed that most R. glutinosa MYB genes were expressed in the leaf and the tuberous root, suggesting that MYB genes are involved in various physiological and developmental processes in R. glutinosa. Seven MYB genes were up-regulated in response to shading in at least one tissue. Two MYB genes showed increased expression and 13 MYB genes showed decreased expression in the tuberous root under continuous cropping. This investigation is the first comprehensive study of the MYB gene family in R. glutinosa. PMID:26147429
Quantitative DNA Methylation Analysis Identifies a Single CpG Dinucleotide Important for ZAP-70 Expression and Predictive of Prognosis in Chronic Lymphocytic Leukemia

PubMed Central

Claus, Rainer; Lucas, David M.; Stilgenbauer, Stephan; Ruppert, Amy S.; Yu, Lianbo; Zucknick, Manuela; Mertens, Daniel; Bühler, Andreas; Oakes, Christopher C.; Larson, Richard A.; Kay, Neil E.; Jelinek, Diane F.; Kipps, Thomas J.; Rassenti, Laura Z.; Gribben, John G.; Döhner, Hartmut; Heerema, Nyla A.; Marcucci, Guido; Plass, Christoph; Byrd, John C.

2012-01-01

Purpose Increased ZAP-70 expression predicts poor prognosis in chronic lymphocytic leukemia (CLL). Current methods for accurately measuring ZAP-70 expression are problematic, preventing widespread application of these tests in clinical decision making. We therefore used comprehensive DNA methylation profiling of the ZAP-70 regulatory region to identify sites important for transcriptional control. Patients and Methods High-resolution quantitative DNA methylation analysis of the entire ZAP-70 gene regulatory regions was conducted on 247 samples from patients with CLL from four independent clinical studies. Results Through this comprehensive analysis, we identified a small area in the 5′ regulatory region of ZAP-70 that showed large variability in methylation in CLL samples but was universally methylated in normal B cells. High correlation with mRNA and protein expression, as well as activity in promoter reporter assays, revealed that within this differentially methylated region, a single CpG dinucleotide and neighboring nucleotides are particularly important in ZAP-70 transcriptional regulation. Furthermore, by using clustering approaches, we identified a prognostic role for this site in four independent data sets of patients with CLL using time to treatment, progression-free survival, and overall survival as clinical end points. Conclusion Comprehensive quantitative DNA methylation analysis of the ZAP-70 gene in CLL identified important regions responsible for transcriptional regulation. In addition, loss of methylation at a specific single CpG dinucleotide in the ZAP-70 5′ regulatory sequence is a highly predictive and reproducible biomarker of poor prognosis in this disease. This work demonstrates the feasibility of using quantitative specific ZAP-70 methylation analysis as a relevant clinically applicable prognostic test in CLL. PMID:22564988
A comprehensive resource of drought- and salinity-responsive ESTs for gene discovery and marker development in chickpea (Cicer arietinum L.)

PubMed Central

2009-01-01

Background Chickpea (Cicer arietinum L.), an important grain legume crop of the world, is seriously challenged by terminal drought and salinity stresses. However, only a very limited number of molecular markers and candidate genes are available for undertaking molecular breeding in chickpea to tackle these stresses. This study reports the generation and analysis of a comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers. Results A total of 20,162 (18,435 high quality) drought- and salinity-responsive ESTs were generated from ten different root tissue cDNA libraries of chickpea. Sequence editing, clustering and assembly analysis resulted in 6,404 unigenes (1,590 contigs and 4,814 singletons). Functional annotation of unigenes based on BLASTX analysis showed that 46.3% (2,965) had significant similarity (≤1E-05) to sequences in the non-redundant UniProt database. BLASTN analysis of unique sequences with ESTs of four legume species (Medicago, Lotus, soybean and groundnut) and three model plant species (rice, Arabidopsis and poplar) provided insights on conserved genes across legumes as well as novel transcripts for chickpea. Of 2,965 (46.3%) significant unigenes, only 2,071 (32.3%) unigenes could be functionally categorised according to Gene Ontology (GO) descriptions. A total of 2,029 sequences containing 3,728 simple sequence repeats (SSRs) were identified and 177 new EST-SSR markers were developed. Experimental validation of a set of 77 SSR markers on 24 genotypes revealed 230 alleles with an average of 4.6 alleles per marker and average polymorphism information content (PIC) value of 0.43. Besides SSR markers, 21,405 high confidence single nucleotide polymorphisms (SNPs) in 742 contigs (with ≥ 5 ESTs) were also identified. Recognition sites for restriction enzymes were identified for 7,884 SNPs in 240 contigs. Hierarchical clustering of 105 selected contigs provided clues about stress-responsive candidate genes and their expression profile showed predominance in specific stress-challenged libraries. Conclusion The generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will be helpful to facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align genomes of chickpea and other legume species. PMID:19912666
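The polymorphism information content (PIC) reported for the SSR markers can be illustrated with a minimal sketch. It assumes the commonly used Botstein-style definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i at one marker; the abstract does not say which formula was applied, and the allele frequencies below are made up for illustration.

```python
def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Uses PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    (assumed definition; not stated in the abstract).
    """
    total = sum(allele_freqs)
    p = [f / total for f in allele_freqs]  # accept counts or frequencies
    sum_squares = sum(x * x for x in p)
    cross_terms = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - sum_squares - cross_terms

# Hypothetical marker with three alleles at frequencies 0.5, 0.3 and 0.2.
print(round(pic([0.5, 0.3, 0.2]), 4))
```

Averaging this value over all validated markers would give a summary figure of the kind quoted in the abstract.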
De novo sequencing and analysis of the transcriptome during the browning of fresh-cut Luffa cylindrica 'Fusi-3' fruits

PubMed Central

Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng

2017-01-01

Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar ‘Fusi-3’. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1–6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism. PMID:29145430
Comprehensive Genomic Characterization of Upper Tract Urothelial Carcinoma.

PubMed

Moss, Tyler J; Qi, Yuan; Xi, Liu; Peng, Bo; Kim, Tae-Beom; Ezzedine, Nader E; Mosqueda, Maribel E; Guo, Charles C; Czerniak, Bogdan A; Ittmann, Michael; Wheeler, David A; Lerner, Seth P; Matin, Surena F

2017-10-01

Upper urinary tract urothelial cancer (UTUC) may have unique etiologic and genomic factors compared to bladder cancer. To characterize the genomic landscape of UTUC and provide insights into its biology using comprehensive integrated genomic analyses. We collected 31 untreated snap-frozen UTUC samples from two institutions and carried out whole-exome sequencing (WES) of DNA, RNA sequencing (RNAseq), and protein analysis. Adjusting for batch effects, consensus mutation calls from independent pipelines identified DNA mutations, gene expression clusters using unsupervised consensus hierarchical clustering (UCHC), and protein expression levels that were correlated with relevant clinical variables, The Cancer Genome Atlas, and other published data. WES identified mutations in FGFR3 (74.1%; 92% low-grade, 60% high-grade), KMT2D (44.4%), PIK3CA (25.9%), and TP53 (22.2%). APOBEC and CpG were the most common mutational signatures. UCHC of RNAseq data segregated samples into four molecular subtypes with the following characteristics. Cluster 1: no PIK3CA mutations, nonsmokers, high-grade
Single-Cell Sequencing for Precise Cancer Research: Progress and Prospects.

PubMed

Zhang, Xiaoyan; Marjani, Sadie L; Hu, Zhaoyang; Weissman, Sherman M; Pan, Xinghua; Wu, Shixiu

2016-03-15

Advances in genomic technology have enabled the faithful detection and measurement of mutations and the gene expression profile of cancer cells at the single-cell level. Recently, several single-cell sequencing methods have been developed that permit the comprehensive and precise analysis of the cancer-cell genome, transcriptome, and epigenome. The use of these methods to analyze cancer cells has led to a series of unanticipated discoveries, such as the high heterogeneity and stochastic changes in cancer-cell populations, the new driver mutations and the complicated clonal evolution mechanisms, and the novel identification of biomarkers of variant tumors. These methods and the knowledge gained from their utilization could potentially improve the early detection and monitoring of rare cancer cells, such as circulating tumor cells and disseminated tumor cells, and promote the development of personalized and highly precise cancer therapy. Here, we discuss the current methods for single cancer-cell sequencing, with a strong focus on those practically used or potentially valuable in cancer research, including single-cell isolation, whole genome and transcriptome amplification, epigenome profiling, multi-dimensional sequencing, and next-generation sequencing and analysis. We also examine the current applications, challenges, and prospects of single cancer-cell sequencing. ©2016 American Association for Cancer Research.
Comparative transcriptomic analyses of normal and malformed flowers in sugar apple (Annona squamosa L.) to identify the differential expressed genes between normal and malformed flowers.

PubMed

Liu, Kaidong; Li, Haili; Li, Weijin; Zhong, Jundi; Chen, Yan; Shen, Chenjia; Yuan, Changchun

2017-10-23

Sugar apple (Annona squamosa L.), a popular fruit with high medicinal and nutritional properties, is widely cultivated in tropical South Asia and America. Flower malformation is a major cause of reduced sugar apple production. However, little information is available on the differences between normal and malformed flowers of sugar apple. To gain a comprehensive perspective on these differences, cDNA libraries from normal and malformed flowers were prepared independently for Illumina sequencing. The data generated a total of 70,189,896 reads that were integrated and assembled into 55,097 unigenes with a mean length of 783 bp. A large number of differentially expressed genes (DEGs) were identified. These DEGs included 701 genes encoding flower development-associated transcription factors. Furthermore, a large number of flowering- and hormone-related DEGs were also identified, and most of these genes were down-regulated in the malformed flowers. The expression levels of 15 selected genes were validated using quantitative PCR. The contents of several endogenous hormones were measured. The malformed flowers displayed lower endogenous hormone levels compared to the normal flowers. The expression data as well as hormone levels in our study will serve as a comprehensive resource for investigating the regulation mechanism involved in floral organ development in sugar apple.
Curated collection of yeast transcription factor DNA binding specificity data reveals novel structural and gene regulatory insights

PubMed Central

2011-01-01

Background Transcription factors (TFs) play a central role in regulating gene expression by interacting with cis-regulatory DNA elements associated with their target genes. Recent surveys have examined the DNA binding specificities of most Saccharomyces cerevisiae TFs, but a comprehensive evaluation of their data has been lacking. Results We analyzed in vitro and in vivo TF-DNA binding data reported in previous large-scale studies to generate a comprehensive, curated resource of DNA binding specificity data for all characterized S. cerevisiae TFs. Our collection comprises DNA binding site motifs and comprehensive in vitro DNA binding specificity data for all possible 8-bp sequences. Investigation of the DNA binding specificities within the basic leucine zipper (bZIP) and VHT1 regulator (VHR) TF families revealed unexpected plasticity in TF-DNA recognition: intriguingly, the VHR TFs, newly characterized by protein binding microarrays in this study, recognize bZIP-like DNA motifs, while the bZIP TF Hac1 recognizes a motif highly similar to the canonical E-box motif of basic helix-loop-helix (bHLH) TFs. We identified several TFs with distinct primary and secondary motifs, which might be associated with different regulatory functions. Finally, integrated analysis of in vivo TF binding data with protein binding microarray data lends further support for indirect DNA binding in vivo by sequence-specific TFs. Conclusions The comprehensive data in this curated collection allow for more accurate analyses of regulatory TF-DNA interactions, in-depth structural studies of TF-DNA specificity determinants, and future experimental investigations of the TFs' predicted target genes and regulatory roles. PMID:22189060
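To give a sense of the size of "all possible 8-bp sequences", the following minimal sketch enumerates every 8-mer and collapses each sequence with its reverse complement, since both describe the same double-stranded site. Whether the curated collection stores 8-mers in this collapsed form is an assumption made here for illustration, and the helper names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def canonical_kmers(k):
    """All k-mers, keeping one representative per reverse-complement pair."""
    seen = set()
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        # The lexicographically smaller strand represents the pair.
        seen.add(min(kmer, reverse_complement(kmer)))
    return sorted(seen)

print(4 ** 8)                   # ordered 8-mers on a single strand
print(len(canonical_kmers(8)))  # distinct double-stranded 8-mers
```

The second count is a little more than half of the first because palindromic 8-mers are their own reverse complement and are therefore not halved.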
The GermOnline cross-species systems browser provides comprehensive information on genes and gene products relevant for sexual reproduction.

PubMed

Gattiker, Alexandre; Niederhauser-Wiederkehr, Christa; Moore, James; Hermida, Leandro; Primig, Michael

2007-01-01

We report a novel release of the GermOnline knowledgebase covering genes relevant for the cell cycle, gametogenesis and fertility. GermOnline was extended into a cross-species systems browser including information on DNA sequence annotation, gene expression and the function of gene products. The database covers eight model organisms and Homo sapiens, for which complete genome annotation data are available. The database is now built around a sophisticated genome browser (Ensembl), our own microarray information management and annotation system (MIMAS) used to extensively describe experimental data obtained with high-density oligonucleotide microarrays (GeneChips) and a comprehensive system for online editing of database entries (MediaWiki). The RNA data include results from classical microarrays as well as tiling arrays that yield information on RNA expression levels, transcript start sites and lengths as well as exon composition. Members of the research community are solicited to help GermOnline curators keep database entries on genes and gene products complete and accurate. The database is accessible at http://www.germonline.org/.
The Human EST Ontology Explorer: a tissue-oriented visualization system for ontologies distribution in human EST collections.

PubMed

Merelli, Ivan; Caprera, Andrea; Stella, Alessandra; Del Corvo, Marcello; Milanesi, Luciano; Lazzari, Barbara

2009-10-15

The NCBI dbEST currently contains more than eight million human Expressed Sequence Tags (ESTs). This wide collection represents an important source of information for gene expression studies, provided it can be inspected according to biologically relevant criteria. EST data can be browsed using different dedicated web resources, which allow users to investigate library-specific gene expression levels and to compare libraries, highlighting significant differences in gene expression. Nonetheless, no tool is available to examine distributions of quantitative EST collections in Gene Ontology (GO) categories, nor to retrieve information concerning library-dependent EST involvement in metabolic pathways. In this work we present the Human EST Ontology Explorer (HEOE) http://www.itb.cnr.it/ptp/human_est_explorer, a web facility for comparison of expression levels among libraries from several healthy and diseased tissues. The HEOE provides library-dependent statistics on the distribution of sequences in the GO Directed Acyclic Graph (DAG) that can be browsed at each GO hierarchical level. The tool is based on large-scale BLAST annotation of EST sequences. Due to the huge number of input sequences, this BLAST analysis was performed with the aid of grid computing technology, which is particularly suitable for data-parallel tasks. Relying on the resulting annotation, library-specific distributions of ESTs in the GO graph were inferred. A pathway-based search interface was also implemented, for a quick evaluation of the representation of libraries in metabolic pathways. EST processing steps were integrated in a semi-automatic procedure that relies on Perl scripts and stores results in a MySQL database. A PHP-based web interface offers the possibility to simultaneously visualize, retrieve and compare data from the different libraries. Statistically significant differences in GO categories among user-selected libraries can also be computed. The HEOE provides an alternative and complementary way to inspect EST expression levels with respect to approaches currently offered by other resources. Furthermore, BLAST computation on the whole human EST dataset was a suitable test of grid scalability in the context of large-scale bioinformatics analysis. The HEOE currently comprises sequence analysis from 70 non-normalized libraries, representing a comprehensive overview of healthy and unhealthy tissues. As the analysis procedure can be easily applied to other libraries, the number of represented tissues is intended to increase.
An Investigation of the Role of Sequencing in Children's Reading Comprehension

ERIC Educational Resources Information Center

Gouldthorp, Bethanie; Katsipis, Lia; Mueller, Cara

2018-01-01

To date, little is known about the high-level language skills and cognitive processes underlying reading comprehension in children. The present study aimed to investigate whether children with high, compared with low, reading comprehension differ in their sequencing skill, which was defined as the ability to identify and recall the temporal order…
The Organization and Anatomy of Narrative Comprehension and Expression in Lewy Body Spectrum Disorders

PubMed Central

Ash, Sharon; Xie, Sharon; Gross, Rachel Goldmann; Dreyfuss, Michael; Boller, Ashley; Camp, Emily; Morgan, Brianna; O’Shea, Jessica; Grossman, Murray

2012-01-01

Objective Patients with Lewy body spectrum disorders (LBSD) such as Parkinson’s disease (PD), Parkinson’s disease with dementia (PDD), and dementia with Lewy bodies (DLB) exhibit deficits in both narrative comprehension and narrative expression. The present research examines the hypothesis that these impairments are due to a material-neutral deficit in organizational executive resources rather than to impairments of language per se. We predicted that comprehension and expression of narrative would be similarly affected and that deficits in both expression and comprehension of narrative would be related to the same anatomic distribution of prefrontal disease. Method We examined 29 LBSD patients and 26 healthy seniors on their comprehension and expression of narrative discourse. For comprehension, we measured accuracy and latency in judging events with high and low associativity from familiar scripts such as “going fishing.” The expression task involved maintaining the connectedness of events while narrating a story from a wordless picture book. Results LBSD patients were impaired on measures of narrative organization during both comprehension and expression relative to healthy seniors. Measures of organization during narrative expression and comprehension were significantly correlated with each other. These measures both correlated with executive measures but not with neuropsychological measures of lexical semantics or grammar. Voxel-based morphometry revealed overlapping regressions relating frontal atrophy to narrative comprehension, narrative expression, and measures of executive control. Conclusions Difficulty with narrative discourse in LBSD stems in part from a deficit of organization common to comprehension and expression. This deficit is related to prefrontal cortical atrophy in LBSD. PMID:22309984
Genetic diagnosis of Duchenne and Becker muscular dystrophy using next-generation sequencing technology: comprehensive mutational search in a single platform.

PubMed

Lim, Byung Chan; Lee, Seungbok; Shin, Jong-Yeon; Kim, Jong-Il; Hwang, Hee; Kim, Ki Joong; Hwang, Yong Seung; Seo, Jeong-Sun; Chae, Jong Hee

2011-11-01

Duchenne muscular dystrophy or Becker muscular dystrophy might be a suitable candidate disease for application of next-generation sequencing in genetic diagnosis because the complex mutational spectrum and the large size of the dystrophin gene require two or more analytical methods and incur a high cost. The authors tested whether large deletions/duplications or small mutations, such as point mutations or short insertions/deletions of the dystrophin gene, could be predicted accurately in a single platform using next-generation sequencing technology. A custom solution-based target enrichment kit was designed to capture whole genomic regions of the dystrophin gene and other muscular-dystrophy-related genes. A multiplexing strategy, wherein four differently bar-coded samples were captured and sequenced together in a single lane of the Illumina Genome Analyser, was applied. The study subjects were 25 patients: 16 with deficient dystrophin expression without a large deletion/duplication and 9 with a known large deletion/duplication. Nearly 100% of the exonic region of the dystrophin gene was covered by at least eight reads with a mean read depth of 107. Pathogenic small mutations were identified in 15 of the 16 patients without a large deletion/duplication. Using these 16 patients as the standard, the authors' method accurately predicted the deleted or duplicated exons in the 9 patients with known mutations. Inclusion of non-coding regions and paired-end sequence analysis enabled accurate identification by increasing the read depth and providing information about the breakpoint junction. The current method has an advantage for the genetic diagnosis of Duchenne muscular dystrophy and Becker muscular dystrophy wherein a comprehensive mutational search may be feasible using a single platform.
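
The abstract above reports coverage statistics (fraction of exonic bases with at least eight reads, mean read depth) and exon-level deletion/duplication calls made against a set of reference patients, but it does not describe the algorithm. The following is only a minimal sketch of a generic read-depth-ratio approach, not the authors' method; the function names, the median normalization, and the `low`/`high` thresholds are all illustrative assumptions.

```python
from statistics import mean, median

def coverage_summary(depths, min_reads=8):
    """Return (fraction of positions with depth >= min_reads, mean depth)."""
    covered = sum(1 for d in depths if d >= min_reads)
    return covered / len(depths), mean(depths)

def call_exon_copy_changes(sample, references, low=0.5, high=1.5):
    """Label each exon by comparing its normalized depth with reference samples.

    sample: {exon_name: mean depth}; references: list of dicts with the same keys.
    low/high are hypothetical thresholds, not values taken from the study.
    """
    def normalize(depth_by_exon):
        # Scale by the sample's median exon depth so that samples sequenced
        # to different overall depths become comparable.
        scale = median(depth_by_exon.values())
        return {exon: d / scale for exon, d in depth_by_exon.items()}

    norm_sample = normalize(sample)
    norm_refs = [normalize(ref) for ref in references]
    calls = {}
    for exon, value in norm_sample.items():
        expected = mean(ref[exon] for ref in norm_refs)
        ratio = value / expected
        if ratio < low:
            calls[exon] = "deletion"
        elif ratio > high:
            calls[exon] = "duplication"
        else:
            calls[exon] = "normal"
    return calls
```

A production caller would also need to handle zero median depth, GC bias, and capture-efficiency differences between exons, none of which this sketch attempts.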

Personalized comprehensive molecular profiling of high risk osteosarcoma: Implications and limitations for precision medicine.

PubMed

Subbiah, Vivek; Wagner, Michael J; McGuire, Mary F; Sarwari, Nawid M; Devarajan, Eswaran; Lewis, Valerae O; Westin, Shanon; Kato, Shumei; Brown, Robert E; Anderson, Pete

2015-12-01

Despite advances in molecular medicine over recent decades, there has been little advancement in the treatment of osteosarcoma. We performed comprehensive molecular profiling in two cases of metastatic and chemotherapy-refractory osteosarcoma to guide molecularly targeted therapy. Hybridization capture of >300 cancer-related genes plus introns from 28 genes often rearranged or altered in cancer was applied to >50 ng of DNA extracted from tumor samples from two patients with recurrent, metastatic osteosarcoma. The DNA from each sample was sequenced to high, uniform coverage. Immunohistochemical probes and morphoproteomics analysis were performed, in addition to fluorescence in situ hybridization. All analyses were performed in CLIA-certified laboratories. Molecularly targeted therapy based on the resulting profiles was offered to the patients. Biomedical analytics were performed using QIAGEN's Ingenuity® Pathway Analysis. In Patient #1, comprehensive next-generation exome sequencing showed MET amplification, PIK3CA mutation, CCNE1 amplification, and PTPRD mutation. Immunohistochemistry-based morphoproteomic analysis revealed c-Met expression [(p)-c-Met (Tyr1234/1235)] and activation of the mTOR/AKT pathway [IGF-1R (Tyr1165/1166), p-mTOR (Ser2448), p-Akt (Ser473)] and expression of SPARC and COX2. Targeted therapy was administered to match the PIK3CA, c-MET, and SPARC and COX2 aberrations with sirolimus + crizotinib and abraxane + celecoxib. In Patient #2, aberrations included NF2 loss in exons 2-16, PDGFRα amplification, and TP53 mutation. This patient was enrolled on a clinical trial combining the targeted agents temsirolimus, sorafenib and bevacizumab, to match the NF2, PDGFRα and TP53 aberrations. Neither patient benefited from matched therapy. Relapsed osteosarcoma is characterized by complex signaling and drug resistance pathways. Comprehensive molecular profiling holds great promise for tailoring personalized therapies for cancer. Methods for such profiling are evolving and need to be refined to better assist clinicians in making treatment decisions based on the large amount of data that results from this type of testing. Further research in this area is warranted.
Whole-genome expression analyses of type 2 diabetes in human skin reveal altered immune function and burden of infection.

PubMed

Wu, Chun; Chen, Xiaopan; Shu, Jing; Lee, Chun-Ting

2017-05-23

Skin disorders are among the most common complications associated with type 2 diabetes mellitus (T2DM). Although T2DM patients are known to have an increased risk of infections and other T2DM-related skin disorders, the underlying molecular mechanisms are largely unknown. This study aims to identify dysregulated genes and gene networks that are associated with T2DM in human skin. We compared the expression profiles of 56,318 transcribed genes in 74 T2DM cases and 148 gender-, age-, and race-matched non-diabetes controls from the Genotype-Tissue Expression (GTEx) database. RNA-sequencing data indicate that diabetic skin is characterized by increased expression of genes that are related to immune responses (CCL20, CXCL9, CXCL10, CXCL11, CXCL13, and CCL18), the JAK/STAT signaling pathway (JAK3, STAT1, and STAT2), the tumor necrosis factor superfamily (TNFSF10 and TNFSF15), and infectious disease pathways (OAS1, OAS2, OAS3, and IFIH1). Genes in the cell adhesion molecules pathway (NCAM1 and L1CAM) and the collagen family (PCOLCE2 and COL9A3) are downregulated, suggesting structural changes in the skin of T2DM patients. To the best of our knowledge, this is the first analytic study to report comprehensive, unbiased gene expression changes and dysregulated pathways in the non-diseased skin of T2DM patients. This comprehensive understanding derived from whole-genome expression profiles could advance our knowledge in determining molecular targets for the prevention and treatment of T2DM-associated skin disorders.
Transcriptome sequencing reveals high isoform diversity in the ant Formica exsecta

PubMed Central

Paviala, Jenni; Morandin, Claire; Wheat, Christopher; Sundström, Liselotte; Helanterä, Heikki

2017-01-01

Transcriptome resources for social insects have the potential to provide new insight into polyphenism, i.e., how divergent phenotypes arise from the same genome. Here we present a transcriptome based on paired-end RNA sequencing data for the ant Formica exsecta (Formicidae, Hymenoptera). The RNA sequencing libraries were constructed from samples of several life stages of both sexes and female castes of queens and workers, in order to maximize representation of expressed genes. We first compare the performance of common assembly and scaffolding software (Trinity, Velvet-Oases, and SOAPdenovo-trans), in producing de novo assemblies. Second, we annotate the resulting expressed contigs to the currently published genomes of ants, and other insects, including the honeybee, to filter genes that have annotation evidence of being true genes. Our pipeline resulted in a final assembly of altogether 39,262 mRNA transcripts, with an average coverage of >300X, belonging to 17,496 unique genes with annotation in the related ant species. From these genes, 536 genes were unique to one caste or sex only, highlighting the importance of comprehensive sampling. Our final assembly also showed expression of several splice variants in 6,975 genes, and we show that accounting for splice variants affects the outcome of downstream analyses such as gene ontologies. Our transcriptome provides an outstanding resource for future genetic studies on F. exsecta and other ant species, and the presented transcriptome assembly can be adapted to any non-model species that has genomic resources available from a related taxon. PMID:29177112
Cloning, expression and characterization of β-xylosidase from Aspergillus niger ASKU28.

PubMed

Choengpanya, Khuanjarat; Arthornthurasuk, Siriphan; Wattana-amorn, Pakorn; Huang, Wan-Ting; Plengmuankhae, Wandee; Li, Yaw-Kuen; Kongsaeree, Prachumporn T

2015-11-01

β-Xylosidases catalyze the breakdown of β-1,4-xylooligosaccharides, which are produced from degradation of xylan by xylanases, to fermentable xylose. Due to their important role in xylan degradation, there is an interest in using these enzymes in biofuel production from lignocellulosic biomass. In this study, the coding sequence of a glycoside hydrolase family 3 β-xylosidase from Aspergillus niger ASKU28 (AnBX) was cloned and expressed in Pichia pastoris as an N-terminal fusion protein with the α-mating factor signal sequence (α-MF) and a poly-histidine tag. The expression level was increased to 5.7 g/l in a fermenter system as a result of optimization of only five codons near the 5' end of the α-MF sequence. The recombinant AnBX was purified to homogeneity through a single-step Phenyl Sepharose chromatography. The enzyme exhibited an optimal activity at 70°C and at pH 4.0-4.5, and a very high kinetic efficiency toward a xyloside substrate. AnBX demonstrated an exo-type activity with retention of the β-configuration, and a synergistic action with xylanase in hydrolysis of beechwood xylan. This study provides comprehensive data on characterization of a glycoside hydrolase family 3 β-xylosidase that have not been determined in any prior investigations. Our results suggested that AnBX may be useful for degradation of lignocellulosic biomass in bioethanol production, pulp bleaching process and beverage industry. Copyright © 2015 Elsevier Inc. All rights reserved.
The Oral Narrative Comprehension and Production Abilities of Verbal Preschoolers on the Autism Spectrum.

PubMed

Westerveld, Marleen F; Roberts, Jacqueline M A

2017-10-05

This study described the oral narrative comprehension and production skills of verbal preschool-age children on the autism spectrum and investigated correlations between oral narrative ability and norm-referenced language test performance. Twenty-nine preschool-age children (aged 4;0-5;9 years;months) with autism, who obtained an age-equivalent score of at least 36 months on the expressive communication subscale of the Vineland Adaptive Behavior Scales-Second Edition (Sparrow, Cicchetti, & Balla, 2005), participated. Children listened to an unfamiliar fictional narrative and answered comprehension questions afterward. After listening to the narrative a second time, children were asked to retell the narrative without picture support. Narratives were transcribed and analyzed for length, semantic diversity, grammatical complexity and accuracy, intelligibility, inclusion of critical events, and narrative stage. All children participated in the comprehension task, and 19 children produced an analyzable narrative retell. Compared with published data on typically developing children, significant difficulties were observed in narrative comprehension, intelligibility, and grammatical accuracy. Most of the children told descriptive or action sequences, with only 1 child producing an abbreviated episode. Significant positive correlations were found (a) between performance on the Peabody Picture Vocabulary Test-Fourth Edition (Dunn & Dunn, 2007) and semantic diversity and narrative comprehension and (b) between parent-reported receptive communication competence (Vineland Adaptive Behavior Scales-Second Edition) and narrative comprehension. This study provides preliminary evidence of specific difficulties in oral narrative comprehension and production skills in verbal preschoolers on the autism spectrum.
Profiling mRNAs of Two Cuscuta Species Reveals Possible Candidate Transcripts Shared by Parasitic Plants

PubMed Central

Wijeratne, Saranga; Fraga, Martina; Meulia, Tea; Doohan, Doug; Li, Zhaohu; Qu, Feng

2013-01-01

Dodders are among the most important parasitic plants that cause serious yield losses in crop plants. In this report, we sought to unveil the genetic basis of dodder parasitism by profiling the transcriptomes of Cuscuta pentagona and C. suaveolens, two of the most common dodder species, using a next-generation RNA sequencing platform. De novo assembly of the sequence reads resulted in more than 46,000 isotigs and contigs (collectively referred to as expressed sequence tags or ESTs) for each species, with more than half of them predicted to encode proteins that share significant sequence similarities with known proteins of non-parasitic plants. Comparing our datasets with transcriptomes of 12 other fully sequenced plant species confirmed a close evolutionary relationship between dodder and tomato. Using a rigorous set of filtering parameters, we were able to identify seven pairs of ESTs that appear to be shared exclusively by parasitic plants, thus providing targets for tailored management approaches. In addition, we discovered ESTs with sequence similarities to known plant viruses, including cryptic viruses, in the dodder sequence assemblies. Together, this study represents the first comprehensive transcriptome profiling of parasitic plants in the Cuscuta genus, and is expected to contribute to our understanding of the molecular mechanisms of parasitic plant-host plant interactions. PMID:24312295
The LAM-PCR Method to Sequence LV Integration Sites.

PubMed

Wang, Wei; Bartholomae, Cynthia C; Gabriel, Richard; Deichmann, Annette; Schmidt, Manfred

2016-01-01

Integrating viral gene transfer vectors are commonly used gene delivery tools in clinical gene therapy trials providing stable integration and continuous gene expression of the transgene in the treated host cell. However, integration of the reverse-transcribed vector DNA into the host genome is a potentially mutagenic event that may directly contribute to unwanted side effects. A comprehensive and accurate analysis of the integration site (IS) repertoire is indispensable to study clonality in transduced cells obtained from patients undergoing gene therapy and to identify potential in vivo selection of affected cell clones. To date, next-generation sequencing (NGS) of vector-genome junctions allows sophisticated studies on the integration repertoire in vitro and in vivo. We have explored the use of the Illumina MiSeq Personal Sequencer platform to sequence vector ISs amplified by non-restrictive linear amplification-mediated PCR (nrLAM-PCR) and LAM-PCR. MiSeq-based high-quality IS sequence retrieval is accomplished by the introduction of a double-barcode strategy that substantially minimizes the frequency of IS sequence collisions compared to the conventionally used single-barcode protocol. Here, we present an updated protocol of (nr)LAM-PCR for the analysis of lentiviral IS using a double-barcode system and followed by deep sequencing using the MiSeq device.
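
The double-barcode idea in the abstract above can be illustrated with a small demultiplexing sketch: a read is assigned to a sample only when the barcodes at both ends form a known pair, so a read with a mismatched combination is discarded rather than misassigned. This is a hypothetical illustration, not the published (nr)LAM-PCR protocol; the barcode length, their position at the read ends, and the names `demultiplex` and `barcode_pairs` are assumptions.

```python
def demultiplex(reads, barcode_pairs, bc_len=4):
    """Assign reads to samples only when both end barcodes match a known pair.

    reads: iterable of sequence strings laid out as barcode + insert + barcode.
    barcode_pairs: {(left_barcode, right_barcode): sample_name} (hypothetical layout).
    Returns (assigned, discarded), where assigned maps sample -> list of inserts.
    """
    assigned = {sample: [] for sample in barcode_pairs.values()}
    discarded = []
    for read in reads:
        pair = (read[:bc_len], read[-bc_len:])
        sample = barcode_pairs.get(pair)
        if sample is None:
            # Unknown or mixed barcode combination: treat as a possible collision.
            discarded.append(read)
        else:
            assigned[sample].append(read[bc_len:-bc_len])
    return assigned, discarded
```

With a single barcode, a read carrying one correct barcode would be accepted regardless of what sits at the other end; requiring the pair is what reduces such collisions in this sketch.
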
Profiling mRNAs of two Cuscuta species reveals possible candidate transcripts shared by parasitic plants.

PubMed

Jiang, Linjian; Wijeratne, Asela J; Wijeratne, Saranga; Fraga, Martina; Meulia, Tea; Doohan, Doug; Li, Zhaohu; Qu, Feng

2013-01-01

Dodders are among the most important parasitic plants that cause serious yield losses in crop plants. In this report, we sought to unveil the genetic basis of dodder parasitism by profiling the transcriptomes of Cuscuta pentagona and C. suaveolens, two of the most common dodder species, using a next-generation RNA sequencing platform. De novo assembly of the sequence reads resulted in more than 46,000 isotigs and contigs (collectively referred to as expressed sequence tags or ESTs) for each species, with more than half of them predicted to encode proteins that share significant sequence similarities with known proteins of non-parasitic plants. Comparing our datasets with transcriptomes of 12 other fully sequenced plant species confirmed a close evolutionary relationship between dodder and tomato. Using a rigorous set of filtering parameters, we were able to identify seven pairs of ESTs that appear to be shared exclusively by parasitic plants, thus providing targets for tailored management approaches. In addition, we discovered ESTs with sequence similarities to known plant viruses, including cryptic viruses, in the dodder sequence assemblies. Together, this study represents the first comprehensive transcriptome profiling of parasitic plants in the Cuscuta genus, and is expected to contribute to our understanding of the molecular mechanisms of parasitic plant-host plant interactions.
BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone.

PubMed

Yang, Bite; Liu, Feng; Ren, Chao; Ouyang, Zhangyi; Xie, Ziwei; Bo, Xiaochen; Shu, Wenjie

2017-07-01

Enhancer elements are noncoding stretches of DNA that play key roles in controlling gene expression programmes. Despite major efforts to develop accurate enhancer prediction methods, identifying enhancer sequences continues to be a challenge in the annotation of mammalian genomes. One of the major issues is the lack of large, sufficiently comprehensive and experimentally validated enhancer sets for humans or other species. Thus, there is an urgent need to develop computational methods based on limited experimentally validated enhancers and to decipher the transcriptional regulatory code encoded in the enhancer sequences. We present a deep-learning-based hybrid architecture, BiRen, which predicts enhancers using the DNA sequence alone. Our results demonstrate that BiRen can learn common enhancer patterns directly from the DNA sequence and exhibits superior accuracy, robustness and generalizability in enhancer prediction relative to other state-of-the-art enhancer predictors based on sequence characteristics. BiRen will enable researchers to acquire a deeper understanding of the regulatory code of enhancer sequences. The BiRen method can be freely accessed at https://github.com/wenjiegroup/BiRen. Contact: shuwj@bmi.ac.cn or boxc@bmi.ac.cn. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

NASA Astrophysics Data System (ADS)

Tarpine, Ryan; Istrail, Sorin

The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.
The WRKY Transcription Factor Family in Citrus: Valuable and Useful Candidate Genes for Citrus Breeding.

PubMed

Ayadi, M; Hanana, M; Kharrat, N; Merchaoui, H; Marzoug, R Ben; Lauvergeat, V; Rebaï, A; Mzid, R

2016-10-01

WRKY transcription factors belong to a large family of plant transcriptional regulators whose members have been reported to be involved in a wide range of biological roles including plant development, adaptation to environmental constraints and response to several diseases. However, little information is available about WRKYs in Citrus. The recent release of completely assembled genome sequences of Citrus sinensis and Citrus clementina and the availability of EST sequences from other citrus species allowed us to perform a genome survey for Citrus WRKY proteins. In the present study, we identified 100 WRKY members from C. sinensis (51), C. clementina (48) and Citrus unshiu (1), and analyzed their chromosomal distribution, gene structure, gene duplication, syntenic relations and phylogeny. A phylogenetic tree of the 100 Citrus WRKY sequences with their orthologs from Arabidopsis distinguished seven groups. The CsWRKY genes were distributed across all ten sweet orange chromosomes. A comprehensive approach and an integrative analysis of Citrus WRKY gene expression revealed variable expression profiles across tissues and stress conditions, indicating functional diversification. Thus, candidate Citrus WRKY genes have been proposed as potentially involved in fruit acidification, essential oil biosynthesis and abiotic/biotic stress tolerance. Our results provide essential prerequisites for further WRKY gene cloning and functional analysis with the aim of citrus crop improvement.
Comparison of the theoretical and real-world evolutionary potential of a genetic circuit

NASA Astrophysics Data System (ADS)

Razo-Mejia, M.; Boedicker, J. Q.; Jones, D.; DeLuna, A.; Kinney, J. B.; Phillips, R.

2014-04-01

With the development of next-generation sequencing technologies, many large scale experimental efforts aim to map genotypic variability among individuals. This natural variability in populations fuels many fundamental biological processes, ranging from evolutionary adaptation and speciation to the spread of genetic diseases and drug resistance. An interesting and important component of this variability is present within the regulatory regions of genes. As these regions evolve, accumulated mutations lead to modulation of gene expression, which may have consequences for the phenotype. A simple model system where the link between genetic variability, gene regulation and function can be studied in detail is missing. In this article we develop a model to explore how the sequence of the wild-type lac promoter dictates the fold-change in gene expression. The model combines single-base pair resolution maps of transcription factor and RNA polymerase binding energies with a comprehensive thermodynamic model of gene regulation. The model was validated by predicting and then measuring the variability of lac operon regulation in a collection of natural isolates. We then implement the model to analyze the sensitivity of the promoter sequence to the regulatory output, and predict the potential for regulation to evolve due to point mutations in the promoter region.
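
The abstract above combines per-base binding-energy maps with a thermodynamic model of regulation but gives no equations. As a rough illustration only, the sketch below uses an additive energy matrix and the textbook simple-repression expression, fold-change = 1 / (1 + (R / N_NS) · exp(-Δε / k_BT)); this is a simplified stand-in, not the paper's full lac promoter model, and the function names, the toy matrix, and the default N_NS (roughly the E. coli genome length in base pairs) are assumptions.

```python
import math

def binding_energy(sequence, energy_matrix):
    """Sum per-position energy contributions (in units of k_B T) for a binding site.

    energy_matrix: list of {base: energy} dicts, one per position (hypothetical values).
    """
    return sum(energy_matrix[i][base] for i, base in enumerate(sequence))

def fold_change(repressors, energy_kT, nonspecific_sites=4.6e6):
    """Simple-repression fold-change: 1 / (1 + (R / N_NS) * exp(-energy))."""
    weight = (repressors / nonspecific_sites) * math.exp(-energy_kT)
    return 1.0 / (1.0 + weight)
```

Under these assumptions, a point mutation changes one term in the energy sum, which shifts the binding energy and therefore the predicted fold-change; more negative energies (tighter repressor binding) give lower fold-change.
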
Widespread alternative and aberrant splicing revealed by lariat sequencing

PubMed Central

Stepankiw, Nicholas; Raghavan, Madhura; Fogarty, Elizabeth A.; Grimson, Andrew; Pleiss, Jeffrey A.

2015-01-01

Alternative splicing is an important and ancient feature of eukaryotic gene structure, the existence of which has likely facilitated eukaryotic proteome expansions. Here, we have used intron lariat sequencing to generate a comprehensive profile of splicing events in Schizosaccharomyces pombe, amongst the simplest organisms that possess mammalian-like splice site degeneracy. We reveal an unprecedented level of alternative splicing, including alternative splice site selection for over half of all annotated introns, hundreds of novel exon-skipping events, and thousands of novel introns. Moreover, the frequency of these events is far higher than previous estimates, with alternative splice sites on average activated at ∼3% the rate of canonical sites. Although a subset of alternative sites are conserved in related species, implying functional potential, the majority are not detectably conserved. Interestingly, the rate of aberrant splicing is inversely related to expression level, with lowly expressed genes more prone to erroneous splicing. Although we validate many events with RNAseq, the proportion of alternative splicing discovered with lariat sequencing is far greater, a difference we attribute to preferential decay of aberrantly spliced transcripts. Together, these data suggest the spliceosome possesses far lower fidelity than previously appreciated, highlighting the potential contributions of alternative splicing in generating novel gene structures. PMID:26261211
Transcriptome analysis of stem development in the tumourous stem mustard Brassica juncea var. tumida Tsen et Lee by RNA sequencing.

PubMed

Sun, Quan; Zhou, Guanfan; Cai, Yingfan; Fan, Yonghong; Zhu, Xiaoyan; Liu, Yihua; He, Xiaohong; Shen, Jinjuan; Jiang, Huaizhong; Hu, Daiwen; Pan, Zheng; Xiang, Liuxin; He, Guanghua; Dong, Daiwen; Yang, Jianping

2012-04-21

Tumourous stem mustard (Brassica juncea var. tumida Tsen et Lee) is an economically and nutritionally important vegetable crop of the Cruciferae family that also provides the raw material for Fuling mustard. The genetics, breeding, physiology, biochemistry and classification of mustards have been extensively studied, but little information is available on tumourous stem mustard at the molecular level. To gain greater insight into the molecular mechanisms underlying stem swelling in this vegetable and to provide additional information for molecular research and breeding, we sequenced the transcriptome of tumourous stem mustard at various stem developmental stages and compared it with that of a mutant variety lacking swollen stems. Using Illumina short-read technology with a tag-based digital gene expression (DGE) system, we performed de novo transcriptome assembly and gene expression analysis. In our analysis, we assembled genetic information for tumourous stem mustard at various stem developmental stages. In addition, we constructed five DGE libraries, which covered the strains Yong'an and Dayejie at various developmental stages. Illumina sequencing identified 146,265 unigenes, including 11,245 clusters and 135,020 singletons. The unigenes were subjected to a BLAST search and annotated using the GO and KO databases. We also compared the gene expression profiles of three swollen stem samples with those of two non-swollen stem samples. A total of 1,042 genes with significantly different expression levels occurring simultaneously in the six comparison groups were screened out. Finally, the altered expression levels of a number of randomly selected genes were confirmed by quantitative real-time PCR. Our data provide comprehensive gene expression information at the transcriptional level and a first insight into the molecular mechanisms and regulatory pathways of stem swelling and development in this plant, and will help define new mechanisms of stem development in non-model plant organisms.
A comprehensive resource of genomic, epigenomic and transcriptomic sequencing data for the black truffle Tuber melanosporum

PubMed Central

2014-01-01

Background: Tuber melanosporum, also known in the gastronomic community as “truffle”, features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity. Findings: We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody (“truffle”), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes. Conclusions: The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles. PMID:25392735
A comprehensive resource of genomic, epigenomic and transcriptomic sequencing data for the black truffle Tuber melanosporum.

PubMed

Chen, Pao-Yang; Montanini, Barbara; Liao, Wen-Wei; Morselli, Marco; Jaroszewicz, Artur; Lopez, David; Ottonello, Simone; Pellegrini, Matteo

2014-01-01

Tuber melanosporum, also known in the gastronomic community as "truffle", features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity. We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody ("truffle"), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes. The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles.
mirEX: a platform for comparative exploration of plant pri-miRNA expression data.

PubMed

Bielewicz, Dawid; Dolata, Jakub; Zielezinski, Andrzej; Alaba, Sylwia; Szarzynska, Bogna; Szczesniak, Michal W; Jarmolowski, Artur; Szweykowska-Kulinska, Zofia; Karlowski, Wojciech M

2012-01-01

mirEX is a comprehensive platform for comparative analysis of primary microRNA expression data. RT-qPCR-based gene expression profiles are stored in a universal and expandable database scheme and wrapped by an intuitive, user-friendly interface. A new way of accessing gene expression data in mirEX includes a simple mouse-operated querying system and dynamic graphs for data mining analyses. In contrast to other publicly available databases, the mirEX interface allows a simultaneous comparison of expression levels between various microRNA genes in diverse organs and developmental stages. Currently, mirEX integrates information about the expression profile of 190 Arabidopsis thaliana pri-miRNAs in seven different developmental stages: seeds, seedlings and various organs of mature plants. Additionally, by providing RNA structural models, publicly available deep sequencing results, experimental procedure details and a careful selection of auxiliary data in the form of web links, mirEX can function as a one-stop solution for Arabidopsis microRNA information. A web-based mirEX interface can be accessed at http://bioinfo.amu.edu.pl/mirex.
The sugar transporter inventory of tomato: genome-wide identification and expression analysis.

PubMed

Reuscher, Stefan; Akiyama, Masahito; Yasuda, Tomohide; Makino, Haruko; Aoki, Koh; Shibata, Daisuke; Shiratake, Katsuhiro

2014-06-01

The mobility of sugars between source and sink tissues in plants depends on sugar transport proteins. Studying the corresponding genes allows the manipulation of the sink strength of developing fruits, thereby improving fruit quality for human consumption. Tomato (Solanum lycopersicum) is both a major horticultural crop and a model for the development of fleshy fruits. In this article we provide a comprehensive inventory of tomato sugar transporters, including the SUCROSE TRANSPORTER family, the SUGAR TRANSPORTER PROTEIN family, the SUGAR FACILITATOR PROTEIN family, the POLYOL/MONOSACCHARIDE TRANSPORTER family, the INOSITOL TRANSPORTER family, the PLASTIDIC GLUCOSE TRANSLOCATOR family, the TONOPLAST MONOSACCHARIDE TRANSPORTER family and the VACUOLAR GLUCOSE TRANSPORTER family. Expressed sequence tag (EST) sequencing and phylogenetic analyses established a nomenclature for all analyzed tomato sugar transporters. In total we identified 52 genes in tomato putatively encoding sugar transporters. The expression of 29 sugar transporter genes in vegetative tissues and during fruit development was analyzed. Several sugar transporter genes were expressed in a tissue- or developmental stage-specific manner. This information will be helpful to better understand source to sink movement of photoassimilates in tomato. Identification of fruit-specific sugar transporters might be a first step to find novel genes contributing to tomato fruit sugar accumulation. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Integrative analyses of RNA editing, alternative splicing, and expression of young genes in human brain transcriptome by deep RNA sequencing.

PubMed

Wu, Dong-Dong; Ye, Ling-Qun; Li, Yan; Sun, Yan-Bo; Shao, Yi; Chen, Chunyan; Zhu, Zhu; Zhong, Li; Wang, Lu; Irwin, David M; Zhang, Yong E; Zhang, Ya-Ping

2015-08-01

Next-generation RNA sequencing has been successfully used for identification of transcript assembly, evaluation of gene expression levels, and detection of post-transcriptional modifications. Despite these large-scale studies, additional comprehensive RNA-seq data from different subregions of the human brain are required to fully evaluate the evolutionary patterns experienced by the human brain transcriptome. Here, we provide a total of 6.5 billion RNA-seq reads from different subregions of the human brain. A significant correlation was observed between the levels of alternative splicing and RNA editing, which might be explained by a competition between the molecular machineries responsible for the splicing and editing of RNA. Young human protein-coding genes demonstrate biased expression to the neocortical and non-neocortical regions during evolution on the lineage leading to humans. We also found that a significantly greater number of young human protein-coding genes are expressed in the putamen, a tissue that was also observed to have the highest level of RNA-editing activity. The putamen, which previously received little attention, plays an important role in cognitive ability, and our data suggest a potential contribution of the putamen to human evolution. © The Author (2015). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved.
Comprehensive analysis of differentially expressed genes reveals the molecular response to elevated CO2 levels in two sea buckthorn cultivars.

PubMed

Zhang, Guoyun; Zhang, Tong; Liu, Juanjuan; Zhang, Jianguo; He, Caiyun

2018-06-20

Atmospheric carbon dioxide (CO2) concentration increases every year. It is critical to understand the molecular mechanisms of plant responses to elevated CO2 using genomic techniques. Hippophae rhamnoides L. is a highly stress-resistant plant species widely distributed in Europe and Asia. However, understanding of the molecular mechanism of the elevated CO2 response in H. rhamnoides has been limited. In this study, transcriptomic analysis of two sea buckthorn cultivars under different CO2 concentrations was performed, based on the next-generation Illumina sequencing platform and de novo assembly. We identified 4740 differentially expressed genes in the sea buckthorn response to elevated CO2 concentrations. According to the gene ontology (GO) results, photosystem I, photosynthesis and chloroplast thylakoid membrane were the main enriched terms in 'xiangyang' sea buckthorn. In 'zhongguo' sea buckthorn, photosynthesis was also the main significantly enriched term. However, the number of photosynthesis-related differentially expressed genes differed between the two sea buckthorn cultivars. Our GO and pathway analyses indicated that the expression levels of the transcription factors WRKY, MYB and NAC were significantly different between the two sea buckthorn cultivars. This study provides a reliable transcriptome sequence resource that will be valuable for future genetic and genomic research on plants under high CO2 concentration. Copyright © 2018 Elsevier B.V. All rights reserved.

ATP13A2 variability in Parkinson disease

PubMed Central

Vilariño-Güell, Carles; Soto, Alexandra I.; Lincoln, Sarah J.; Yahmed, Samia Ben; Kefi, Mounir; Heckman, Michael G.; Hulihan, Mary M.; Chai, Hua; Diehl, Nancy N.; Amouri, Rim; Rajput, Alex; Mash, Deborah C.; Dickson, Dennis W.; Middleton, Lefkos T.; Gibson, Rachel A.; Hentati, Faycal; Farrer, Matthew J.

2008-01-01

Recessively inherited mutations in ATP13A2 result in Kufor-Rakeb syndrome, whereas genetic variability and elevated ATP13A2 expression have been implicated in Parkinson disease (PD). Given this background, ATP13A2 was comprehensively assessed to support or refute its contribution to PD. Sequencing of ATP13A2 exons and intron-exon boundaries was performed in 89 probands with familial parkinsonism from Tunisia. The segregation of mutations with parkinsonism was subsequently assessed within pedigrees. The frequency of genetic variants and evidence for association was also examined in 240 patients with non-familial PD and 372 healthy controls. ATP13A2 mRNA expression was also quantified in brain tissues from 38 patients with non-familial PD and 38 healthy subjects from the US. Sequencing analysis revealed 37 new variants; seven missense, six silent and 24 that were noncoding. However, no single ATP13A2 mutation segregated with familial parkinsonism in either a dominant or recessive manner. Four markers showed marginal association with non-familial PD, prior to correction for multiple testing. ATP13A2 mRNA expression was marginally decreased in PD brains compared with tissue from control subjects. In conclusion, neither ATP13A2 genetic variability nor quantitative gene expression in brain appears to contribute to familial parkinsonism or non-familial PD. PMID:19085912
Transcriptome Analysis of Flounder (Paralichthys olivaceus) Gill in Response to Lymphocystis Disease Virus (LCDV) Infection: Novel Insights into Fish Defense Mechanisms

PubMed Central

Wu, Ronghua; Sheng, Xiuzhen; Tang, Xiaoqian; Xing, Jing; Zhan, Wenbin

2018-01-01

Lymphocystis disease virus (LCDV) infection may induce a variety of host gene expression changes associated with disease development; however, our understanding of the molecular mechanisms underlying host-virus interactions is limited. In this study, RNA sequencing (RNA-seq) was employed to investigate differentially expressed genes (DEGs) in the gill of the flounder (Paralichthys olivaceus) at one week post LCDV infection. Transcriptome sequencing of the gill with and without LCDV infection was performed using the Illumina HiSeq 2500 platform. In total, RNA-seq analysis generated 193,225,170 clean reads aligned with 106,293 unigenes. Among them, 1812 genes were up-regulated and 1626 genes were down-regulated after LCDV infection. The DEGs related to cellular process and metabolism occupied the dominant position involved in the LCDV infection. A further function analysis demonstrated that the genes related to inflammation, the ubiquitin-proteasome pathway, cell proliferation, apoptosis, tumor formation, and anti-viral defense showed a differential expression. Several DEGs including β actin, toll-like receptors, cytokine-related genes, antiviral related genes, and apoptosis related genes were involved in LCDV entry and immune response. In addition, RNA-seq data was validated by quantitative real-time PCR. For the first time, the comprehensive gene expression study provided valuable insights into the host-pathogen interaction between flounder and LCDV. PMID:29304016
Transcriptome Analysis of Flounder (Paralichthys olivaceus) Gill in Response to Lymphocystis Disease Virus (LCDV) Infection: Novel Insights into Fish Defense Mechanisms.

PubMed

Wu, Ronghua; Sheng, Xiuzhen; Tang, Xiaoqian; Xing, Jing; Zhan, Wenbin

2018-01-05

Lymphocystis disease virus (LCDV) infection may induce a variety of host gene expression changes associated with disease development; however, our understanding of the molecular mechanisms underlying host-virus interactions is limited. In this study, RNA sequencing (RNA-seq) was employed to investigate differentially expressed genes (DEGs) in the gill of the flounder (Paralichthys olivaceus) at one week post LCDV infection. Transcriptome sequencing of the gill with and without LCDV infection was performed using the Illumina HiSeq 2500 platform. In total, RNA-seq analysis generated 193,225,170 clean reads aligned with 106,293 unigenes. Among them, 1812 genes were up-regulated and 1626 genes were down-regulated after LCDV infection. The DEGs related to cellular process and metabolism occupied the dominant position involved in the LCDV infection. A further function analysis demonstrated that the genes related to inflammation, the ubiquitin-proteasome pathway, cell proliferation, apoptosis, tumor formation, and anti-viral defense showed a differential expression. Several DEGs including β actin, toll-like receptors, cytokine-related genes, antiviral related genes, and apoptosis related genes were involved in LCDV entry and immune response. In addition, RNA-seq data was validated by quantitative real-time PCR. For the first time, the comprehensive gene expression study provided valuable insights into the host-pathogen interaction between flounder and LCDV.
ICAM-1-related long non-coding RNA: promoter analysis and expression in human retinal endothelial cells.

PubMed

Lumsden, Amanda L; Ma, Yuefang; Ashander, Liam M; Stempel, Andrew J; Keating, Damien J; Smith, Justine R; Appukuttan, Binoy

2018-05-09

Regulation of intercellular adhesion molecule (ICAM)-1 in retinal endothelial cells is a promising druggable target for retinal vascular diseases. The ICAM-1-related (ICR) long non-coding RNA stabilizes ICAM-1 transcript, increasing protein expression. However, studies of ICR involvement in disease have been limited as the promoter is uncharacterized. To address this issue, we undertook a comprehensive in silico analysis of the human ICR gene promoter region. We used genomic evolutionary rate profiling to identify a 115 base pair (bp) sequence within 500 bp upstream of the transcription start site of the annotated human ICR gene that was conserved across 25 eutherian genomes. A second constrained sequence upstream of the orthologous mouse gene (68 bp; conserved across 27 Eutherian genomes including human) was also discovered. Searching these elements identified 33 matrices predictive of binding sites for transcription factors known to be responsive to a broad range of pathological stimuli, including hypoxia, and metabolic and inflammatory proteins. Five phenotype-associated single nucleotide polymorphisms (SNPs) in the immediate vicinity of these elements included four SNPs (i.e. rs2569693, rs281439, rs281440 and rs11575074) predicted to impact binding motifs of transcription factors, and thus the expression of ICR and ICAM-1 genes, with potential to influence disease susceptibility. We verified that human retinal endothelial cells expressed ICR, and observed induction of expression by tumor necrosis factor-α.
Comprehensive transcriptome-based characterization of differentially expressed genes involved in microsporogenesis of radish CMS line and its maintainer.

PubMed

Xie, Yang; Zhang, Wei; Wang, Yan; Xu, Liang; Zhu, Xianwen; Muleke, Everlyne M; Liu, Liwang

2016-09-01

Microsporogenesis is an indispensable period for investigating microspore development and cytoplasmic male sterility (CMS) occurrence. Radish CMS line plays a critical role in elite F1 hybrid seed production and heterosis utilization. However, the molecular mechanisms of microspore development and CMS occurrence have not been thoroughly uncovered in radish. In this study, a comparative analysis of radish floral buds from a CMS line (NAU-WA) and its maintainer (NAU-WB) was conducted using next generation sequencing (NGS) technology. Digital gene expression (DGE) profiling revealed that 3504 genes were significantly differentially expressed between NAU-WA and NAU-WB library, among which 1910 were upregulated and 1594 were downregulated. Gene ontology (GO) analysis showed that these differentially expressed genes (DEGs) were mainly enriched in extracellular region, catalytic activity, and response to stimulus. KEGG enrichment analysis revealed that the DEGs were predominantly associated with flavonoid biosynthesis, glycolysis, and biosynthesis of secondary metabolites. Real-time quantitative PCR analysis showed that the expression profiles of 13 randomly selected DEGs were in high agreement with results from Illumina sequencing. Several candidate genes encoding ATP synthase, auxin response factor (ARF), transcription factors (TFs), chalcone synthase (CHS), and male sterility (MS) were responsible for microsporogenesis. Furthermore, a schematic diagram for functional interaction of DEGs from NAU-WA vs. NAU-WB library in radish plants was proposed. These results could provide new information on the dissection of the molecular mechanisms underlying microspore development and CMS occurrence in radish.
Transcriptome Profile Analysis from Different Sex Types of Ginkgo biloba L.

PubMed

Du, Shuhui; Sang, Yalin; Liu, Xiaojing; Xing, Shiyan; Li, Jihong; Tang, Haixia; Sun, Limin

2016-01-01

In plants, sex determination is a comprehensive process of correlated events, which involves genes that are differentially and/or specifically expressed in distinct developmental phases. Exploring gene expression profiles from different sex types will contribute to fully understanding sex determination in plants. In this study, we conducted RNA-sequencing of female and male buds (FB and MB) as well as ovulate strobilus and staminate strobilus (OS and SS) of Ginkgo biloba to gain insights into the genes potentially related to sex determination in this species. Approximately 60 Gb of clean reads were obtained from eight cDNA libraries. De novo assembly of the clean reads generated 108,307 unigenes with an average length of 796 bp. Among these unigenes, 51,953 (47.97%) had at least one significant match with a gene sequence in the public databases searched. A total of 4709 and 9802 differentially expressed genes (DEGs) were identified in MB vs. FB and SS vs. OS, respectively. Genes involved in plant hormone signal transduction as well as those encoding DNA methyltransferase were found to be differentially expressed between different sex types. Their potential roles in sex determination of G. biloba were discussed. Pistil-related genes were expressed in male buds while anther-specific genes were identified in female buds, suggesting that dioecism in G. biloba resulted from the selective arrest of reproductive primordia. A high correlation of expression levels was found between the RNA-Seq and quantitative real-time PCR results. The transcriptome resources that we generated allowed us to characterize gene expression profiles and examine differential expression profiles, which provided foundations for identifying functional genes associated with sex determination in G. biloba.
Transcriptome Profile Analysis from Different Sex Types of Ginkgo biloba L.

PubMed Central

Du, Shuhui; Sang, Yalin; Liu, Xiaojing; Xing, Shiyan; Li, Jihong; Tang, Haixia; Sun, Limin

2016-01-01

In plants, sex determination is a comprehensive process of correlated events, which involves genes that are differentially and/or specifically expressed in distinct developmental phases. Exploring gene expression profiles from different sex types will contribute to fully understanding sex determination in plants. In this study, we conducted RNA-sequencing of female and male buds (FB and MB) as well as ovulate strobilus and staminate strobilus (OS and SS) of Ginkgo biloba to gain insights into the genes potentially related to sex determination in this species. Approximately 60 Gb of clean reads were obtained from eight cDNA libraries. De novo assembly of the clean reads generated 108,307 unigenes with an average length of 796 bp. Among these unigenes, 51,953 (47.97%) had at least one significant match with a gene sequence in the public databases searched. A total of 4709 and 9802 differentially expressed genes (DEGs) were identified in MB vs. FB and SS vs. OS, respectively. Genes involved in plant hormone signal transduction as well as those encoding DNA methyltransferase were found to be differentially expressed between different sex types. Their potential roles in sex determination of G. biloba were discussed. Pistil-related genes were expressed in male buds while anther-specific genes were identified in female buds, suggesting that dioecism in G. biloba resulted from the selective arrest of reproductive primordia. A high correlation of expression levels was found between the RNA-Seq and quantitative real-time PCR results. The transcriptome resources that we generated allowed us to characterize gene expression profiles and examine differential expression profiles, which provided foundations for identifying functional genes associated with sex determination in G. biloba. PMID:27379148
Very Low Abundance Single-Cell Transcript Quantification with 5-Plex ddPCR™ Assays.

PubMed

Karlin-Neumann, George; Zhang, Bin; Litterst, Claudia

2018-01-01

Gene expression studies have provided one of the most accessible windows for understanding the molecular basis of cell and tissue phenotypes and how these change in response to stimuli. Current PCR-based and next generation sequencing methods offer great versatility in allowing the focused study of the roles of small numbers of genes or comprehensive profiling of the entire transcriptome of a sample at one time. Marrying of these approaches to various cell sorting technologies has recently enabled the profiling of expression in single cells, thereby increasing the resolution and sensitivity and strengthening the inferences from observed expression levels and changes. This chapter presents a quick and efficient 1-day workflow for sorting single cells with a small laboratory cell-sorter followed by an ultrahigh sensitivity, multiplexed digital PCR method for quantitative tracking of changes in 5-10 genes per single cell.
Identification of Differentially Expressed Micrornas Associate with Glucose Metabolism in Different Organs of Blunt Snout Bream (Megalobrama amblycephala)

PubMed Central

Miao, Ling-Hong; Lin, Yan; Pan, Wen-Jing; Huang, Xin; Ge, Xian-Ping; Ren, Ming-Chun; Zhou, Qun-Lan; Liu, Bo

2017-01-01

Blunt snout bream (Megalobrama amblycephala) is a widely favored herbivorous fish species and a frequently used fish model for studying metabolic physiology. This study aimed to provide a comprehensive illustration of the mechanisms of high-starch diet (HSD)-induced lipid metabolic disorder by identifying microRNA (miRNA)-controlled pathways in glucose and lipid metabolism in fish using high-throughput sequencing technologies. Small RNA libraries derived from intestines, livers, and brains of HSD and normal-starch diet (NSD) treated M. amblycephala were sequenced, and 79, 124 and 77 differentially expressed miRNAs (DEMs) were identified in intestines, livers, and brains of HSD-treated fish, respectively. Bioinformatics analyses showed that these DEMs targeted hundreds of predicted genes, which were enriched in metabolic pathways and biosynthetic processes, including peroxisome proliferator-activated receptor (PPAR), glycolysis/gluconeogenesis, and insulin signaling pathways. These analyses confirmed that miRNAs play crucial roles in glucose and lipid metabolism related to high wheat starch treatment. These results provide information for further investigation of DEM-related mechanisms dysregulated by a high-carbohydrate diet. PMID:28561770
Comparative inner ear transcriptome analysis between the Rickett's big-footed bats (Myotis ricketti) and the greater short-nosed fruit bats (Cynopterus sphinx).

PubMed

Dong, Dong; Lei, Ming; Liu, Yang; Zhang, Shuyi

2013-12-23

Bats have attracted great interest from researchers because of their advanced echolocation system. However, this highly specialized trait is not characteristic of Old World fruit bats. To comprehensively explore the molecular basis of the difference between echolocating and non-echolocating bats, we employed a sequence-based approach to compare inner ear gene expression between the Rickett's big-footed bat (Myotis ricketti, echolocating bat) and the Greater short-nosed fruit bat (Cynopterus sphinx, non-echolocating bat). De novo sequence assemblies were developed for both species. The results showed that up-regulated genes in M. ricketti were significantly over-represented in biological process categories such as 'cochlea morphogenesis', 'inner ear morphogenesis' and 'sensory perception of sound', which is consistent with the inner ear morphological and physiological differentiation between the two bat species. Moreover, the expression of the TMC1 gene confirmed its important function in echolocating bats. Our work presents the first transcriptome comparison between echolocating and non-echolocating bats, and provides information about the genetic basis of their distinct hearing traits.
The FLEXGene repository: exploiting the fruits of the genome projects by creating a needed resource to face the challenges of the post-genomic era.

PubMed

Brizuela, Leonardo; Richardson, Aaron; Marsischky, Gerald; Labaer, Joshua

2002-01-01

Thanks to the results of the multiple completed and ongoing genome sequencing projects and to the newly available recombination-based cloning techniques, it is now possible to build gene repositories with no precedent in their composition, formatting, and potential. This new type of gene repository is necessary to address the challenges imposed by the post-genomic era, i.e., experimentation on a genome-wide scale. We are building the FLEXGene (Full Length EXpression-ready) repository. This unique resource will contain clones representing the complete ORFeome of different organisms, including Homo sapiens as well as several pathogens and model organisms. It will consist of a comprehensive, characterized (sequence-verified), and arrayed gene repository. This resource will allow full exploitation of the genomic information by enabling genome-wide scale experimentation at the level of functional/phenotypic assays as well as at the level of protein expression, purification, and analysis. Here we describe the rationale and construction of this resource and focus on the data obtained from the Saccharomyces cerevisiae project.
Poly A- transcripts expressed in HeLa cells.

PubMed

Wu, Qingfa; Kim, Yeong C; Lu, Jian; Xuan, Zhenyu; Chen, Jun; Zheng, Yonglan; Zhou, Tom; Zhang, Michael Q; Wu, Chung-I; Wang, San Ming

2008-07-30

Transcripts expressed in eukaryotes are classified as poly A+ transcripts or poly A- transcripts based on the presence or absence of the 3' poly A tail. Most transcripts identified so far are poly A+ transcripts, whereas the poly A- transcripts remain largely unknown. We developed the TRD (Total RNA Detection) system for transcript identification. The system detects the transcripts through the following steps: 1) depleting the abundant ribosomal and small-size transcripts; 2) synthesizing cDNA without regard to the status of the 3' poly A tail; 3) applying the 454 sequencing technology for massive 3' EST collection from the cDNA; and 4) determining the genome origins of the detected transcripts by mapping the sequences to the human genome reference sequences. Using this system, we characterized the cytoplasmic transcripts from HeLa cells. Of the 13,467 distinct 3' ESTs analyzed, 24% are poly A-, 36% are poly A+, and 40% are bimorphic with poly A+ features but without the 3' poly A tail. Most of the poly A- 3' ESTs do not match known transcript sequences; they have a similar distribution pattern in the genome as the poly A+ and bimorphic 3' ESTs, and their mapped intergenic regions are evolutionarily conserved. Experiments confirmed the authenticity of the detected poly A- transcripts. Our study provides the first large-scale sequence evidence for the presence of poly A- transcripts in eukaryotes. The abundance of the poly A- transcripts highlights the need for comprehensive identification of these transcripts for decoding the transcriptome, annotating the genome and studying biological relevance of the poly A- transcripts.
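The three-way classification of 3' ESTs reported in the abstract above can be illustrated with a minimal Python sketch. It assumes each EST has already been reduced to two boolean flags (whether a 3' poly A tail was observed, and whether other poly A+ features are present); the flags, the function names and the toy records are hypothetical and are not part of the TRD system as published.

```python
from collections import Counter

def classify_est(has_polya_tail: bool, has_polya_features: bool) -> str:
    """Assign a 3' EST to one of the three classes named in the abstract.

    Assumption: 'bimorphic' means poly A+ features are present but no
    3' poly A tail was observed, as the abstract describes.
    """
    if has_polya_tail:
        return "polyA+"
    if has_polya_features:
        return "bimorphic"
    return "polyA-"

def class_fractions(ests):
    """Return the fraction of ESTs in each class (all zeros for empty input)."""
    counts = Counter(classify_est(tail, features) for tail, features in ests)
    total = sum(counts.values())
    labels = ("polyA-", "polyA+", "bimorphic")
    if total == 0:
        return {label: 0.0 for label in labels}
    return {label: counts[label] / total for label in labels}

# Toy records (tail observed?, poly A+ features present?); not real data.
toy_ests = [(True, True), (False, True), (False, False), (False, True)]
fractions = class_fractions(toy_ests)
```

Applied to real data, the same tally would yield the kind of percentage breakdown the abstract reports for its 13,467 distinct 3' ESTs.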
High-resolution mapping, characterization, and optimization of autonomously replicating sequences in yeast

PubMed Central

Liachko, Ivan; Youngblood, Rachel A.; Keich, Uri; Dunham, Maitreya J.

2013-01-01

DNA replication origins are necessary for the duplication of genomes. In addition, plasmid-based expression systems require DNA replication origins to maintain plasmids efficiently. The yeast autonomously replicating sequence (ARS) assay has been a valuable tool in dissecting replication origin structure and function. However, the dearth of information on origins in diverse yeasts limits the availability of efficient replication origin modules to only a handful of species and restricts our understanding of origin function and evolution. To enable rapid study of origins, we have developed a sequencing-based suite of methods for comprehensively mapping and characterizing ARSs within a yeast genome. Our approach finely maps genomic inserts capable of supporting plasmid replication and uses massively parallel deep mutational scanning to define molecular determinants of ARS function with single-nucleotide resolution. In addition to providing unprecedented detail into origin structure, our data have allowed us to design short, synthetic DNA sequences that retain maximal ARS function. These methods can be readily applied to understand and modulate ARS function in diverse systems. PMID:23241746
Comparative transcriptome analysis of the Asteraceae halophyte Karelinia caspica under salt stress.

PubMed

Zhang, Xia; Liao, Maoseng; Chang, Dan; Zhang, Fuchun

2014-12-17

Much attention has been given to the potential of halophytes as sources of tolerance traits for introduction into cereals. However, a great deal remains unknown about the diverse mechanisms employed by halophytes to cope with salinity. To characterize salt tolerance mechanisms underlying Karelinia caspica, an Asteraceae halophyte, we performed large-scale transcriptomic analysis using a high-throughput Illumina sequencing platform. Comparative gene expression analysis was performed to correlate the effects of salt stress and ABA regulation at the molecular level. Total sequence reads generated by pyrosequencing were assembled into 287,185 non-redundant transcripts with an average length of 652 bp. Using the BLAST function in the Swiss-Prot, NCBI nr, GO, KEGG, and KOG databases, a total of 216,416 coding sequences associated with known proteins were annotated. Among these, 35,533 unigenes were classified into 69 gene ontology categories, and 18,378 unigenes were classified into 202 known pathways. Based on the fold changes observed when comparing the salt stress and control samples, 60,127 unigenes were differentially expressed, with 38,122 and 22,005 up- and down-regulated, respectively. Several of the differentially expressed genes are known to be involved in the signaling pathway of the plant hormone ABA, including ABA metabolism, transport, and sensing as well as the ABA signaling cascade. Transcriptome profiling of K. caspica contributes to a comprehensive understanding of this species at the molecular level. Moreover, the global survey of differentially expressed genes in this species under salt stress and analyses of the effects of salt stress and ABA regulation will contribute to the identification and characterization of genes and molecular mechanisms underlying salt stress responses in Asteraceae plants.
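The abstract above calls unigenes up- or down-regulated based on fold changes between salt-stressed and control samples. A minimal Python sketch of such a fold-change call is given below. The pseudocount, the two-fold threshold and the function names are assumptions chosen for illustration, not the parameters used in the study, and a real analysis would also apply a statistical test.

```python
import math

def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of treated to control expression.

    The pseudocount (an assumption here) avoids division by zero
    for unigenes with no reads in one condition.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))

def classify_change(treated: float, control: float, threshold: float = 1.0) -> str:
    """Label a unigene 'up', 'down' or 'unchanged' at |log2FC| >= threshold."""
    lfc = log2_fold_change(treated, control)
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"

# Toy expression values (treated, control); not data from the study.
calls = [classify_change(t, c) for t, c in [(30, 10), (10, 30), (10, 12)]]

# Consistency check on the counts reported in the abstract.
reported_up, reported_down, reported_total = 38_122, 22_005, 60_127
counts_consistent = (reported_up + reported_down == reported_total)
```

The final check confirms that the reported up- and down-regulated counts sum to the reported total of differentially expressed unigenes.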
VitisExpDB: a database resource for grape functional genomics.

PubMed

Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L

2008-02-28

The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores approximately 320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequences and annotation features of the high-density 60-mer gene expression chip consisting of a non-redundant set of approximately 20,000 ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. The developed database provides a genomic resource to the grape community for functional analysis of genes in the collection and for grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm.
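To make the idea of a relational store for annotated ESTs concrete, the following Python sketch builds a tiny in-memory database and runs one lookup. It uses SQLite rather than the MySQL-PHP stack named in the abstract, and every table name, column name and record is a hypothetical stand-in; the actual VitisExpDB schema is not described in the abstract.

```python
import sqlite3

# In-memory database with a hypothetical two-table layout:
# one row per EST, plus annotation rows linked by est_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE est (
        est_id   TEXT PRIMARY KEY,
        species  TEXT NOT NULL,
        sequence TEXT NOT NULL
    );
    CREATE TABLE annotation (
        est_id          TEXT NOT NULL REFERENCES est(est_id),
        blast_top_match TEXT,
        go_term         TEXT
    );
    """
)

# Toy records; identifiers, sequences and annotations are invented.
conn.executemany(
    "INSERT INTO est VALUES (?, ?, ?)",
    [
        ("EST_A", "Vitis vinifera", "ACGTACGT"),
        ("EST_B", "Vitis vinifera", "TTGACCAA"),
        ("EST_C", "Vitis hybrid", "GGCATGCA"),
    ],
)
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?)",
    [
        ("EST_A", "toy protein 1", "GO:toy_transport"),
        ("EST_B", "toy protein 2", "GO:toy_signaling"),
        ("EST_C", "toy protein 1", "GO:toy_transport"),
    ],
)

def ests_for_go_term(connection: sqlite3.Connection, go_term: str):
    """Return (est_id, species, blast_top_match) rows annotated with go_term."""
    query = (
        "SELECT e.est_id, e.species, a.blast_top_match "
        "FROM est AS e JOIN annotation AS a ON a.est_id = e.est_id "
        "WHERE a.go_term = ? ORDER BY e.est_id"
    )
    return connection.execute(query, (go_term,)).fetchall()

transport_ests = ests_for_go_term(conn, "GO:toy_transport")
```

Keeping annotations in a separate table linked by EST identifier is one common way to let a single EST carry several annotation rows; whether VitisExpDB does this is not stated in the abstract.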
Resources and Recommendations for Using Transcriptomics to Address Grand Challenges in Comparative Biology.

PubMed

Mykles, Donald L; Burnett, Karen G; Durica, David S; Joyce, Blake L; McCarthy, Fiona M; Schmidt, Carl J; Stillman, Jonathon H

2016-12-01

High-throughput RNA sequencing (RNA-seq) technology has become an important tool for studying physiological responses of organisms to changes in their environment. De novo assembly of RNA-seq data has allowed researchers to create a comprehensive catalog of genes expressed in a tissue and to quantify their expression without a complete genome sequence. The contributions from the "Tapping the Power of Crustacean Transcriptomics to Address Grand Challenges in Comparative Biology" symposium in this issue show the successes and limitations of using RNA-seq in the study of crustaceans. In conjunction with the symposium, the Animal Genome to Phenome Research Coordination Network collated comments from participants at the meeting regarding the challenges encountered when using transcriptomics in their research. Input came from novices and experts ranging from graduate students to principal investigators. Many were unaware of the bioinformatics analysis resources currently available on the CyVerse platform. Our analysis of community responses led to three recommendations for advancing the field: (1) integration of genomic and RNA-seq sequence assemblies for crustacean gene annotation and comparative expression; (2) development of methodologies for the functional analysis of genes; and (3) information and training exchange among laboratories for transmission of best practices. The field lacks the methods for manipulating tissue-specific gene expression. The decapod crustacean research community should consider the cherry shrimp, Neocaridina denticulata, as a decapod model for the application of transgenic tools for functional genomics. This would require a multi-investigator effort. © The Author 2016. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
VitisExpDB: A database resource for grape functional genomics

PubMed Central

Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L

2008-01-01

Background: The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. Description: VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores ~320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties, along with information on their percent nucleotide identities, phylogenetic relationship and common primers, can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of a non-redundant set of ~20,000 ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database, along with other sequence analysis tools, have been added. In addition, users can submit their ESTs to the database. Conclusion: The developed database provides a genomic resource to the grape community for functional analysis of genes in the collection and for grape genome annotation and gene function identification. The VitisExpDB database is available through our website. PMID:18307813
Deep Sequencing Reveals Uncharted Isoform Heterogeneity of the Protein-Coding Transcriptome in Cerebral Ischemia.

PubMed

Bhattarai, Sunil; Aly, Ahmed; Garcia, Kristy; Ruiz, Diandra; Pontarelli, Fabrizio; Dharap, Ashutosh

2018-06-03

Gene expression in cerebral ischemia has been a subject of intense investigations for several years. Studies utilizing probe-based high-throughput methodologies such as microarrays have contributed significantly to our existing knowledge but lacked the capacity to dissect the transcriptome in detail. Genome-wide RNA-sequencing (RNA-seq) enables comprehensive examinations of transcriptomes for attributes such as strandedness, alternative splicing, alternative transcription start/stop sites, and sequence composition, thus providing a very detailed account of gene expression. Leveraging this capability, we conducted an in-depth, genome-wide evaluation of the protein-coding transcriptome of the adult mouse cortex after transient focal ischemia at 6, 12, or 24 h of reperfusion using RNA-seq. We identified a total of 1007 transcripts at 6 h, 1878 transcripts at 12 h, and 1618 transcripts at 24 h of reperfusion that were significantly altered as compared to sham controls. With isoform-level resolution, we identified 23 splice variants arising from 23 genes that were novel mRNA isoforms. For a subset of genes, we detected reperfusion time-point-dependent splice isoform switching, indicating an expression and/or functional switch for these genes. Finally, for 286 genes across all three reperfusion time-points, we discovered multiple, distinct, simultaneously expressed and differentially altered isoforms per gene that were generated via alternative transcription start/stop sites. Of these, 165 isoforms derived from 109 genes were novel mRNAs. Together, our data unravel the protein-coding transcriptome of the cerebral cortex at an unprecedented depth to provide several new insights into the flexibility and complexity of stroke-related gene transcription and transcript organization.
Temporality of Features in Near-Death Experience Narratives

PubMed Central

Martial, Charlotte; Cassol, Héléna; Antonopoulos, Georgios; Charlier, Thomas; Heros, Julien; Donneau, Anne-Françoise; Charland-Verville, Vanessa; Laureys, Steven

2017-01-01

Background: After an occurrence of a Near-Death Experience (NDE), Near-Death Experiencers (NDErs) usually report extremely rich and detailed narratives. Phenomenologically, an NDE can be described as a set of distinguishable features. Some authors have proposed regular patterns of NDEs; however, the actual temporality sequence of NDE core features remains a little-explored area. Objectives: The aim of the present study was to investigate the frequency distribution of these features (globally and according to the position of features in narratives) as well as the most frequently reported temporality sequences of features. Methods: We collected 154 French freely expressed written NDE narratives (i.e., Greyson NDE scale total score ≥ 7/32). A text analysis was conducted on all narratives in order to infer temporal ordering and frequency distribution of NDE features. Results: Our analyses highlighted the following most frequently reported sequence of consecutive NDE features: Out-of-Body Experience, Experiencing a tunnel, Seeing a bright light, Feeling of peace. Yet, this sequence was encountered in a very limited number of NDErs. Conclusion: These findings may suggest that NDE temporality sequences can vary across NDErs. Exploring associations and relationships among features encountered during NDEs may complete the rigorous definition and scientific comprehension of the phenomenon. PMID:28659779
Temporality of Features in Near-Death Experience Narratives.

PubMed

Martial, Charlotte; Cassol, Héléna; Antonopoulos, Georgios; Charlier, Thomas; Heros, Julien; Donneau, Anne-Françoise; Charland-Verville, Vanessa; Laureys, Steven

2017-01-01

Background: After an occurrence of a Near-Death Experience (NDE), Near-Death Experiencers (NDErs) usually report extremely rich and detailed narratives. Phenomenologically, an NDE can be described as a set of distinguishable features. Some authors have proposed regular patterns of NDEs; however, the actual temporality sequence of NDE core features remains a little-explored area. Objectives: The aim of the present study was to investigate the frequency distribution of these features (globally and according to the position of features in narratives) as well as the most frequently reported temporality sequences of features. Methods: We collected 154 French freely expressed written NDE narratives (i.e., Greyson NDE scale total score ≥ 7/32). A text analysis was conducted on all narratives in order to infer temporal ordering and frequency distribution of NDE features. Results: Our analyses highlighted the following most frequently reported sequence of consecutive NDE features: Out-of-Body Experience, Experiencing a tunnel, Seeing a bright light, Feeling of peace. Yet, this sequence was encountered in a very limited number of NDErs. Conclusion: These findings may suggest that NDE temporality sequences can vary across NDErs. Exploring associations and relationships among features encountered during NDEs may complete the rigorous definition and scientific comprehension of the phenomenon.

Comprehensive analysis of gene expression patterns in Friedreich's ataxia fibroblasts by RNA sequencing reveals altered levels of protein synthesis factors and solute carriers

PubMed Central

Li, Yanjie; Lu, Yue; Lin, Kevin; Hauser, Lauren A.; Lynch, David R.

2017-01-01

ABSTRACT Friedreich's ataxia (FRDA) is an autosomal recessive neurodegenerative disease usually caused by large homozygous expansions of GAA repeat sequences in intron 1 of the frataxin (FXN) gene. FRDA patients homozygous for GAA expansions have low FXN mRNA and protein levels when compared with heterozygous carriers or healthy controls. Frataxin is a mitochondrial protein involved in iron–sulfur cluster synthesis, and many FRDA phenotypes result from deficiencies in cellular metabolism due to lowered expression of FXN. Presently, there is no effective treatment for FRDA, and biomarkers to measure therapeutic trial outcomes and/or to gauge disease progression are lacking. Peripheral tissues, including blood cells, buccal cells and skin fibroblasts, can readily be isolated from FRDA patients and used to define molecular hallmarks of disease pathogenesis. For instance, FXN mRNA and protein levels as well as FXN GAA-repeat tract lengths are routinely determined using all of these cell types. However, because these tissues are not directly involved in disease pathogenesis, their relevance as models of the molecular aspects of the disease is yet to be decided. Herein, we conducted unbiased RNA sequencing to profile the transcriptomes of fibroblast cell lines derived from 18 FRDA patients and 17 unaffected control individuals. Bioinformatic analyses revealed significantly upregulated expression of genes encoding plasma membrane solute carrier proteins in FRDA fibroblasts. Conversely, the expression of genes encoding accessory factors and enzymes involved in cytoplasmic and mitochondrial protein synthesis was consistently decreased in FRDA fibroblasts. Finally, comparison of genes differentially expressed in FRDA fibroblasts to three previously published gene expression signatures defined for FRDA blood cells showed substantial overlap between the independent datasets, including correspondingly deficient expression of antioxidant defense genes. 
Together, these results indicate that gene expression profiling of cells derived from peripheral tissues can, in fact, consistently reveal novel molecular pathways of the disease. When performed on statistically meaningful sample group sizes, unbiased global profiling analyses utilizing peripheral tissues are critical for the discovery and validation of FRDA disease biomarkers. PMID:29125828
COMAN: a web server for comprehensive metatranscriptomics analysis.

PubMed

Ni, Yueqiong; Li, Jun; Panagiotou, Gianni

2016-08-11

Microbiota-oriented studies based on metagenomic or metatranscriptomic sequencing have revolutionised our understanding of microbial ecology and the roles of both clinical and environmental microbes. The analysis of massive metatranscriptomic data requires extensive computational resources, a collection of bioinformatics tools and expertise in programming. We developed COMAN (Comprehensive Metatranscriptomics Analysis), a web-based tool dedicated to automatically and comprehensively analysing metatranscriptomic data. The COMAN pipeline includes quality control of raw reads, removal of reads derived from non-coding RNA, followed by functional annotation, comparative statistical analysis, pathway enrichment analysis, co-expression network analysis and high-quality visualisation. The essential data generated by COMAN are also provided in tabular format for additional analysis and integration with other software. The web server has an easy-to-use interface and detailed instructions, and is freely available at http://sbb.hku.hk/COMAN/. CONCLUSIONS: COMAN is an integrated web server dedicated to comprehensive functional analysis of metatranscriptomic data, translating massive amounts of reads to data tables and high-standard figures. It is expected to help researchers with less expertise in bioinformatics answer microbiota-related biological questions and to increase the accessibility and interpretation of microbiota RNA-Seq data.
(Pea)nuts and bolts of visual narrative: Structure and meaning in sequential image comprehension

PubMed Central

Cohn, Neil; Paczynski, Martin; Jackendoff, Ray; Holcomb, Phillip J.; Kuperberg, Gina R.

2012-01-01

Just as syntax differentiates coherent sentences from scrambled word strings, the comprehension of sequential images must also use a cognitive system to distinguish coherent narrative sequences from random strings of images. We conducted experiments analogous to two classic studies of language processing to examine the contributions of narrative structure and semantic relatedness to processing sequential images. We compared four types of comic strips: 1) Normal sequences with both structure and meaning, 2) Semantic Only sequences (in which the panels were related to a common semantic theme, but had no narrative structure), 3) Structural Only sequences (narrative structure but no semantic relatedness), and 4) Scrambled sequences of randomly-ordered panels. In Experiment 1, participants monitored for target panels in sequences presented panel-by-panel. Reaction times were slowest to panels in Scrambled sequences, intermediate in both Structural Only and Semantic Only sequences, and fastest in Normal sequences. This suggests that both semantic relatedness and narrative structure offer advantages to processing. Experiment 2 measured ERPs to all panels across the whole sequence. The N300/N400 was largest to panels in both the Scrambled and Structural Only sequences, intermediate in Semantic Only sequences and smallest in the Normal sequences. This implies that a combination of narrative structure and semantic relatedness can facilitate semantic processing of upcoming panels (as reflected by the N300/N400). Also, panels in the Scrambled sequences evoked a larger left-lateralized anterior negativity than panels in the Structural Only sequences. This localized effect was distinct from the N300/N400, and appeared despite the fact that these two sequence types were matched on local semantic relatedness between individual panels. These findings suggest that sequential image comprehension uses a narrative structure that may be independent of semantic relatedness. 
Altogether, we argue that the comprehension of visual narrative is guided by an interaction between structure and meaning. PMID:22387723
Investigating the molecular underpinnings underlying morphology and changes in carbon partitioning during tension wood formation in Eucalyptus.

PubMed

Mizrachi, Eshchar; Maloney, Victoria J; Silberbauer, Janine; Hefer, Charles A; Berger, Dave K; Mansfield, Shawn D; Myburg, Alexander A

2015-06-01

Tension wood has distinct physical and chemical properties, including altered fibre properties, cell wall composition and ultrastructure. It serves as a good system for investigating the genetic regulation of secondary cell wall biosynthesis and wood formation. The reference genome sequence for Eucalyptus grandis allows investigation of the global transcriptional reprogramming that accompanies tension wood formation in this global wood fibre crop. We report the first comprehensive analysis of physicochemical wood property changes in tension wood of Eucalyptus measured in a hybrid (E. grandis × Eucalyptus urophylla) clone, as well as genome-wide gene expression changes in xylem tissues 3 wk post-induction using RNA sequencing. We found that Eucalyptus tension wood in field-grown trees is characterized by an increase in cellulose, a reduction in lignin, xylose and mannose, and a marked increase in galactose. Gene expression profiling in tension wood-forming tissue showed corresponding down-regulation of monolignol biosynthetic genes, and differential expression of several carbohydrate active enzymes. We conclude that alterations of cell wall traits induced by tension wood formation in Eucalyptus are a consequence of a combination of down-regulation of lignin biosynthesis and hemicellulose remodelling, rather than the often proposed up-regulation of the cellulose biosynthetic pathway. © 2014 University of Pretoria New Phytologist © 2014 New Phytologist Trust.
A bioinformatics prediction approach towards analyzing the glycosylation, co-expression and interaction patterns of epithelial membrane antigen (EMA/MUC1)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kalra, Rajkumar S.; Wadhwa, Renu

2015-02-27

Epithelial membrane antigen (EMA or MUC1) is a heavily glycosylated, type I transmembrane glycoprotein commonly expressed by epithelial cells of duct organs. It has been shown to be aberrantly glycosylated in several diseases including cancer. Protein sequence based annotation and analysis of the glycosylation profile of glycoproteins by robust and comprehensive computational algorithms provides possible insights into the mechanism(s) of anomalous glycosylation. In the present report, using a number of bioinformatics applications, we studied EMA/MUC1 and explored its trans-membrane structural domain sequence that is widely subjected to glycosylation. Exploration of different extracellular motifs led to prediction of N- and O-linked glycosylation target sites. Based on the putative O-linked target sites, glycosylated moieties and pathways were envisaged. Furthermore, protein network analysis demonstrated physical interaction of EMA with a number of proteins and confirmed its functional involvement in cell growth and proliferation pathways. Gene Ontology analysis suggested an involvement of EMA in a number of functions including signal transduction, protein binding, processing and transport along with glycosylation. Thus, the present study explored the potential of a bioinformatics prediction approach in analyzing glycosylation, co-expression and interaction patterns of the EMA/MUC1 glycoprotein.
Robustly detecting differential expression in RNA sequencing data using observation weights

PubMed Central

Zhou, Xiaobei; Lindsay, Helen; Robinson, Mark D.

2014-01-01

A popular approach for comparing gene expression levels between (replicated) conditions of RNA sequencing data relies on counting reads that map to features of interest. Within such count-based methods, many flexible and advanced statistical approaches now exist and offer the ability to adjust for covariates (e.g. batch effects). Often, these methods include some sort of ‘sharing of information’ across features to improve inferences in small samples. It is important to achieve an appropriate tradeoff between statistical power and protection against outliers. Here, we study the robustness of existing approaches for count-based differential expression analysis and propose a new strategy based on observation weights that can be used within existing frameworks. The results suggest that outliers can have a global effect on differential analyses. We demonstrate the effectiveness of our new approach with real data and simulated data that reflects properties of real datasets (e.g. dispersion-mean trend) and develop an extensible framework for comprehensive testing of current and future methods. In addition, we explore the origin of such outliers, in some cases highlighting additional biological or technical factors within the experiment. Further details can be downloaded from the project website: http://imlspenticton.uzh.ch/robinson_lab/edgeR_robust/. PMID:24753412
Assessment of imprinting- and genetic variation-dependent monoallelic expression using reciprocal allele descendants between human family trios.

PubMed

Chuang, Trees-Juen; Tseng, Yu-Hsiang; Chen, Chia-Ying; Wang, Yi-Da

2017-08-01

Genomic imprinting is an important epigenetic process that silences one of the parentally-inherited alleles of a gene and thereby exhibits allelic-specific expression (ASE). Detection of human imprinting events is hampered by the infeasibility of the reciprocal mating system in humans and the removal of ASE events arising from non-imprinting factors. Here, we describe a pipeline with the pattern of reciprocal allele descendants (RADs) through genotyping and transcriptome sequencing data across independent parent-offspring trios to discriminate between varied types of ASE (e.g., imprinting, genetic variation-dependent ASE, and random monoallelic expression (RME)). We show that the vast majority of ASE events are due to sequence-dependent genetic variants, which are evolutionarily conserved and may themselves play a cis-regulatory role. In particular, 74% of non-RAD ASE events, even though they exhibit ASE biases toward the same parentally-inherited allele across different individuals, are derived from genetic variation rather than imprinting. We further show that the RME effect may affect the effectiveness of the population-based method for detecting imprinting events and that our pipeline can help to distinguish between these two ASE types. Taken together, this study provides a good indicator for categorization of different types of ASE, opening up this widespread and complex mechanism for comprehensive characterization.
A Graph-Centric Approach for Metagenome-Guided Peptide and Protein Identification in Metaproteomics

PubMed Central

Tang, Haixu; Li, Sujun; Ye, Yuzhen

2016-01-01

Metaproteomic studies adopt the common bottom-up proteomics approach to investigate the protein composition and the dynamics of protein expression in microbial communities. When matched metagenomic and/or metatranscriptomic data of the microbial communities are available, metaproteomic data analyses often employ a metagenome-guided approach, in which complete or fragmental protein-coding genes are first directly predicted from metagenomic (and/or metatranscriptomic) sequences or from their assemblies, and the resulting protein sequences are then used as the reference database for peptide/protein identification from MS/MS spectra. This approach is often limited because protein coding genes predicted from metagenomes are incomplete and fragmental. In this paper, we present a graph-centric approach to improving metagenome-guided peptide and protein identification in metaproteomics. Our method exploits the de Bruijn graph structure reported by metagenome assembly algorithms to generate a comprehensive database of protein sequences encoded in the community. We tested our method using several public metaproteomic datasets with matched metagenomic and metatranscriptomic sequencing data acquired from complex microbial communities in a biological wastewater treatment plant. The results showed that many more peptides and proteins can be identified when assembly graphs were utilized, improving the characterization of the proteins expressed in the microbial communities. The additional proteins we identified contribute to the characterization of important pathways such as those involved in degradation of chemical hazards. Our tools are released as open-source software on github at https://github.com/COL-IU/Graph2Pro. PMID:27918579
Genome-wide genetic variation and comparison of fruit-associated traits between kumquat (Citrus japonica) and Clementine mandarin (Citrus clementina).

PubMed

Liu, Tian-Jia; Li, Yong-Ping; Zhou, Jing-Jing; Hu, Chun-Gen; Zhang, Jin-Zhi

2018-03-01

The comprehensive genetic variation of two citrus species was analyzed at the genome and transcriptome levels. A total of 1090 differentially expressed genes were found during fruit development by RNA-sequencing. Fruit size (fruit equatorial diameter) and weight (fresh weight) are the two most important components determining yield and consumer acceptability for many horticultural crops. However, little is known about the genetic control of these traits. Here, we performed whole-genome resequencing to reveal the comprehensive genetic variation of the fruit development between kumquat (Citrus japonica) and Clementine mandarin (Citrus clementina). In total, 5,865,235 single-nucleotide polymorphisms (SNPs) and 414,447 insertions/deletions (InDels) were identified in the two citrus species. Based on integrative analysis of genome and transcriptome of fruit, 640,801 SNPs and 20,733 InDels were identified. The features, genomic distribution, functional effect, and other characteristics of these genetic variations were explored. RNA-sequencing identified 1090 differentially expressed genes (DEGs) during fruit development of kumquat and Clementine mandarin. Gene Ontology revealed that these genes were involved in various molecular functions and biological processes. In addition, the genetic variation of 939 DEGs and 74 multiple fruit development pathway genes from previous reports were also identified. A global survey identified 24,237 specific alternative splicing events in the two citrus species and showed that intron retention is the most prevalent pattern of alternative splicing. These genome variation data provide a foundation for further exploration of citrus diversity and gene-phenotype relationships and for future research on molecular breeding to improve kumquat, Clementine mandarin and related species.
Challenges of coverage policy development for next-generation tumor sequencing panels: experts and payers weigh in.

PubMed

Trosman, Julia R; Weldon, Christine B; Kelley, R Kate; Phillips, Kathryn A

2015-03-01

Next-generation tumor sequencing (NGTS) panels, which include multiple established and novel targets across cancers, are emerging in oncology practice, but lack formal positive coverage by US payers. Lack of coverage may impact access and adoption. This study identified challenges of NGTS coverage by private payers. We conducted semi-structured interviews with 14 NGTS experts on potential NGTS benefits, and with 10 major payers, representing more than 125,000,000 enrollees, on NGTS coverage considerations. We used the framework approach of qualitative research for study design and thematic analyses and simple frequencies to further describe findings. All interviewed payers see potential NGTS benefits, but all noted challenges to formal coverage: 80% state that inherent features of NGTS do not fit the medical necessity definition required for coverage, 70% view NGTS as a bundle of targets versus comprehensive tumor characterization and may evaluate each target individually, and 70% express skepticism regarding new evidence methods proposed for NGTS. Fifty percent of payers expressed sufficient concerns about NGTS adoption and implementation that will preclude their ability to issue positive coverage policies. Payers perceive that NGTS holds significant promise but, in its current form, poses disruptive challenges to coverage policy frameworks. Proactive multidisciplinary efforts to define the direction for NGTS development, evidence generation, and incorporation into coverage policy are necessary to realize its promise and provide patient access. This study contributes to current literature, as possibly the first study to directly interview US payers on NGTS coverage and reimbursement. Copyright © 2015 by the National Comprehensive Cancer Network.
De novo transcriptome sequencing of Acer palmatum and comprehensive analysis of differentially expressed genes under salt stress in two contrasting genotypes.

PubMed

Rong, Liping; Li, Qianzhong; Li, Shushun; Tang, Ling; Wen, Jing

2016-04-01

Maple (Acer palmatum) is an important species for landscape planting worldwide. Salt stress affects the normal growth of the Maple leaf directly, leading to loss of esthetic value. However, the limited availability of Maple genomic information has hindered research on the mechanisms underlying salt tolerance. In this study, we performed comprehensive analyses of the salt tolerance in two genotypes of Maple using RNA-seq. Approximately 146.4 million paired-end reads, representing 181,769 unigenes, were obtained. The N50 length of the unigenes was 738 bp, and their total length was over 102.66 Mb. A total of 14,090 simple sequence repeats and over 500,000 single nucleotide polymorphisms were identified, which represent useful resources for marker development. Importantly, 181,769 genes were detected in at least one library, and 303 differentially expressed genes (DEGs) were identified between salt-sensitive and salt-tolerant genotypes. Among these DEGs, 125 were upregulated and 178 were downregulated. Two MYB-related proteins and one LEA protein were detected among the 10 most downregulated genes. Moreover, a methyltransferase-related gene was detected among the 10 most upregulated genes. The three most significantly enriched pathways were plant hormone signal transduction, arginine and proline metabolism, and photosynthesis. The transcriptome analysis provided a rich genetic resource for gene discovery related to salt tolerance in Maple and in closely related species. The data will serve as an important public information platform to further our understanding of the molecular mechanisms involved in salt tolerance in Maple.
Mutation allele burden remains unchanged in chronic myelomonocytic leukaemia responding to hypomethylating agents

DOE PAGES

Merlevede, Jane; Droin, Nathalie; Qin, Tingting; ...

2016-02-24

The cytidine analogues azacytidine and 5-aza-2’-deoxycytidine (decitabine) are commonly used to treat myelodysplastic syndromes, with or without a myeloproliferative component. It remains unclear whether the response to these hypomethylating agents results from a cytotoxic or an epigenetic effect. In this study, we address this question in chronic myelomonocytic leukaemia. We describe a comprehensive analysis of the mutational landscape of these tumours, combining whole-exome and whole-genome sequencing. We identify an average of 14 ± 5 somatic mutations in coding sequences of sorted monocyte DNA and the signatures of three mutational processes. Serial sequencing demonstrates that the response to hypomethylating agents is associated with changes in DNA methylation and gene expression, without any decrease in the mutation allele burden, nor prevention of new genetic alteration occurrence. Lastly, our findings indicate that cytosine analogues restore a balanced haematopoiesis without decreasing the size of the mutated clone, arguing for a predominantly epigenetic effect.
Mutation allele burden remains unchanged in chronic myelomonocytic leukaemia responding to hypomethylating agents

PubMed Central

Merlevede, Jane; Droin, Nathalie; Qin, Tingting; Meldi, Kristen; Yoshida, Kenichi; Morabito, Margot; Chautard, Emilie; Auboeuf, Didier; Fenaux, Pierre; Braun, Thorsten; Itzykson, Raphael; de Botton, Stéphane; Quesnel, Bruno; Commes, Thérèse; Jourdan, Eric; Vainchenker, William; Bernard, Olivier; Pata-Merci, Noemie; Solier, Stéphanie; Gayevskiy, Velimir; Dinger, Marcel E.; Cowley, Mark J.; Selimoglu-Buet, Dorothée; Meyer, Vincent; Artiguenave, François; Deleuze, Jean-François; Preudhomme, Claude; Stratton, Michael R.; Alexandrov, Ludmil B.; Padron, Eric; Ogawa, Seishi; Koscielny, Serge; Figueroa, Maria; Solary, Eric

2016-01-01

The cytidine analogues azacytidine and 5-aza-2'-deoxycytidine (decitabine) are commonly used to treat myelodysplastic syndromes, with or without a myeloproliferative component. It remains unclear whether the response to these hypomethylating agents results from a cytotoxic or an epigenetic effect. In this study, we address this question in chronic myelomonocytic leukaemia. We describe a comprehensive analysis of the mutational landscape of these tumours, combining whole-exome and whole-genome sequencing. We identify an average of 14±5 somatic mutations in coding sequences of sorted monocyte DNA and the signatures of three mutational processes. Serial sequencing demonstrates that the response to hypomethylating agents is associated with changes in DNA methylation and gene expression, without any decrease in the mutation allele burden, nor prevention of new genetic alteration occurrence. Our findings indicate that cytosine analogues restore a balanced haematopoiesis without decreasing the size of the mutated clone, arguing for a predominantly epigenetic effect. PMID:26908133
Identification of reference genes for quantitative expression analysis using large-scale RNA-seq data of Arabidopsis thaliana and model crop plants.

PubMed

Kudo, Toru; Sasaki, Yohei; Terashima, Shin; Matsuda-Imai, Noriko; Takano, Tomoyuki; Saito, Misa; Kanno, Maasa; Ozaki, Soichi; Suwabe, Keita; Suzuki, Go; Watanabe, Masao; Matsuoka, Makoto; Takayama, Seiji; Yano, Kentaro

2016-10-13

In quantitative gene expression analysis, normalization using a reference gene as an internal control is frequently performed for appropriate interpretation of the results. Efforts have been devoted to exploring superior novel reference genes using microarray transcriptomic data and to evaluating commonly used reference genes by targeting analysis. However, because the number of specifically detectable genes is totally dependent on probe design in the microarray analysis, exploration using microarray data may miss some of the best choices for the reference genes. Recently emerging RNA sequencing (RNA-seq) provides an ideal resource for comprehensive exploration of reference genes since this method is capable of detecting all expressed genes, in principle including even unknown genes. We report the results of a comprehensive exploration of reference genes using public RNA-seq data from plants such as Arabidopsis thaliana (Arabidopsis), Glycine max (soybean), Solanum lycopersicum (tomato) and Oryza sativa (rice). To select reference genes suitable for the broadest experimental conditions possible, candidates were surveyed by the following four steps: (1) evaluation of the basal expression level of each gene in each experiment; (2) evaluation of the expression stability of each gene in each experiment; (3) evaluation of the expression stability of each gene across the experiments; and (4) selection of top-ranked genes, after ranking according to the number of experiments in which the gene was expressed stably. Employing this procedure, 13, 10, 12 and 21 top candidates for reference genes were proposed in Arabidopsis, soybean, tomato and rice, respectively. Microarray expression data confirmed that the expression of the proposed reference genes under broad experimental conditions was more stable than that of commonly used reference genes. 
These novel reference genes will be useful for analyzing gene expression profiles across experiments carried out under various experimental conditions.
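The four-step survey described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the stability metric or any thresholds, so the coefficient of variation (CV), the cut-offs `min_mean` and `max_cv`, the function name `rank_reference_candidates`, and the toy data are all hypothetical.

```python
from statistics import mean, pstdev


def rank_reference_candidates(experiments, min_mean=10.0, max_cv=0.2, top_n=3):
    """Rank candidate reference genes.

    experiments maps an experiment name to a dict of gene -> list of
    expression values. CV and both thresholds are assumptions made for
    this sketch, not the published method.
    """
    stable_counts = {}
    experiment_means = {}
    for samples_by_gene in experiments.values():
        for gene, values in samples_by_gene.items():
            avg = mean(values)
            # Step 1: basal expression level within the experiment
            if avg < min_mean:
                continue
            # Step 2: expression stability within the experiment (CV)
            if pstdev(values) / avg > max_cv:
                continue
            stable_counts[gene] = stable_counts.get(gene, 0) + 1
            experiment_means.setdefault(gene, []).append(avg)

    candidates = []
    for gene, means in experiment_means.items():
        # Step 3: stability across experiments (CV of per-experiment means)
        across_cv = pstdev(means) / mean(means)
        if across_cv <= max_cv:
            candidates.append((gene, stable_counts[gene], across_cv))

    # Step 4: rank by the number of experiments with stable expression,
    # breaking ties by across-experiment CV and then by gene name
    candidates.sort(key=lambda c: (-c[1], c[2], c[0]))
    return [gene for gene, _, _ in candidates[:top_n]]


# Toy data (illustrative values only)
toy_experiments = {
    "exp1": {"geneA": [100, 102, 98], "geneB": [50, 150, 100], "geneC": [1, 2, 1]},
    "exp2": {"geneA": [101, 99, 100], "geneB": [90, 95, 100], "geneC": [200, 210, 190]},
}
print(rank_reference_candidates(toy_experiments))
```

In the toy data only geneA is both sufficiently expressed and stable in every experiment, so it ranks first; the other two genes each pass in a single experiment.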
Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars

PubMed Central

2012-01-01

Background Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, given high demand for roses, rose breeding programs are limited in molecular resources which can greatly enhance and speed breeding efforts. A better understanding of important genes that contribute to important floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database to expound upon current knowledge regarding regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. Results We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: ‘Vital’, ‘Maroussia’, and ‘Sympathy’ and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among identified miRNAs, 25 of them were novel and 242 of them were conserved miRNAs. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. Conclusions In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. 
The database provides a comprehensive genetic resource which can be used to better understand rose flower development and to identify candidate genes for important phenotypes. PMID:23171001
Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars.

PubMed

Kim, Jungeun; Park, June Hyun; Lim, Chan Ju; Lim, Jae Yun; Ryu, Jee-Youn; Lee, Bong-Woo; Choi, Jae-Pil; Kim, Woong Bom; Lee, Ha Yeon; Choi, Yourim; Kim, Donghyun; Hur, Cheol-Goo; Kim, Sukweon; Noh, Yoo-Sun; Shin, Chanseok; Kwon, Suk-Yoon

2012-11-21

Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, given high demand for roses, rose breeding programs are limited in molecular resources which can greatly enhance and speed breeding efforts. A better understanding of important genes that contribute to important floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database to expound upon current knowledge regarding regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: 'Vital', 'Maroussia', and 'Sympathy' and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among identified miRNAs, 25 of them were novel and 242 of them were conserved miRNAs. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. 
The database provides a comprehensive genetic resource which can be used to better understand rose flower development and to identify candidate genes for important phenotypes.
Comprehensive analysis of MHC class I genes from the U-, S-, and Z-lineages in Atlantic salmon.

PubMed

Lukacs, Morten F; Harstad, Håvard; Bakke, Hege G; Beetz-Sargent, Marianne; McKinnel, Linda; Lubieniecki, Krzysztof P; Koop, Ben F; Grimholt, Unni

2010-03-05

We have previously sequenced more than 500 kb of the duplicated MHC class I regions in Atlantic salmon. In the IA region we identified the loci for the MHC class I gene Sasa-UBA in addition to a soluble MHC class I molecule, Sasa-ULA. A pseudolocus for Sasa-UCA was identified in the nonclassical IB region. Both regions contained genes for antigen presentation, as well as orthologues to other genes residing in the human MHC region. The genomic localisation of two MHC class I lineages (Z and S) has been resolved. 7 BACs were sequenced using a combination of standard Sanger and 454 sequencing. The new sequence data extended the IA region with 150 kb identifying the location of one Z-lineage locus, ZAA. The IB region was extended with 350 kb including three new Z-lineage loci, ZBA, ZCA and ZDA in addition to a UGA locus. An allelic version of the IB region contained a functional UDA locus in addition to the UCA pseudolocus. Additionally a BAC harbouring two MHC class I genes (UHA) was placed on linkage group 14, while a BAC containing the S-lineage locus SAA (previously known as UAA) was placed on LG10. Gene expression studies showed limited expression range for all class I genes with the exception of UBA being dominantly expressed in gut, spleen and gills, and ZAA with high expression in blood. Here we describe the genomic organization of MHC class I loci from the U-, Z-, and S-lineages in Atlantic salmon. Nine of the described class I genes are located in the extension of the duplicated IA and IB regions, while three class I genes are found on two separate linkage groups. The gene organization of the two regions indicates that the IB region is evolving at a different pace than the IA region. Expression profiling, polymorphic content, peptide binding properties and phylogenetic relationship show that Atlantic salmon has only one MHC class Ia gene (UBA), in addition to a multitude of nonclassical MHC class I genes from the U-, S- and Z-lineages.
Next-generation sequencing facilitates quantitative analysis of wild-type and Nrl−/− retinal transcriptomes

PubMed Central

Brooks, Matthew J.; Rajasimha, Harsha K.; Roger, Jerome E.

2011-01-01

Purpose Next-generation sequencing (NGS) has revolutionized systems-based analysis of cellular pathways. The goals of this study are to compare NGS-derived retinal transcriptome profiling (RNA-seq) to microarray and quantitative reverse transcription polymerase chain reaction (qRT–PCR) methods and to evaluate protocols for optimal high-throughput data analysis. Methods Retinal mRNA profiles of 21-day-old wild-type (WT) and neural retina leucine zipper knockout (Nrl−/−) mice were generated by deep sequencing, in triplicate, using Illumina GAIIx. The sequence reads that passed quality filters were analyzed at the transcript isoform level with two methods: Burrows–Wheeler Aligner (BWA) followed by ANOVA (ANOVA) and TopHat followed by Cufflinks. qRT–PCR validation was performed using TaqMan and SYBR Green assays. Results Using an optimized data analysis workflow, we mapped about 30 million sequence reads per sample to the mouse genome (build mm9) and identified 16,014 transcripts in the retinas of WT and Nrl−/− mice with BWA workflow and 34,115 transcripts with TopHat workflow. RNA-seq data confirmed stable expression of 25 known housekeeping genes, and 12 of these were validated with qRT–PCR. RNA-seq data had a linear relationship with qRT–PCR for more than four orders of magnitude and a goodness of fit (R²) of 0.8798. Approximately 10% of the transcripts showed differential expression between the WT and Nrl−/− retina, with a fold change ≥1.5 and p value <0.05. Altered expression of 25 genes was confirmed with qRT–PCR, demonstrating the high degree of sensitivity of the RNA-seq method. Hierarchical clustering of differentially expressed genes uncovered several as yet uncharacterized genes that may contribute to retinal function. Data analysis with BWA and TopHat workflows revealed a significant overlap yet provided complementary insights in transcriptome profiling. 
Conclusions Our study represents the first detailed analysis of retinal transcriptomes, with biologic replicates, generated by RNA-seq technology. The optimized data analysis workflows reported here should provide a framework for comparative investigations of expression profiles. Our results show that NGS offers a comprehensive and more accurate quantitative and qualitative evaluation of mRNA content within a cell or tissue. We conclude that RNA-seq based transcriptome characterization would expedite genetic network analyses and permit the dissection of complex biologic functions. PMID:22162623
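The differential-expression criterion quoted in the abstract above (fold change ≥1.5 and p value <0.05) can be written as a small filter. This is a sketch, not the authors' workflow: the function name `passes_de_filter` is hypothetical, and it assumes the fold change is counted in either direction (up or down) on strictly positive expression levels.

```python
def passes_de_filter(wt_level, ko_level, p_value, min_fold=1.5, alpha=0.05):
    """Return True if a transcript meets both thresholds.

    Assumption for this sketch: fold change is symmetric, so a 1.5-fold
    decrease counts the same as a 1.5-fold increase.
    """
    if wt_level <= 0 or ko_level <= 0:
        raise ValueError("expression levels must be positive")
    fold = max(wt_level, ko_level) / min(wt_level, ko_level)
    return fold >= min_fold and p_value < alpha


# Illustrative values only (not data from the study)
print(passes_de_filter(10.0, 15.0, 0.01))  # meets both thresholds
print(passes_de_filter(10.0, 12.0, 0.01))  # fold change too small
print(passes_de_filter(10.0, 30.0, 0.20))  # p value too large
```

Taking the ratio of the larger to the smaller value keeps the comparison free of logarithms, which avoids rounding surprises exactly at the 1.5-fold boundary.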
The somatic genomic landscape of chromophobe renal cell carcinoma

PubMed Central

Davis, Caleb F.; Ricketts, Christopher; Wang, Min; Yang, Lixing; Cherniack, Andrew D.; Shen, Hui; Buhay, Christian; Kang, Hyojin; Kim, Sang Cheol; Fahey, Catherine C.; Hacker, Kathryn E.; Bhanot, Gyan; Gordenin, Dmitry A.; Chu, Andy; Gunaratne, Preethi H.; Biehl, Michael; Seth, Sahil; Kaipparettu, Benny A.; Bristow, Christopher A.; Donehower, Lawrence A.; Wallen, Eric M.; Smith, Angela B.; Tickoo, Satish K.; Tamboli, Pheroze; Reuter, Victor; Schmidt, Laura S.; Hsieh, James J.; Choueiri, Toni K.; Hakimi, A. Ari; Chin, Lynda; Meyerson, Matthew; Kucherlapati, Raju; Park, Woong-Yang; Robertson, A. Gordon; Laird, Peter W.; Henske, Elizabeth P.; Kwiatkowski, David J.; Park, Peter J.; Morgan, Margaret; Shuch, Brian; Muzny, Donna; Wheeler, David A.; Linehan, W. Marston; Gibbs, Richard A.; Rathmell, W. Kimryn; Creighton, Chad J.

2014-01-01

Summary We describe the landscape of somatic genomic alterations of 66 chromophobe renal cell carcinomas (ChRCCs) based on multidimensional and comprehensive characterization, including mitochondrial DNA (mtDNA) and whole genome sequencing. The results are consistent with ChRCC originating from the distal nephron, in contrast to other kidney cancers with more proximal origins. Combined mtDNA and gene expression analysis implicates changes in mitochondrial function as a component of the disease biology, while suggesting alternative roles for mtDNA mutations in cancers relying on oxidative phosphorylation. Genomic rearrangements lead to recurrent structural breakpoints within the TERT promoter region, which correlates with highly elevated TERT expression and manifestation of kataegis, representing a mechanism of TERT up-regulation in cancer distinct from previously observed amplifications and point mutations. PMID:25155756
The somatic genomic landscape of chromophobe renal cell carcinoma.

PubMed

Davis, Caleb F; Ricketts, Christopher J; Wang, Min; Yang, Lixing; Cherniack, Andrew D; Shen, Hui; Buhay, Christian; Kang, Hyojin; Kim, Sang Cheol; Fahey, Catherine C; Hacker, Kathryn E; Bhanot, Gyan; Gordenin, Dmitry A; Chu, Andy; Gunaratne, Preethi H; Biehl, Michael; Seth, Sahil; Kaipparettu, Benny A; Bristow, Christopher A; Donehower, Lawrence A; Wallen, Eric M; Smith, Angela B; Tickoo, Satish K; Tamboli, Pheroze; Reuter, Victor; Schmidt, Laura S; Hsieh, James J; Choueiri, Toni K; Hakimi, A Ari; Chin, Lynda; Meyerson, Matthew; Kucherlapati, Raju; Park, Woong-Yang; Robertson, A Gordon; Laird, Peter W; Henske, Elizabeth P; Kwiatkowski, David J; Park, Peter J; Morgan, Margaret; Shuch, Brian; Muzny, Donna; Wheeler, David A; Linehan, W Marston; Gibbs, Richard A; Rathmell, W Kimryn; Creighton, Chad J

2014-09-08

We describe the landscape of somatic genomic alterations of 66 chromophobe renal cell carcinomas (ChRCCs) on the basis of multidimensional and comprehensive characterization, including mtDNA and whole-genome sequencing. The results are consistent with ChRCC originating from the distal nephron, in contrast to other kidney cancers with more proximal origins. Combined mtDNA and gene expression analysis implicates changes in mitochondrial function as a component of the disease biology, while suggesting alternative roles for mtDNA mutations in cancers relying on oxidative phosphorylation. Genomic rearrangements lead to recurrent structural breakpoints within the TERT promoter region, which correlates with highly elevated TERT expression and manifestation of kataegis, representing a mechanism of TERT upregulation in cancer distinct from previously observed amplifications and point mutations. Copyright © 2014 Elsevier Inc. All rights reserved.

Genome-wide identification of sweet orange (Citrus sinensis) histone modification gene families and their expression analysis during the fruit development and fruit-blue mold infection process.

PubMed

Xu, Jidi; Xu, Haidan; Liu, Yuanlong; Wang, Xia; Xu, Qiang; Deng, Xiuxin

2015-01-01

In eukaryotes, histone acetylation and methylation have been known to be involved in regulating diverse developmental processes and plant defense. These histone modification events are controlled by a series of histone modification gene families. To date, there is no study regarding genome-wide characterization of histone modification related genes in citrus species. Based on the two recently sequenced sweet orange genome databases, a total of 136 CsHMs (Citrus sinensis histone modification genes), including 47 CsHMTs (histone methyltransferase genes), 23 CsHDMs (histone demethylase genes), 50 CsHATs (histone acetyltransferase genes), and 16 CsHDACs (histone deacetylase genes) were identified. These genes were categorized into 11 gene families. A comprehensive analysis of these 11 gene families was performed with chromosome locations, phylogenetic comparison, gene structures, and conserved domain compositions of proteins. In order to gain an insight into the potential roles of these genes in citrus fruit development, 42 CsHMs with high mRNA abundance in fruit tissues were selected to further analyze their expression profiles at six stages of fruit development. Interestingly, a number of genes were highly expressed in the flesh of ripening fruit, and some of them showed increasing expression levels as the fruit developed. Furthermore, we analyzed the expression patterns of all 136 CsHMs in response to infection by blue mold (Penicillium digitatum), which is the most devastating pathogen in the citrus post-harvest process. The results indicated that 20 of them showed strong alterations in their expression levels during the fruit-pathogen infection. In conclusion, this study presents a comprehensive analysis of the histone modification gene families in sweet orange and further elucidates their behaviors during fruit development and the blue mold infection responses.
Gene expression profile in cerebrum in the filial imprinting of domestic chicks (Gallus gallus domesticus).

PubMed

Yamaguchi, Shinji; Fujii-Taira, Ikuko; Katagiri, Sachiko; Izawa, Ei-Ichi; Fujimoto, Yasuyuki; Takeuchi, Hideaki; Takano, Tatsuya; Matsushima, Toshiya; Homma, Koichi J

2008-06-15

In newly hatched chicks, gene expression in the brain has previously been shown to be up-regulated following filial imprinting. By applying cDNA microarrays containing 13,007 expressed sequence tags, we examined the comprehensive gene expression profiling of the intermediate medial mesopallium in the chick cerebrum, which has been shown to play a key role in filial imprinting. We found 52 up-regulated genes and 6 down-regulated genes with at least 2.0-fold changes 3 h after the training of filial imprinting, compared to the gene expression of the dark-reared chick brain. The up-regulated genes are known to be involved in a variety of pathways, including signal transduction, cytoskeletal organization, nuclear function, cell metabolism, RNA binding, endoplasmic reticulum or Golgi function, synaptic function, ion channel, and transporter. In contrast, fewer genes were down-regulated in the imprinting, coinciding with the previous data that the total RNA synthesis increased in association with filial imprinting. Our data suggest that filial imprinting involves the modulation of multiple signaling pathways.
Whole transcriptome analysis of the fasting and fed Burmese python heart: insights into extreme physiological cardiac adaptation.

PubMed

Wall, Christopher E; Cozza, Steven; Riquelme, Cecilia A; McCombie, W Richard; Heimiller, Joseph K; Marr, Thomas G; Leinwand, Leslie A

2011-01-01

The infrequently feeding Burmese python (Python molurus) experiences significant and rapid postprandial cardiac hypertrophy followed by regression as digestion is completed. To begin to explore the molecular mechanisms of this response, we have sequenced and assembled the fasted and postfed Burmese python heart transcriptomes with Illumina technology using the chicken (Gallus gallus) genome as a reference. In addition, we have used RNA-seq analysis to identify differences in the expression of biological processes and signaling pathways between fasted, 1 day postfed (DPF), and 3 DPF hearts. Out of a combined transcriptome of ∼2,800 mRNAs, 464 genes were differentially expressed. Genes showing differential expression at 1 DPF compared with fasted were enriched for biological processes involved in metabolism and energetics, while genes showing differential expression at 3 DPF compared with fasted were enriched for processes involved in biogenesis, structural remodeling, and organization. Moreover, we present evidence for the activation of physiological and not pathological signaling pathways in this rapid, novel model of cardiac growth in pythons. Together, our data provide the first comprehensive gene expression profile for a reptile heart.
Comprehensive analysis of GASA family members in the Malus domestica genome: identification, characterization, and their expressions in response to apple flower induction.

PubMed

Fan, Sheng; Zhang, Dong; Zhang, Lizhi; Gao, Cai; Xin, Mingzhi; Tahir, Muhammad Mobeen; Li, Youmei; Ma, Juanjuan; Han, Mingyu

2017-10-27

The plant-specific gibberellic acid stimulated Arabidopsis (GASA) gene family is critical for plant development. However, little is known about these genes, particularly in fruit tree species. We identified 15 putative Arabidopsis thaliana GASA (AtGASA) and 26 apple GASA (MdGASA) genes. The identified genes were then characterized (e.g., chromosomal location, structure, and evolutionary relationships). All of the identified A. thaliana and apple GASA proteins included a conserved GASA domain and exhibited similar characteristics. Specifically, the MdGASA expression levels in various tissues and organs were analyzed based on an online gene expression profile and by qRT-PCR. These genes were more highly expressed in the leaves, buds, and fruits compared with the seeds, roots, and seedlings. MdGASA genes were also responsive to gibberellic acid (GA3) and abscisic acid treatments. Additionally, transcriptome sequencing results revealed seven potential flowering-related MdGASA genes. We analyzed the expression levels of these genes in response to flowering-related treatments (GA3, 6-benzylaminopurine, and sugar) and in apple varieties that differed in terms of flowering ('Nagafu No. 2' and 'Yanfu No. 6') during the flower induction period. These candidate MdGASA genes exhibited diverse expression patterns. The expression levels of six MdGASA genes were inhibited by GA3, while the expression of one gene was up-regulated. Additionally, there were expression-level differences induced by the 6-benzylaminopurine and sugar treatments during the flower induction stage, as well as in the different flowering varieties. This study represents the first comprehensive investigation of the A. thaliana and apple GASA gene families. Our data may provide useful clues for future studies and may support the hypotheses regarding the role of GASA proteins during the flower induction stage in fruit tree species.
Transcriptional profiles of bovine in vivo pre-implantation development.

PubMed

Jiang, Zongliang; Sun, Jiangwen; Dong, Hong; Luo, Oscar; Zheng, Xinbao; Obergfell, Craig; Tang, Yong; Bi, Jinbo; O'Neill, Rachel; Ruan, Yijun; Chen, Jingbo; Tian, Xiuchun Cindy

2014-09-04

During mammalian pre-implantation embryonic development dramatic and orchestrated changes occur in gene transcription. The identification of the complete changes has not been possible until the development of the Next Generation Sequencing Technology. Here we report comprehensive transcriptome dynamics of single matured bovine oocytes and pre-implantation embryos developed in vivo. Surprisingly, more than half of the estimated 22,000 bovine genes, 11,488 to 12,729 involved in more than 100 pathways, are expressed in oocytes and early embryos. Despite the similarity in the total numbers of genes expressed across stages, the nature of the expressed genes is dramatically different. A total of 2,845 genes were differentially expressed among different stages, of which the largest change was observed between the 4- and 8-cell stages, demonstrating that the bovine embryonic genome is activated at this transition. Additionally, 774 genes were identified as only expressed/highly enriched in particular stages of development, suggesting their stage-specific roles in embryogenesis. Using weighted gene co-expression network analysis, we found 12 stage-specific modules of co-expressed genes that can be used to represent the corresponding stage of development. Furthermore, we identified conserved key members (or hub genes) of the bovine expressed gene networks. Their vast association with other embryonic genes suggests that they may have important regulatory roles in embryo development; yet, the majority of the hub genes are relatively unknown/under-studied in embryos. We also conducted the first comparison of embryonic expression profiles across three mammalian species, human, mouse and bovine, for which RNA-seq data are available. We found that the three species share more maternally deposited genes than embryonic genome activated genes. 
More importantly, there are more similarities in embryonic transcriptomes between bovine and humans than between humans and mice, demonstrating that bovine embryos are better models for human embryonic development. This study provides a comprehensive examination of gene activities in bovine embryos and identified little-known potential master regulators of pre-implantation development.
A survey of human brain transcriptome diversity at the single cell level.

PubMed

Darmanis, Spyros; Sloan, Steven A; Zhang, Ye; Enge, Martin; Caneda, Christine; Shuer, Lawrence M; Hayden Gephart, Melanie G; Barres, Ben A; Quake, Stephen R

2015-06-09

The human brain is a tissue of vast complexity in terms of the cell types it comprises. Conventional approaches to classifying cell types in the human brain at single cell resolution have been limited to exploring relatively few markers and therefore have provided a limited molecular characterization of any given cell type. We used single cell RNA sequencing on 466 cells to capture the cellular complexity of the adult and fetal human brain at a whole transcriptome level. Healthy adult temporal lobe tissue was obtained during surgical procedures where otherwise normal tissue was removed to gain access to deeper hippocampal pathology in patients with medical refractory seizures. We were able to classify individual cells into all of the major neuronal, glial, and vascular cell types in the brain. We were able to divide neurons into individual communities and show that these communities preserve the categorization of interneuron subtypes that is typically observed with the use of classic interneuron markers. We then used single cell RNA sequencing on fetal human cortical neurons to identify genes that are differentially expressed between fetal and adult neurons and those genes that display an expression gradient that reflects the transition between replicating and quiescent fetal neuronal populations. Finally, we observed the expression of major histocompatibility complex type I genes in a subset of adult neurons, but not fetal neurons. The work presented here demonstrates the applicability of single cell RNA sequencing on the study of the adult human brain and constitutes a first step toward a comprehensive cellular atlas of the human brain.
Deep sequencing of small RNA repertoires in mice reveals metabolic disorders-associated hepatic miRNAs.

PubMed

Liang, Tingming; Liu, Chang; Ye, Zhenchao

2013-01-01

Obesity and associated metabolic disorders contribute importantly to the metabolic syndrome. On the other hand, microRNAs (miRNAs) are a class of small non-coding RNAs that repress target gene expression by inducing mRNA degradation and/or translation repression. Dysregulation of specific miRNAs in obesity may influence energy metabolism and cause insulin resistance, which leads to dyslipidemia, steatosis hepatis and type 2 diabetes. In the present study, we comprehensively analyzed and validated dysregulated miRNAs in ob/ob mouse liver, as well as miRNA groups based on miRNA gene cluster and gene family by using deep sequencing miRNA datasets. We found that over 13.8% of the total analyzed miRNAs were dysregulated, of which 37 miRNA species showed significantly differential expression. Further RT-qPCR analysis in some selected miRNAs validated the similar expression patterns observed in deep sequencing. Interestingly, we found that miRNA gene clusters and families always showed consistent dysregulation patterns in ob/ob mouse liver, although they had various enrichment levels. Functional enrichment analysis revealed the versatile physiological roles (over six signal pathways and five human diseases) of these miRNAs. Biological studies indicated that overexpression of miR-126 or inhibition of miR-24 in AML-12 cells attenuated free fatty acids-induced fat accumulation. Taken together, our data strongly suggest that obesity and metabolic disturbance are tightly associated with functional miRNAs. We also identified hepatic miRNA candidates serving as potential biomarkers for the diagnosis of the metabolic syndrome.
More Than Words: The Role of Multiword Sequences in Language Learning and Use.

PubMed

Christiansen, Morten H; Arnon, Inbal

2017-07-01

The ability to convey our thoughts using an infinite number of linguistic expressions is one of the hallmarks of human language. Understanding the nature of the psychological mechanisms and representations that give rise to this unique productivity is a fundamental goal for the cognitive sciences. A long-standing hypothesis is that single words and rules form the basic building blocks of linguistic productivity, with multiword sequences being treated as units only in peripheral cases such as idioms. The new millennium, however, has seen a shift toward construing multiword linguistic units not as linguistic rarities, but as important building blocks for language acquisition and processing. This shift, which originated within theoretical approaches that emphasize language learning and use, has far-reaching implications for theories of language representation, processing, and acquisition. Incorporating multiword units as integral building blocks blurs the distinction between grammar and lexicon; calls for models of production and comprehension that can accommodate and give rise to the effect of multiword information on processing; and highlights the importance of such units to learning. In this special topic, we bring together cutting-edge work on multiword sequences in theoretical linguistics, first-language acquisition, psycholinguistics, computational modeling, and second-language learning to present a comprehensive overview of the prominence and importance of such units in language, their possible role in explaining differences between first- and second-language learning, and the challenges the combined findings pose for theories of language. Copyright © 2017 Cognitive Science Society, Inc.
De Novo Foliar Transcriptome of Chenopodium amaranticolor and Analysis of Its Gene Expression During Virus-Induced Hypersensitive Response

PubMed Central

Zhang, Yongqiang; Pei, Xinwu; Zhang, Chao; Lu, Zifeng; Wang, Zhixing; Jia, Shirong; Li, Weimin

2012-01-01

Background The hypersensitive response (HR) system of Chenopodium spp. confers broad-spectrum virus resistance. However, little knowledge exists at the genomic level for Chenopodium, thus impeding the advanced molecular research of this attractive feature. Hence, we took advantage of RNA-seq to survey the foliar transcriptome of C. amaranticolor, a Chenopodium species widely used as a laboratory indicator for pathogenic viruses, in order to facilitate the characterization of the HR-type of virus resistance. Methodology and Principal Findings Using the Illumina HiSeq™ 2000 platform, we obtained 39,868,984 reads with 3,588,208,560 bp, which were assembled into 112,452 unigenes (3,847 clusters and 108,605 singletons). BlastX search against the NCBI NR database identified 61,698 sequences with an E-value cut-off of 10^-5. Assembled sequences were annotated with gene descriptions, GO, COG and KEGG terms, respectively. A total of 738 resistance gene analogs (RGAs) and homologous sequences of 6 key signaling proteins within the R proteins-directed signaling pathway were identified. Based on this transcriptome data, we investigated the gene expression profiles over the stage of HR induced by Tobacco mosaic virus and Cucumber mosaic virus by using digital gene expression analysis. Numerous candidate genes specifically or commonly regulated by these two distinct viruses at early and late stages of the HR were identified, and the dynamic changes of the differentially expressed genes enriched in the pathway of plant-pathogen interaction were particularly emphasized. Conclusions To our knowledge, this study is the first description of the genetic makeup of C. amaranticolor, providing deep insight into the comprehensive gene expression information at transcriptional level in this species. The 738 RGAs as well as the differentially regulated genes, particularly the common genes regulated by both TMV and CMV, are suitable candidates which merit further functional characterization to dissect the molecular mechanisms and regulatory pathways of the HR-type of virus resistance in Chenopodium. PMID:23029338
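The sequencing totals quoted in the C. amaranticolor abstract above can be cross-checked with simple arithmetic. The short Python sketch below is illustrative only: the variable names are hypothetical, and the only inputs are the figures stated in the abstract. It derives the mean read length, confirms that clusters plus singletons equal the unigene count, and estimates the share of unigenes with a BlastX hit.

```python
# Figures copied from the abstract; the variable names are illustrative.
reads = 39_868_984          # sequencing reads obtained
total_bp = 3_588_208_560    # total bases across all reads
clusters = 3_847
singletons = 108_605
unigenes = 112_452
annotated = 61_698          # sequences with a BlastX hit in NCBI NR

# Mean read length in bases, derived from the two totals.
mean_read_length = total_bp / reads

# Clusters and singletons should add up to the unigene count.
cluster_sum = clusters + singletons

# Share of unigenes that received a BlastX hit.
annotated_fraction = annotated / unigenes

print(f"mean read length: {mean_read_length:.1f} bp")
print(f"clusters + singletons: {cluster_sum:,} (reported unigenes: {unigenes:,})")
print(f"fraction with a BlastX hit: {annotated_fraction:.1%}")
```

A check of this kind is a quick way to spot transcription errors when figures are copied between an abstract and a results table.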
Deep RNA sequencing reveals dynamic regulation of myocardial noncoding RNAs in failing human heart and remodeling with mechanical circulatory support.

PubMed

Yang, Kai-Chien; Yamada, Kathryn A; Patel, Akshar Y; Topkara, Veli K; George, Isaac; Cheema, Faisal H; Ewald, Gregory A; Mann, Douglas L; Nerbonne, Jeanne M

2014-03-04

Microarrays have been used extensively to profile transcriptome remodeling in failing human heart, although the genomic coverage provided is limited and fails to provide a detailed picture of the myocardial transcriptome landscape. Here, we describe sequencing-based transcriptome profiling, providing comprehensive analysis of myocardial mRNA, microRNA (miRNA), and long noncoding RNA (lncRNA) expression in failing human heart before and after mechanical support with a left ventricular (LV) assist device (LVAD). Deep sequencing of RNA isolated from paired nonischemic (NICM; n=8) and ischemic (ICM; n=8) human failing LV samples collected before and after LVAD and from nonfailing human LV (n=8) was conducted. These analyses revealed high abundance of mRNA (37%) and lncRNA (71%) of mitochondrial origin. miRNASeq revealed 160 and 147 differentially expressed miRNAs in ICM and NICM, respectively, compared with nonfailing LV. Among these, only 2 (ICM) and 5 (NICM) miRNAs are normalized with LVAD. RNASeq detected 18 480, including 113 novel, lncRNAs in human LV. Among the 679 (ICM) and 570 (NICM) lncRNAs differentially expressed with heart failure, ≈10% are improved or normalized with LVAD. In addition, the expression signature of lncRNAs, but not miRNAs or mRNAs, distinguishes ICM from NICM. Further analysis suggests that cis-gene regulation represents a major mechanism of action of human cardiac lncRNAs. The myocardial transcriptome is dynamically regulated in advanced heart failure and after LVAD support. The expression profiles of lncRNAs, but not mRNAs or miRNAs, can discriminate failing hearts of different pathologies and are markedly altered in response to LVAD support. These results suggest an important role for lncRNAs in the pathogenesis of heart failure and in reverse remodeling observed with mechanical support.
Comprehensive analysis of coding-lncRNA gene co-expression network uncovers conserved functional lncRNAs in zebrafish.

PubMed

Chen, Wen; Zhang, Xuan; Li, Jing; Huang, Shulan; Xiang, Shuanglin; Hu, Xiang; Liu, Changning

2018-05-09

Zebrafish is a fully developed model system for studying developmental processes and human disease. Recent deep sequencing studies have discovered a large number of long non-coding RNAs (lncRNAs) in zebrafish. However, only a few of them have been functionally characterized. How to take advantage of the mature zebrafish system to investigate lncRNA function and conservation in depth is therefore an intriguing question. We systematically collected and analyzed a series of zebrafish RNA-seq data, then combined them with resources from known databases and the literature. As a result, we obtained by far the most complete dataset of zebrafish lncRNAs, containing 13,604 lncRNA genes (21,128 transcripts) in total. Based on that, a co-expression network of zebrafish coding and lncRNA genes was constructed and analyzed, and used to predict the Gene Ontology (GO) and KEGG annotation of lncRNAs. Meanwhile, we performed a conservation analysis of zebrafish lncRNAs, identifying 1828 conserved zebrafish lncRNA genes (1890 transcripts) that have putative mammalian orthologs. We also found that zebrafish lncRNAs play important roles in regulating the development and function of the nervous system; these conserved lncRNAs show significant sequence and functional conservation with their mammalian counterparts. By integrative data analysis and construction of a coding-lncRNA gene co-expression network, we obtained the most comprehensive dataset of zebrafish lncRNAs to date, together with systematic annotations and comprehensive analyses of function and conservation. Our study provides a reliable zebrafish-based platform to explore lncRNA function and mechanism in depth, as well as the lncRNA commonality between zebrafish and human.
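The abstract above does not say how the coding-lncRNA co-expression network was built or how annotations were transferred. As a rough illustration of the general idea, the following Python sketch assumes a Pearson-correlation network and a simple "guilt by association" vote. The gene names, expression values, threshold and function names are all invented for the example and are not taken from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def coexpression_edges(profiles, threshold=0.9):
    """Return (gene_a, gene_b, r) for every gene pair with |r| >= threshold."""
    genes = sorted(profiles)
    edges = []
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if abs(r) >= threshold:
                edges.append((a, b, r))
    return edges

def predict_terms(lncrna, edges, annotations):
    """Rank annotation terms by how many co-expressed neighbours carry them."""
    votes = {}
    for a, b, _ in edges:
        if lncrna in (a, b):
            neighbour = b if a == lncrna else a
            for term in annotations.get(neighbour, ()):
                votes[term] = votes.get(term, 0) + 1
    return sorted(votes, key=lambda term: (-votes[term], term))

# Toy expression profiles across four samples (hypothetical values).
profiles = {
    "coding1": [1.0, 2.0, 3.0, 4.0],
    "coding2": [4.0, 1.0, 3.0, 2.0],
    "lnc1": [2.1, 3.9, 6.2, 8.0],   # rises and falls with coding1
}
# Hypothetical annotations for the coding genes only.
annotations = {"coding1": ["GO:0007399"], "coding2": ["GO:0006412"]}

edges = coexpression_edges(profiles)
predicted = predict_terms("lnc1", edges, annotations)
print(edges)
print(predicted)
```

In practice a correlation threshold would be chosen with multiple-testing control and many more samples; the fixed cut-off here only keeps the example short.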
De novo characterization of the gene-rich transcriptomes of two color-polymorphic spiders, Theridion grallator and T. californicum (Araneae: Theridiidae), with special reference to pigment genes.

PubMed

Croucher, Peter J P; Brewer, Michael S; Winchell, Christopher J; Oxford, Geoff S; Gillespie, Rosemary G

2013-12-08

A number of spider species within the family Theridiidae exhibit a dramatic abdominal (opisthosomal) color polymorphism. The polymorphism is inherited in a broadly Mendelian fashion and in some species consists of dozens of discrete morphs that are convergent across taxa and populations. Few genomic resources exist for spiders. Here, as a first necessary step towards identifying the genetic basis for this trait we present the near complete transcriptomes of two species: the Hawaiian happy-face spider Theridion grallator and Theridion californicum. We mined the gene complement for pigment-pathway genes and examined differential expression (DE) between morphs that are unpatterned (plain yellow) and patterned (yellow with superimposed patches of red, white or very dark brown). By deep sequencing both RNA-seq and normalized cDNA libraries from pooled specimens of each species we were able to assemble a comprehensive gene set for both species that we estimate to be 98-99% complete. It is likely that these species express more than 20,000 protein-coding genes, perhaps 4.5% (ca. 870) of which might be unique to spiders. Mining for pigment-associated Drosophila melanogaster genes indicated the presence of all ommochrome pathway genes and most pteridine pathway genes and DE analyses further indicate a possible role for the pteridine pathway in theridiid color patterning. Based upon our estimates, T. grallator and T. californicum express a large inventory of protein-coding genes. Our comprehensive assembly illustrates the continuing value of sequencing normalized cDNA libraries in addition to RNA-seq in order to generate a reference transcriptome for non-model species. The identification of pteridine-related genes and their possible involvement in color patterning is a novel finding in spiders and one that suggests a biochemical link between guanine deposits and the pigments exhibited by these species.
The Histone Database: an integrated resource for histones and histone fold-containing proteins

PubMed Central

Mariño-Ramírez, Leonardo; Levine, Kevin M.; Morales, Mario; Zhang, Suiyuan; Moreland, R. Travis; Baxevanis, Andreas D.; Landsman, David

2011-01-01

Eukaryotic chromatin is composed of DNA and protein components—core histones—that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins. Database URL: The Histone Sequence Database is freely available and can be accessed at http://research.nhgri.nih.gov/histones/. PMID:22025671
TARGET Research Goals

Cancer.gov

TARGET researchers use various sequencing and array-based methods to examine the genomes, transcriptomes, and for some diseases epigenomes of select childhood cancers. This “multi-omic” approach generates a comprehensive profile of molecular alterations for each cancer type. Alterations are changes in DNA or RNA, such as rearrangements in chromosome structure or variations in gene expression, respectively. Through computational analyses and assays to validate biological function, TARGET researchers predict which alterations disrupt the function of a gene or pathway and promote cancer growth, progression, and/or survival. Researchers identify candidate therapeutic targets and/or prognostic markers from the cancer-associated alterations.
Comparative immunogenomics of molluscs.

PubMed

Schultz, Jonathan H; Adema, Coen M

2017-10-01

Comparative immunology, studying both vertebrates and invertebrates, provided the earliest descriptions of phagocytosis as a general immune mechanism. However, the large scale of animal diversity challenges all-inclusive investigations and the field of immunology has developed by mostly emphasizing study of a few vertebrate species. In addressing the lack of comprehensive understanding of animal immunity, especially that of invertebrates, comparative immunology helps toward management of invertebrates that are food sources, agricultural pests, pathogens, or transmit diseases, and helps interpret the evolution of animal immunity. Initial studies showed that the Mollusca (second largest animal phylum), and invertebrates in general, possess innate defenses but lack the lymphocytic immune system that characterizes vertebrate immunology. Recognizing the reality of both common and taxon-specific immune features, and applying up-to-date cell and molecular research capabilities, in-depth studies of a select number of bivalve and gastropod species continue to reveal novel aspects of molluscan immunity. The genomics era heralded a new stage of comparative immunology; large-scale efforts yielded an initial set of full molluscan genome sequences that is available for analyses of full complements of immune genes and regulatory sequences. Next-generation sequencing (NGS), due to lower cost and effort required, allows individual researchers to generate large sequence datasets for growing numbers of molluscs. RNAseq provides expression profiles that enable discovery of immune genes and genome sequences reveal distribution and diversity of immune factors across molluscan phylogeny. Although computational de novo sequence assembly will benefit from continued development and automated annotation may require some experimental validation, NGS is a powerful tool for comparative immunology, especially increasing coverage of the extensive molluscan diversity. To date, immunogenomics has revealed new levels of complexity of molluscan defense by indicating sequence heterogeneity in individual snails and bivalves and by showing that members of expanded immune gene families are expressed differentially to generate pathogen-specific defense responses. Copyright © 2017 Elsevier Ltd. All rights reserved.
Whole genome sequencing of elite rice cultivars as a comprehensive information resource for marker assisted selection

USDA-ARS's Scientific Manuscript database

Current advances in sequencing technologies and bioinformatics make it possible to determine a nearly complete genomic background of rice, a staple food for the poor. Consequently, comprehensive databases of variation among thousands of varieties are currently being assembled and released. Proper analysi...
Comparative Transcriptome Analysis of the Accessory Sex Gland and Testis from the Chinese Mitten Crab (Eriocheir sinensis)

PubMed Central

He, Lin; Jiang, Hui; Cao, Dandan; Liu, Lihua; Hu, Songnian; Wang, Qun

2013-01-01

The accessory sex gland (ASG) is an important component of the male reproductive system, which functions to enhance the fertility of spermatozoa during male reproduction. Certain proteins secreted by the ASG are known to bind to the spermatozoa membrane and affect its function. The ASG gene expression profile in Chinese mitten crab (Eriocheir sinensis) has not been extensively studied, and limited genetic research has been conducted on this species. The advent of high-throughput sequencing technologies enables the generation of genomic resources within a short period of time and at minimal cost. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for the ASG of E. sinensis using Illumina sequencing technology. This analysis yielded a total of 33,221,284 sequencing reads, including 2.6 Gb of total nucleotides. Reads were assembled into 85,913 contigs (average 218 bp), or 58,567 scaffold sequences (average 292 bp), that identified 37,955 unigenes (average 385 bp). We assembled all unigenes and compared them with the published testis transcriptome from E. sinensis. In order to identify which genes may be involved in ASG function, as it pertains to modification of spermatozoa, we compared the ASG and testis transcriptome of E. sinensis. Our analysis identified specific genes with both higher and lower tissue expression levels in the two tissues, and the functions of these genes were analyzed to elucidate their potential roles during maturation of spermatozoa. Availability of detailed transcriptome data from ASG and testis in E. sinensis can assist our understanding of the molecular mechanisms involved with spermatozoa conservation, transport, maturation and capacitation and potentially acrosome activation. PMID:23342039
Poly A- Transcripts Expressed in HeLa Cells

PubMed Central

Lu, Jian; Xuan, Zhenyu; Chen, Jun; Zheng, Yonglan; Zhou, Tom; Zhang, Michael Q.; Wu, Chung-I; Wang, San Ming

2008-01-01

Background Transcripts expressed in eukaryotes are classified as poly A+ transcripts or poly A- transcripts based on the presence or absence of the 3′ poly A tail. Most transcripts identified so far are poly A+ transcripts, whereas the poly A- transcripts remain largely unknown. Methodology/Principal Findings We developed the TRD (Total RNA Detection) system for transcript identification. The system detects the transcripts through the following steps: 1) depleting the abundant ribosomal and small-size transcripts; 2) synthesizing cDNA without regard to the status of the 3′ poly A tail; 3) applying the 454 sequencing technology for massive 3′ EST collection from the cDNA; and 4) determining the genome origins of the detected transcripts by mapping the sequences to the human genome reference sequences. Using this system, we characterized the cytoplasmic transcripts from HeLa cells. Of the 13,467 distinct 3′ ESTs analyzed, 24% are poly A-, 36% are poly A+, and 40% are bimorphic with poly A+ features but without the 3′ poly A tail. Most of the poly A- 3′ ESTs do not match known transcript sequences; they have a similar distribution pattern in the genome as the poly A+ and bimorphic 3′ ESTs, and their mapped intergenic regions are evolutionarily conserved. Experiments confirmed the authenticity of the detected poly A- transcripts. Conclusion/Significance Our study provides the first large-scale sequence evidence for the presence of poly A- transcripts in eukaryotes. The abundance of the poly A- transcripts highlights the need for comprehensive identification of these transcripts for decoding the transcriptome, annotating the genome and studying biological relevance of the poly A- transcripts. PMID:18665230
Genome-wide identification and characterization of NB-ARC resistant genes in wheat (Triticum aestivum L.) and their expression during leaf rust infection.

PubMed

Chandra, Saket; Kazmi, Andaleeb Z; Ahmed, Zainab; Roychowdhury, Gargi; Kumari, Veena; Kumar, Manish; Mukhopadhyay, Kunal

2017-07-01

NB-ARC domain-containing resistance genes from the wheat genome were identified, characterized and localized on chromosome arms that displayed a differential yet positive response during incompatible and compatible leaf rust interactions. Wheat (Triticum aestivum L.) is an important cereal crop; however, its production is affected severely by numerous diseases including rusts. An efficient, cost-effective and ecologically viable approach to control pathogens is through host resistance. In wheat, high numbers of resistance loci are present but only a few have been identified and cloned. A comprehensive analysis of the NB-ARC-containing genes in the complete wheat genome was accomplished in this study. Complete NB-ARC encoding genes were mined from the Ensembl Plants database to predict 604 NB-ARC containing sequences using the HMM approach. Genome-wide analysis of orthologous clusters in the NB-ARC-containing sequences of wheat and other members of the Poaceae family revealed maximum homology with Oryza sativa indica and Brachypodium distachyon. The identification of overlap between orthologous clusters enabled the elucidation of the function and evolution of resistance proteins. The distributions of the NB-ARC domain-containing sequences were found to be balanced among the three wheat sub-genomes. Wheat chromosome arms 4AL and 7BL had the most NB-ARC domain-containing contigs. The spatio-temporal expression profiling studies exemplified the positive role of these genes in resistant and susceptible wheat plants during incompatible and compatible interaction in response to the leaf rust pathogen Puccinia triticina. Two NB-ARC domain-containing sequences were modelled in silico, cloned and sequenced to analyze their fine structures. The data obtained in this study will augment isolation, characterization and application of NB-ARC resistance genes in marker-assisted selection based breeding programs for improving rust resistance in wheat.
Generation of Mast Cells from Mouse Fetus: Analysis of Differentiation and Functionality, and Transcriptome Profiling Using Next Generation Sequencer

PubMed Central

Fukuishi, Nobuyuki; Igawa, Yuusuke; Kunimi, Tomoyo; Hamano, Hirofumi; Toyota, Masao; Takahashi, Hironobu; Kenmoku, Hiromichi; Yagi, Yasuyuki; Matsui, Nobuaki; Akagi, Masaaki

2013-01-01

While gene knockout technology can reveal the roles of proteins in cellular functions, including in mast cells, fetal death due to gene manipulation frequently interrupts experimental analysis. We generated mast cells from mouse fetal liver (FLMC), and compared the fundamental functions of FLMC with those of bone marrow-derived mouse mast cells (BMMC). Under electron microscopy, numerous small and electron-dense granules were observed in FLMC. In FLMC, the expression levels of a subunit of the FcεRI receptor and degranulation by IgE cross-linking were comparable with BMMC. By flow cytometry we observed surface expression of c-Kit prior to that of FcεRI on FLMC, although on BMMC the expression of c-Kit came after FcεRI. The surface expression levels of Sca-1 and c-Kit, a marker of putative mast cell precursors, were slightly different between bone marrow cells and fetal liver cells, suggesting that differentiation stage or cell type are not necessarily equivalent between both lineages. Moreover, this indicates that phenotypically similar mast cells may not have undergone an identical process of differentiation. By comprehensive analysis using the next generation sequencer, the same frequency of gene expression was observed for 98.6% of all transcripts in both cell types. These results indicate that FLMC could represent a new and useful tool for exploring mast cell differentiation, and may help to elucidate the roles of individual proteins in the function of mast cells where gene manipulation can induce embryonic lethality in the mid to late stages of pregnancy. PMID:23573287

ESTree db: a Tool for Peach Functional Genomics

PubMed Central

Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo

2005-01-01

Background The ESTree db represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A PHP-based web interface was developed to query the database. Results The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. Conclusion The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig. PMID:16351742
ESTree db: a tool for peach functional genomics.

PubMed

Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo

2005-12-01

The ESTree db http://www.itb.cnr.it/estree/ represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A PHP-based web interface was developed to query the database. The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig.
Comprehensive mutagenesis of the fimS promoter regulatory switch reveals novel regulation of type 1 pili in uropathogenic Escherichia coli

PubMed Central

Zhang, Huibin; Susanto, Teodorus T.; Wan, Yue

2016-01-01

Type 1 pili (T1P) are major virulence factors for uropathogenic Escherichia coli (UPEC), which cause both acute and recurrent urinary tract infections. T1P expression therefore is of direct relevance for disease. T1P are phase variable (both piliated and nonpiliated bacteria exist in a clonal population) and are controlled by an invertible DNA switch (fimS), which contains the promoter for the fim operon encoding T1P. Inversion of fimS is stochastic but may be biased by environmental conditions and other signals that ultimately converge at fimS itself. Previous studies of fimS sequences important for T1P phase variation have focused on laboratory-adapted E. coli strains and have been limited in the number of mutations or by alteration of the fimS genomic context. We surmounted these limitations by using saturating genomic mutagenesis of fimS coupled with accurate sequencing to detect both mutations and phase status simultaneously. In addition to the sequences known to be important for biasing fimS inversion, our method also identifies a previously unknown pair of 5′ UTR inverted repeats that act by altering the relative fimA levels to control phase variation. Thus we have uncovered an additional layer of T1P regulation potentially impacting virulence and the coordinate expression of multiple pilus systems. PMID:27035967
Comprehensive mutagenesis of the fimS promoter regulatory switch reveals novel regulation of type 1 pili in uropathogenic Escherichia coli.

PubMed

Zhang, Huibin; Susanto, Teodorus T; Wan, Yue; Chen, Swaine L

2016-04-12

Type 1 pili (T1P) are major virulence factors for uropathogenic Escherichia coli (UPEC), which cause both acute and recurrent urinary tract infections. T1P expression therefore is of direct relevance for disease. T1P are phase variable (both piliated and nonpiliated bacteria exist in a clonal population) and are controlled by an invertible DNA switch (fimS), which contains the promoter for the fim operon encoding T1P. Inversion of fimS is stochastic but may be biased by environmental conditions and other signals that ultimately converge at fimS itself. Previous studies of fimS sequences important for T1P phase variation have focused on laboratory-adapted E. coli strains and have been limited in the number of mutations or by alteration of the fimS genomic context. We surmounted these limitations by using saturating genomic mutagenesis of fimS coupled with accurate sequencing to detect both mutations and phase status simultaneously. In addition to the sequences known to be important for biasing fimS inversion, our method also identifies a previously unknown pair of 5' UTR inverted repeats that act by altering the relative fimA levels to control phase variation. Thus we have uncovered an additional layer of T1P regulation potentially impacting virulence and the coordinate expression of multiple pilus systems.
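The fimS abstracts above describe reading out mutations and phase status from the same sequencing read, without giving the procedure. The Python sketch below is one way the idea could work, not the authors' method: a read spanning the switch is compared with a reference in each orientation, the closer orientation gives the phase, and the mismatches against it give the mutations. The sequences are invented toy strings, the function names are hypothetical, and only substitutions are handled.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def call_phase_and_mutations(read, left_flank, switch_on, right_flank):
    """Return (phase, mismatches) for a read spanning an invertible switch.

    The read is compared with the reference in both switch orientations;
    the orientation with fewer mismatches is reported as the phase, and the
    mismatches are listed as (position, reference_base, read_base).
    Assumes the read has the same length as the reference (no indels).
    """
    references = {
        "ON": left_flank + switch_on + right_flank,
        "OFF": left_flank + revcomp(switch_on) + right_flank,
    }
    best = None
    for phase, ref in references.items():
        if len(ref) != len(read):
            continue  # this sketch only handles substitutions
        mismatches = [
            (i, ref[i], read[i]) for i in range(len(ref)) if ref[i] != read[i]
        ]
        if best is None or len(mismatches) < len(best[1]):
            best = (phase, mismatches)
    return best

# Toy example: invented flanks and switch, not real fimS sequence.
left_flank, switch_on, right_flank = "GGAT", "AACCGTTTAC", "CTGA"
# A read in the inverted orientation carrying one substitution.
read = "GGATGTAACCGGTTCTGA"

result = call_phase_and_mutations(read, left_flank, switch_on, right_flank)
print(result)
```

Real reads would need alignment that tolerates indels and sequencing errors; the fixed-length comparison only shows how orientation and mutations can be read from one molecule.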
Long Noncoding RNA MEG3 Is an Epigenetic Determinant of Oncogenic Signaling in Functional Pancreatic Neuroendocrine Tumor Cells

PubMed Central

Iyer, Sucharitha; Modali, Sita D.

2017-01-01

ABSTRACT The long noncoding RNA (lncRNA) MEG3 is significantly downregulated in pancreatic neuroendocrine tumors (PNETs). MEG3 loss corresponds with aberrant upregulation of the oncogenic hepatocyte growth factor (HGF) receptor c-MET in PNETs. Meg3 overexpression in a mouse insulin-secreting PNET cell line, MIN6, downregulates c-Met expression. However, the molecular mechanism by which MEG3 regulates c-MET is not known. Using chromatin isolation by RNA purification and sequencing (ChIRP-Seq), we identified Meg3 binding to unique genomic regions in and around the c-Met gene. In the absence of Meg3, these c-Met regions displayed distinctive enhancer-signature histone modifications. Furthermore, Meg3 relied on functional enhancer of zeste homolog 2 (EZH2), a component of polycomb repressive complex 2 (PRC2), to inhibit c-Met expression. Another mechanism of lncRNA-mediated regulation of gene expression utilized triplex-forming GA-GT rich sequences. Transfection of such motifs from Meg3 RNA, termed triplex-forming oligonucleotides (TFOs), in MIN6 cells suppressed c-Met expression and enhanced cell proliferation, perhaps by modulating other targets. This study comprehensively establishes epigenetic mechanisms underlying Meg3 control of c-Met and the oncogenic consequences of Meg3 loss or c-Met gain. These findings have clinical relevance for targeting c-MET in PNETs. There is also the potential for pancreatic islet β-cell expansion through c-MET regulation to ameliorate β-cell loss in diabetes. PMID:28847847
Comprehensive analysis of transcriptome variation uncovers known and novel driver events in T-cell acute lymphoblastic leukemia.

PubMed

Atak, Zeynep Kalender; Gianfelici, Valentina; Hulselmans, Gert; De Keersmaecker, Kim; Devasia, Arun George; Geerdens, Ellen; Mentens, Nicole; Chiaretti, Sabina; Durinck, Kaat; Uyttebroeck, Anne; Vandenberghe, Peter; Wlodarska, Iwona; Cloos, Jacqueline; Foà, Robin; Speleman, Frank; Cools, Jan; Aerts, Stein

2013-01-01

RNA-seq is a promising technology to re-sequence protein coding genes for the identification of single nucleotide variants (SNV), while simultaneously obtaining information on structural variations and gene expression perturbations. We asked whether RNA-seq is suitable for the detection of driver mutations in T-cell acute lymphoblastic leukemia (T-ALL). These leukemias are caused by a combination of gene fusions, over-expression of transcription factors and cooperative point mutations in oncogenes and tumor suppressor genes. We analyzed 31 T-ALL patient samples and 18 T-ALL cell lines by high-coverage paired-end RNA-seq. First, we optimized the detection of SNVs in RNA-seq data by comparing the results with exome re-sequencing data. We identified known driver genes with recurrent protein altering variations, as well as several new candidates including H3F3A, PTK2B, and STAT5B. Next, we determined accurate gene expression levels from the RNA-seq data through normalizations and batch effect removal, and used these to classify patients into T-ALL subtypes. Finally, we detected gene fusions, of which several can explain the over-expression of key driver genes such as TLX1, PLAG1, LMO1, or NKX2-1; and others result in novel fusion transcripts encoding activated kinases (SSBP2-FER and TPM3-JAK2) or involving MLLT10. In conclusion, we present novel analysis pipelines for variant calling, variant filtering, and expression normalization on RNA-seq data, and successfully applied these for the detection of translocations, point mutations, INDELs, exon-skipping events, and expression perturbations in T-ALL.
Full-length Transcriptome Sequencing and Modular Organization Analysis of Naringin/Neoeriocitrin Related Gene Expression Pattern in Drynaria roosii.

PubMed

Sun, Mei-Yu; Li, Jing-Yi; Li, Dong; Huang, Feng-Jie; Wang, Di; Li, Hui; Xing, Quan; Zhu, Hui-Bin; Shi, Lei

2018-04-12

Drynaria roosii (Nakaike) is a traditional Chinese medicinal fern, known as 'GuSuiBu'. Its effective components, naringin and neoeriocitrin, share highly similar chemical structures and medicinal functions. Our HPLC-MS/MS results showed that the accumulation of naringin/neoeriocitrin depended on specific tissues or ages. However, little was known about the expression patterns of naringin/neoeriocitrin related genes involved in their regulatory pathways. Because basic genetic information was lacking, we applied a combination of SMRT sequencing and SGS to generate the complete and full-length transcriptome of D. roosii. According to the SGS data, DEG-based heat map analysis revealed that naringin/neoeriocitrin related gene expression exhibited obvious tissue- and time-specific transcriptomic differences. Using the systems biology method of modular organization analysis, we clustered 16,472 DEGs into 17 gene modules and studied the relationships between modules and tissue/time point samples, as well as between modules and naringin/neoeriocitrin contents. Naringin/neoeriocitrin related DEGs were distributed across nine distinct modules, and DEGs in these modules showed significantly different patterns of transcript abundance linked with specific tissues or ages. Moreover, WGCNA results further identified that PAL, 4CL, C4H and C3H, HCT acted as the major hub genes involved in naringin and neoeriocitrin synthesis respectively and exhibited high co-expression with MYB- and bHLH-regulated genes. In this work, modular organization and co-expression networks elucidated the tissue- and time-specificity of gene expression patterns, as well as hub genes associated with naringin/neoeriocitrin synthesis in D. roosii. In addition, the comprehensive transcriptome dataset provides important genetic information for further research on D. roosii.
Genome-wide organization and expression profiling of the R2R3-MYB transcription factor family in pineapple (Ananas comosus).

PubMed

Liu, Chaoyang; Xie, Tao; Chen, Chenjie; Luan, Aiping; Long, Jianmei; Li, Chuhao; Ding, Yaqi; He, Yehua

2017-07-01

The MYB proteins comprise one of the largest families of plant transcription factors, which are involved in various plant physiological and biochemical processes. Pineapple (Ananas comosus) is one of the three most important tropical fruits worldwide. The completion of pineapple genome sequencing provides a great opportunity to investigate the organization and evolutionary traits of pineapple MYB genes at the genome-wide level. In the present study, a total of 94 pineapple R2R3-MYB genes were identified and further phylogenetically classified into 26 subfamilies, as supported by the conserved gene structures and motif composition. Collinearity analysis indicated that segmental duplication events played a crucial role in the expansion of the pineapple MYB gene family. Further comparative phylogenetic analysis suggested that there has been functional divergence of the MYB gene family during plant evolution. RNA-seq data from different tissues and developmental stages revealed distinct temporal and spatial expression profiles of the AcMYB genes. Further quantitative expression analysis showed the specific expression patterns of the selected putative stress-related AcMYB genes in response to distinct abiotic stress and hormonal treatments. The comprehensive expression analysis of the pineapple MYB genes, especially the tissue-preferential and stress-responsive genes, could provide valuable clues for further functional characterization. In this work, we systematically identified AcMYB genes by analyzing the pineapple genome sequence using a set of bioinformatics approaches. Our findings provide a global insight into the organization, phylogeny and expression patterns of the pineapple R2R3-MYB genes, and hence contribute to the greater understanding of their biological roles in pineapple.
The impact of reading expressiveness on the listening comprehension of storybooks by prekindergarten children.

PubMed

Mira, William A; Schwanenflugel, Paula J

2013-04-01

The purpose of this study was to determine the effect of oral reading expressiveness on the comprehension of storybooks by 4- and 5-year-old prekindergarten children. The possible impact of prosody on listening comprehension was explored. Ninety-two prekindergarten children (M age = 57.26 months, SD = 3.89 months) listened to an expressive or inexpressive recording of 1 of 2 similar stories. Story comprehension was tested using assessments of both free recall and cued recall. Children showed statistically significantly better cued recall for the expressive readings of stories compared to the inexpressive readings of stories. This effect generalized across stories and when story length was controlled across both expressive and inexpressive versions. The effect of expressiveness on children's free recall was not significant. Highly expressive readings resulted in better comprehension of storybooks by prekindergarten children. Further, because recordings were used, this effect might be attributed to the facilitation of language processing rather than to enhanced social interaction between the reader and the child.
Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh].

PubMed

Dutta, Sutapa; Kumawat, Giriraj; Singh, Bikram P; Gupta, Deepak K; Singh, Sangeeta; Dogra, Vivek; Gaikwad, Kishor; Sharma, Tilak R; Raje, Ranjeet S; Bandhopadhya, Tapas K; Datta, Subhojit; Singh, Mahendra N; Bashasab, Fakrudin; Kulwal, Pawan; Wanjari, K B; K Varshney, Rajeev; Cook, Douglas R; Singh, Nagendra K

2011-01-20

Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥ 18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea.
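The in silico SSR screen and the polymorphism information content (PIC) values reported above can be illustrated with a minimal sketch. It is not the authors' pipeline: the regular expression, the function names `find_ssrs` and `pic`, and the example sequence are assumptions made for illustration, and the PIC formula is the commonly used Botstein-style form, which the abstract does not confirm was the one applied.

```python
import re

# Perfect tandem repeats of a 2-6 bp motif, at least three copies.
# The lazy quantifier makes the regex prefer the shortest motif (AG, not AGAG).
SSR_PATTERN = re.compile(r"(([ACGT]{2,6}?)\2{2,})")

def find_ssrs(seq, min_len=18):
    """Return (start, motif, copies) for perfect SSRs spanning at least min_len bp."""
    hits = []
    for m in SSR_PATTERN.finditer(seq.upper()):
        motif = m.group(2)
        if len(set(motif)) == 1:
            # Homopolymeric runs (e.g. AAAA...) are excluded, as in the abstract.
            continue
        total = len(m.group(1))
        if total >= min_len:
            hits.append((m.start(), motif, total // len(motif)))
    return hits

def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 for allele frequencies p."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross

# Hypothetical sequence: nine AG copies (18 bp) pass the filter, four ATG copies (12 bp) do not.
example = "TTGC" + "AG" * 9 + "CCGT" + "ATG" * 4 + "GG"
print(find_ssrs(example))   # [(4, 'AG', 9)]
print(pic([0.5, 0.5]))      # 0.375
```

The 18 bp default mirrors the repeat-length threshold quoted in the abstract; compound and imperfect repeats are deliberately not handled in this sketch.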
Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]

PubMed Central

2011-01-01

Background: Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results: In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population.
Conclusion: We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea. PMID:21251263
Genome-wide DNA methylation reprogramming in response to inorganic arsenic links inhibition of CTCF binding, DNMT expression and cellular transformation

NASA Astrophysics Data System (ADS)

Rea, Matthew; Eckstein, Meredith; Eleazer, Rebekah; Smith, Caroline; Fondufe-Mittendorf, Yvonne N.

2017-02-01

Chronic low dose inorganic arsenic (iAs) exposure leads to changes in gene expression and epithelial-to-mesenchymal transformation. During this transformation, cells adopt a fibroblast-like phenotype accompanied by profound gene expression changes. While many mechanisms have been implicated in this transformation, studies that focus on the role of epigenetic alterations in this process are just emerging. DNA methylation controls gene expression in physiologic and pathologic states. Several studies show alterations in DNA methylation patterns in iAs-mediated pathogenesis, but these studies focused on single genes. We present a comprehensive genome-wide DNA methylation analysis using methyl-sequencing to measure changes between normal and iAs-transformed cells. Additionally, these differential methylation changes correlated positively with changes in gene expression and alternative splicing. Interestingly, most of these differentially methylated genes function in cell adhesion and communication pathways. To gain insight into how genomic DNA methylation patterns are regulated during iAs-mediated carcinogenesis, we show that iAs probably targets CTCF binding at the promoter of DNA methyltransferases, regulating their expression. These findings reveal how CTCF binding regulates DNA methyltransferase to reprogram the methylome in response to an environmental toxin.
Genome-wide DNA methylation reprogramming in response to inorganic arsenic links inhibition of CTCF binding, DNMT expression and cellular transformation

PubMed Central

Rea, Matthew; Eckstein, Meredith; Eleazer, Rebekah; Smith, Caroline; Fondufe-Mittendorf, Yvonne N.

2017-01-01

Chronic low dose inorganic arsenic (iAs) exposure leads to changes in gene expression and epithelial-to-mesenchymal transformation. During this transformation, cells adopt a fibroblast-like phenotype accompanied by profound gene expression changes. While many mechanisms have been implicated in this transformation, studies that focus on the role of epigenetic alterations in this process are just emerging. DNA methylation controls gene expression in physiologic and pathologic states. Several studies show alterations in DNA methylation patterns in iAs-mediated pathogenesis, but these studies focused on single genes. We present a comprehensive genome-wide DNA methylation analysis using methyl-sequencing to measure changes between normal and iAs-transformed cells. Additionally, these differential methylation changes correlated positively with changes in gene expression and alternative splicing. Interestingly, most of these differentially methylated genes function in cell adhesion and communication pathways. To gain insight into how genomic DNA methylation patterns are regulated during iAs-mediated carcinogenesis, we show that iAs probably targets CTCF binding at the promoter of DNA methyltransferases, regulating their expression. These findings reveal how CTCF binding regulates DNA methyltransferase to reprogram the methylome in response to an environmental toxin. PMID:28150704
DNA methylation biomarkers for head and neck squamous cell carcinoma.

PubMed

Zhou, Chongchang; Ye, Meng; Ni, Shumin; Li, Qun; Ye, Dong; Li, Jinyun; Shen, Zhishen; Deng, Hongxia

2018-06-21

DNA methylation plays an important role in the etiology and pathogenesis of head and neck squamous cell carcinoma (HNSCC). The current study aimed to identify aberrantly methylated-differentially expressed genes (DEGs) by a comprehensive bioinformatics analysis. In addition, we screened for DEGs affected by DNA methylation modification and further investigated their prognostic values for HNSCC. We included microarray data of DNA methylation (GSE25093 and GSE33202) and gene expression (GSE23036 and GSE58911) from Gene Expression Omnibus. Aberrantly methylated-DEGs were analyzed with R software. The Cancer Genome Atlas (TCGA) RNA sequencing and DNA methylation (Illumina HumanMethylation450) databases were utilized for validation. In total, 27 aberrantly methylated genes accompanied by altered expression were identified. After confirmation by The Cancer Genome Atlas (TCGA) database, 2 hypermethylated-low-expression genes (FAM135B and ZNF610) and 2 hypomethylated-high-expression genes (HOXA9 and DCC) were identified. A receiver operating characteristic (ROC) curve confirmed the diagnostic value of these four methylated genes for HNSCC. Multivariate Cox proportional hazards analysis showed that FAM135B methylation was a favorable independent prognostic biomarker for overall survival of HNSCC patients.
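To make the ROC step above concrete, the area under the ROC curve (AUC) can be computed as the probability that a randomly chosen tumour sample has a higher marker value than a randomly chosen control. The sketch below is only an illustration of that definition: the function name `roc_auc` and the methylation values are hypothetical and are not taken from the study or from TCGA.

```python
def roc_auc(case_scores, control_scores):
    """AUC as P(case score > control score); ties contribute 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical methylation beta values for a hypermethylated marker.
tumour = [0.9, 0.8, 0.4]
normal = [0.5, 0.3, 0.2]
print(roc_auc(tumour, normal))  # 8 of 9 tumour/normal pairs are ranked correctly
```

An AUC near 0.5 would indicate no diagnostic value, and a value near 1.0 near-perfect separation; for a hypomethylated marker the scores would be negated before the comparison.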
RICD: a rice indica cDNA database resource for rice functional genomics.

PubMed

Lu, Tingting; Huang, Xuehui; Zhu, Chuanrang; Huang, Tao; Zhao, Qiang; Xie, Kabing; Xiong, Lizhong; Zhang, Qifa; Han, Bin

2008-11-26

The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. It also provides a series of information, including sequences, protein domain annotations, similarity search results, SNPs and InDels information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, expression atlas in RiceGE and variation report in Gramene of each cDNA. The online rice indica cDNA database provides cDNA resource with comprehensive information to researchers for functional analysis of indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
Comprehensive Transcriptome Profiling and Functional Analysis of the Frog (Bombina maxima) Immune System

PubMed Central

Zhao, Feng; Yan, Chao; Wang, Xuan; Yang, Yang; Wang, Guangyin; Lee, Wenhui; Xiang, Yang; Zhang, Yun

2014-01-01

Amphibians occupy a key phylogenetic position among vertebrates and in the evolution of the immune system. However, transcriptome and genome resources for amphibians remain scarce. Bombina maxima has a strong ability to survive in very harsh environments and a more mature immune system. We obtained a comprehensive transcriptome by RNA-sequencing technology. Of the transcripts, 14.3% were identified as skin-specific genes, most of which were either not isolated from skin secretions in previous work or were novel non-coding RNAs. In addition, 27.9% of transcripts were mapped to 242 predicted KEGG pathways and 6.16% of transcripts were related to human disease and cancer. Of 39 448 transcripts with a coding sequence, at least 1501 transcripts (570 genes) were related to immune system processes. Nearly all molecules of the immune signalling pathways were present, and several transcripts were highly expressed in skin and stomach. Experiments showed that lipopolysaccharide or bacterial challenge stimulated pro-inflammatory cytokine production and activation of pro-inflammatory caspase-1. These frog data can substantially expand the existing genome and transcriptome resources of amphibians, especially immunity data. The data as a whole provide a valuable platform for further investigation of the detailed immune response in B. maxima and for comparative studies with other amphibians. PMID:23942912
LISTA, a comprehensive compilation of nucleotide sequences encoding proteins from the yeast Saccharomyces.

PubMed Central

Linder, P; Dölz, R; Mossé, M O; Lazowska, J; Slonimski, P P

1993-01-01

The amount of nucleotide sequence data is increasing exponentially. We therefore made an effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. Each sequence has been attributed a single genetic name and in the case of allelic duplicated sequences, synonyms are given, if necessary. For the nomenclature we have introduced a standard principle for naming gene sequences based on priority rules. We have also applied a simple method to distinguish duplicated sequences of one and the same gene from non-allelic sequences of duplicated genes. By using these principles we have sorted out a lot of confusion in the literature and databanks. Along with the genetic name, the mnemonic from the EMBL databank, the codon bias, reference of the publication of the sequence and the EMBL accession numbers are included in each entry. PMID:8332521
Gestalt Imagery: A Critical Factor in Language Comprehension.

ERIC Educational Resources Information Center

Bell, Nanci

1991-01-01

Lack of gestalt imagery (the ability to create imaged wholes) can contribute to language comprehension disorder characterized by weak reading comprehension, weak oral language comprehension, weak oral language expression, weak written language expression, difficulty following directions, and a weak sense of humor. Sequential stimulation using an…
Expression and methylation of BDNF in the human brain in schizophrenia.

PubMed

Cheah, Sern-Yih; McLeay, Robert; Wockner, Leesa F; Lawford, Bruce R; Young, Ross McD; Morris, Charles P; Voisey, Joanne

2017-08-01

To examine the combined effect of the BDNF Val66Met (rs6265) polymorphism and BDNF DNA methylation on transcriptional regulation of the BDNF gene. DNA methylation profiles were generated for CpG sites proximal to Val66Met, within BDNF promoter I and exon V for prefrontal cortex samples from 25 schizophrenia and 25 control subjects. Val66Met genotypes and BDNF mRNA expression data were generated by transcriptome sequencing. Expression, methylation and genotype data were correlated and examined for association with schizophrenia. There was 43% more of the BDNF V-VIII-IX transcript in schizophrenia samples. BDNF mRNA expression and DNA methylation of seven CpG sites were not associated with schizophrenia after accounting for age and PMI effects. BDNF mRNA expression and DNA methylation were not altered by Val66Met after accounting for age and PMI effects. DNA methylation of one CpG site had a marginally significant positive correlation with mRNA expression in schizophrenia subjects. Schizophrenia risk was not associated with differential BDNF mRNA expression and DNA methylation. A larger age-matched cohort with comprehensive clinical history is required to accurately identify the effects of genotype, mRNA expression and DNA methylation on schizophrenia risk.
Expression profiles of the Gα subunits during Xenopus tropicalis embryonic development.

PubMed

Fuentealba, Jaime; Toro-Tapia, Gabriela; Rodriguez, Marion; Arriagada, Cecilia; Maureira, Alejandro; Beyer, Andrea; Villaseca, Soraya; Leal, Juan I; Hinrichs, Maria V; Olate, Juan; Caprile, Teresa; Torrejón, Marcela

2016-09-01

Heterotrimeric G protein signaling plays major roles during different cellular events. However, there is a limited understanding of the molecular mechanisms underlying G protein control during embryogenesis. G proteins are highly conserved and can be grouped into four subfamilies according to sequence homology and function. To further studies on G protein function during embryogenesis, the present analysis identified four Gα subunits representative of the different subfamilies and determined their spatiotemporal expression patterns during Xenopus tropicalis embryogenesis. Each of the Gα subunit transcripts was maternally and zygotically expressed, and, as development progressed, dynamic expression patterns were observed. In the early developmental stages, the Gα subunits were expressed in the animal hemisphere and dorsal marginal zone. While expression was observed at the somite boundaries, in vascular structures, in the eye, and in the otic vesicle during the later stages, expression was mainly found in neural tissues, such as the neural tube and, especially, in the cephalic vesicles, neural crest region, and neural crest-derived structures. Together, these results support the pleiotropism and complexity of G protein subfamily functions in different cellular events. The present study constitutes the most comprehensive description to date of the spatiotemporal expression patterns of Gα subunits during vertebrate development. Copyright © 2016 Elsevier B.V. All rights reserved.

Characterization of GM events by insert knowledge adapted re-sequencing approaches

PubMed Central

Yang, Litao; Wang, Congmao; Holst-Jensen, Arne; Morisset, Dany; Lin, Yongjun; Zhang, Dabing

2013-01-01

Detection methods and data from molecular characterization of genetically modified (GM) events are needed by stakeholders such as public risk assessors and regulators. Generally, the molecular characteristics of GM events are incompletely revealed by current approaches, which are biased towards detecting transformation vector derived sequences. GM events are classified based on available knowledge of the sequences of vectors and inserts (insert knowledge). Herein we present three insert knowledge-adapted approaches for the characterization of GM events (TT51-1 and T1c-19 rice as examples) based on paired-end re-sequencing, with the advantages of comprehensiveness, accuracy, and automation. The comprehensive molecular characteristics of the two rice events were revealed, including additional unintended insertions compared with the results from PCR and Southern blotting. Comprehensive transgene characterization of TT51-1 and T1c-19 is shown to be independent of a priori knowledge of the insert and vector sequences when employing the developed approaches. This provides an opportunity to also identify and characterize unknown GM events. PMID:24088728
Characterization of GM events by insert knowledge adapted re-sequencing approaches.

PubMed

Yang, Litao; Wang, Congmao; Holst-Jensen, Arne; Morisset, Dany; Lin, Yongjun; Zhang, Dabing

2013-10-03

Detection methods and data from molecular characterization of genetically modified (GM) events are needed by stakeholders such as public risk assessors and regulators. Generally, the molecular characteristics of GM events are incompletely revealed by current approaches, which are biased towards detecting transformation vector derived sequences. GM events are classified based on available knowledge of the sequences of vectors and inserts (insert knowledge). Herein we present three insert knowledge-adapted approaches for the characterization of GM events (TT51-1 and T1c-19 rice as examples) based on paired-end re-sequencing, with the advantages of comprehensiveness, accuracy, and automation. The comprehensive molecular characteristics of the two rice events were revealed, including additional unintended insertions compared with the results from PCR and Southern blotting. Comprehensive transgene characterization of TT51-1 and T1c-19 is shown to be independent of a priori knowledge of the insert and vector sequences when employing the developed approaches. This provides an opportunity to also identify and characterize unknown GM events.
A Molecular Genetics Laboratory Course Applying Bioinformatics and Cell Biology in the Context of Original Research

PubMed Central

Pruitt, Wendy M.; Robinson, Lucy C.

2008-01-01

Research based laboratory courses have been shown to stimulate student interest in science and to improve scientific skills. We describe here a project developed for a semester-long research-based laboratory course that accompanies a genetics lecture course. The project was designed to allow students to become familiar with the use of bioinformatics tools and molecular biology and genetic approaches while carrying out original research. Students were required to present their hypotheses, experiments, and results in a comprehensive lab report. The lab project concerned the yeast casein kinase 1 (CK1) protein kinase Yck2. CK1 protein kinases are present in all organisms and are well conserved in primary structure. These enzymes display sequence features that differ from other protein kinase subfamilies. Students identified such sequences within the CK1 subfamily, chose a sequence to analyze, used available structural data to determine possible functions for their sequences, and designed mutations within the sequences. After generating the mutant alleles, these were expressed in yeast and tested for function by using two growth assays. The student response to the project was positive, both in terms of knowledge and skills increases and interest in research, and several students are continuing the analysis of mutant alleles as summer projects. PMID:19047427
The Impact of Normalization Methods on RNA-Seq Data Analysis

PubMed Central

Zyprych-Walczak, J.; Szabelska, A.; Handschuh, L.; Górczak, K.; Klamecka, K.; Figlerowicz, M.; Siatkowski, I.

2015-01-01

High-throughput sequencing technologies, such as the Illumina Hi-seq, are powerful new tools for investigating a wide range of biological and medical problems. The massive and complex data sets produced by the sequencers create a need for statistical and computational methods that can tackle the analysis and management of data. Data normalization is one of the most crucial steps of data processing and must be carefully considered, as it has a profound effect on the results of the analysis. In this work, we focus on a comprehensive comparison of five normalization methods related to sequencing depth, widely used for transcriptome sequencing (RNA-seq) data, and their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow that can be applied for the selection of the optimal normalization procedure for any particular data set. The described workflow includes calculation of the bias and variance values for the control genes, sensitivity and specificity of the methods, and classification errors, as well as generation of diagnostic plots. Combining the above information facilitates the selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably. PMID:26176014
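One step of the workflow described above, checking how stable control genes are after a depth-related normalization, can be sketched as follows. The abstract does not name the five methods compared, so total-count scaling (counts per million) is used here only as a familiar stand-in; the function names `cpm` and `control_variance`, the gene names, and the counts are all hypothetical.

```python
from statistics import pvariance

def cpm(counts):
    """Scale each sample (column) to counts per million mapped reads.

    counts maps gene name -> list of raw read counts, one entry per sample.
    """
    n_samples = len(next(iter(counts.values())))
    totals = [sum(row[s] for row in counts.values()) for s in range(n_samples)]
    return {
        gene: [1e6 * row[s] / totals[s] for s in range(n_samples)]
        for gene, row in counts.items()
    }

def control_variance(normalized, control_genes):
    """Mean across-sample variance of the control genes after normalization."""
    return sum(pvariance(normalized[g]) for g in control_genes) / len(control_genes)

# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
raw = {"geneA": [10, 20], "geneB": [90, 180]}
norm = cpm(raw)
print(norm["geneA"])                      # [100000.0, 100000.0]
print(control_variance(norm, ["geneA"]))  # 0.0
```

Under this reading, a method that leaves lower variance in genes expected to be constant is preferable, which is one of several criteria the workflow combines.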
Evaluation of normalization methods in mammalian microRNA-Seq data

PubMed Central

Garmire, Lana Xia; Subramaniam, Shankar

2012-01-01

Simple total tag count normalization is inadequate for microRNA sequencing data generated from the next generation sequencing technology. However, so far systematic evaluation of normalization methods on microRNA sequencing data is lacking. We comprehensively evaluate seven commonly used normalization methods including global normalization, Lowess normalization, Trimmed Mean Method (TMM), quantile normalization, scaling normalization, variance stabilization, and invariant method. We assess these methods on two individual experimental data sets with the empirical statistical metrics of mean square error (MSE) and Kolmogorov-Smirnov (K-S) statistic. Additionally, we evaluate the methods with results from quantitative PCR validation. Our results consistently show that Lowess normalization and quantile normalization perform the best, whereas TMM, a method applied to the RNA-Sequencing normalization, performs the worst. The poor performance of TMM normalization is further evidenced by abnormal results from the test of differential expression (DE) of microRNA-Seq data. Comparing with the models used for DE, the choice of normalization method is the primary factor that affects the results of DE. In summary, Lowess normalization and quantile normalization are recommended for normalizing microRNA-Seq data, whereas the TMM method should be used with caution. PMID:22532701
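Quantile normalization, one of the two methods recommended above, forces every library to share the same empirical distribution. The following is a minimal sketch of the general technique rather than the implementation evaluated in the study; the function name `quantile_normalize` and the toy values are assumptions, and ties are broken by position instead of being averaged.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length lists of expression values (one list per library).

    Each value is replaced by the mean, across libraries, of the values
    that share its rank. Ties are broken by original position.
    """
    n = len(samples[0])
    sorted_cols = [sorted(col) for col in samples]
    rank_means = [sum(col[r] for col in sorted_cols) / len(samples) for r in range(n)]

    result = []
    for col in samples:
        order = sorted(range(n), key=lambda i: col[i])  # indices from lowest to highest
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = rank_means[rank]
        result.append(normalized)
    return result

# Two hypothetical libraries with three microRNAs each.
print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
# [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization both libraries contain exactly the values 1.5, 3.5 and 5.5, differing only in which microRNA holds which rank.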
Leukotriene signaling in the extinct human subspecies Homo denisovan and Homo neanderthalensis. Structural and functional comparison with Homo sapiens.

PubMed

Adel, Susan; Kakularam, Kumar Reddy; Horn, Thomas; Reddanna, Pallu; Kuhn, Hartmut; Heydeck, Dagmar

2015-01-01

Mammalian lipoxygenases (LOXs) have been implicated in cell differentiation and in the biosynthesis of pro- and anti-inflammatory lipid mediators. The initial draft sequence of the Homo neanderthalensis genome (coverage of 1.3-fold) suggested defective leukotriene signaling in this archaic human subspecies since expression of essential proteins appeared to be corrupted. Meanwhile high quality genomic sequence data became available for two extinct human subspecies (H. neanderthalensis, Homo denisovan) and completion of the human 1000 genome project provided a comprehensive database characterizing the genetic variability of the human genome. For this study we extracted the nucleotide sequences of selected eicosanoid relevant genes (ALOX5, ALOX15, ALOX12, ALOX15B, ALOX12B, ALOXE3, COX1, COX2, LTA4H, LTC4S, ALOX5AP, CYSLTR1, CYSLTR2, BLTR1, BLTR2) from the corresponding databases. Comparison of the deduced amino acid sequences in connection with site-directed mutagenesis studies and structural modeling suggested that the major enzymes and receptors of leukotriene signaling as well as the two cyclooxygenase isoforms were fully functional in these two extinct human subspecies. Copyright © 2014 Elsevier Inc. All rights reserved.
ReadXplorer—visualization and analysis of mapped sequences

PubMed Central

Hilker, Rolf; Stadermann, Kai Bernd; Doppmeier, Daniel; Kalinowski, Jörn; Stoye, Jens; Straube, Jasmin; Winnebald, Jörn; Goesmann, Alexander

2014-01-01

Motivation: Fast algorithms and well-arranged visualizations are required for the comprehensive analysis of the ever-growing size of genomic and transcriptomic next-generation sequencing data. Results: ReadXplorer is a software offering straightforward visualization and extensive analysis functions for genomic and transcriptomic DNA sequences mapped on a reference. A unique specialty of ReadXplorer is the quality classification of the read mappings. It is incorporated in all analysis functions and displayed in ReadXplorer's various synchronized data viewers for (i) the reference sequence, its base coverage as (ii) normalizable plot and (iii) histogram, (iv) read alignments and (v) read pairs. ReadXplorer's analysis capability covers RNA secondary structure prediction, single nucleotide polymorphism and deletion–insertion polymorphism detection, genomic feature and general coverage analysis. Especially for RNA-Seq data, it offers differential gene expression analysis, transcription start site and operon detection as well as RPKM value and read count calculations. Furthermore, ReadXplorer can combine or superimpose coverage of different datasets. Availability and implementation: ReadXplorer is available as open-source software at http://www.readxplorer.org along with a detailed manual. Contact: rhilker@mikrobio.med.uni-giessen.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24790157
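The RPKM value mentioned above follows a standard definition: reads per kilobase of feature length per million mapped reads. The sketch below shows that arithmetic only; the function name `rpkm` and its parameters are hypothetical and do not reflect ReadXplorer's own code or interface.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    Illustrative helper, not part of ReadXplorer.
    RPKM = reads * 1e9 / (feature length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Example inputs (made up): 500 reads on a 2,000 bp gene in a library
# of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
```

Dividing by both feature length and library size makes values comparable between genes of different lengths and between libraries of different depths.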
A transcription factor hierarchy defines an environmental stress response network.

PubMed

Song, Liang; Huang, Shao-Shan Carol; Wise, Aaron; Castanon, Rosa; Nery, Joseph R; Chen, Huaming; Watanabe, Marina; Thomas, Jerushah; Bar-Joseph, Ziv; Ecker, Joseph R

2016-11-04

Environmental stresses are universally encountered by microbes, plants, and animals. Yet systematic studies of stress-responsive transcription factor (TF) networks in multicellular organisms have been limited. The phytohormone abscisic acid (ABA) influences the expression of thousands of genes, allowing us to characterize complex stress-responsive regulatory networks. Using chromatin immunoprecipitation sequencing, we identified genome-wide targets of 21 ABA-related TFs to construct a comprehensive regulatory network in Arabidopsis thaliana. Determinants of dynamic TF binding and a hierarchy among TFs were defined, illuminating the relationship between differential gene expression patterns and ABA pathway feedback regulation. By extrapolating regulatory characteristics of observed canonical ABA pathway components, we identified a new family of transcriptional regulators modulating ABA and salt responsiveness and demonstrated their utility to modulate plant resilience to osmotic stress. Copyright © 2016, American Association for the Advancement of Science.
Comprehensive analysis of cancers of unknown primary for the biomarkers of response to immune checkpoint blockade therapy.

PubMed

Gatalica, Zoran; Xiu, Joanne; Swensen, Jeff; Vranic, Semir

2018-05-01

Cancer of unknown primary (CUP) accounts for approximately 3% of all malignancies. Avoiding immune destruction is a major cancer characteristic and therapies aimed at immune checkpoint blockade are in use for several specific cancer types. A comprehensive survey of predictive biomarkers to immune checkpoint blockade in CUP was explored in this study. About 389 cases of CUP were analysed for mutations in 592 genes and 52 gene fusions using a massively parallel DNA sequencing platform (next-generation sequencing [NGS]). Total mutational load (TML) and microsatellite instability (MSI) were calculated from NGS data. PD-L1 expression was explored using immunohistochemistry (with 5% cutoff value). High TML was seen in 11.8% (46/389) of tumours. MSI-high (MSI-H) was detected in 7/384 (1.8%) of tumours. Tumour PD-L1 expression was detected in 80/362 CUP (22%). A small proportion of CUP cases harboured genetic alterations of negative predictive biomarkers to immune checkpoint inhibitors (predictors to hyperprogression) including MDM2 gene amplification (2%) and loss of function JAK2 gene mutations (1%). Amplifications of CD274 (PD-L1) and PDCD1LG2 (PD-L2) genes were also rare (1.4% and 0.8%, respectively). The most frequently mutated genes were TP53 (54%), KRAS (22%), ARID1A (13%), PIK3CA (9%), CDKN2A (8%), SMARCA4 (7%) and PBRM1, STK11, APC, RB1 (5%, respectively). Using a multiplex testing approach, 28% of CUP carried one or more predictive biomarkers (MSI-H, PD-L1 and/or TML-H) to the immune checkpoint blockade, providing a novel option for treatment in patients with CUP. Copyright © 2018 The Author(s). Published by Elsevier Ltd. All rights reserved.
Transcriptomic Analysis of Paulownia Infected by Paulownia Witches'-Broom Phytoplasma

PubMed Central

Zhu, Shui-Fang; Lin, Cai-Li; Tian, Guo-Zhong; Xu, Xia; Zhao, Wen-Jun

2013-01-01

Phytoplasmas are plant pathogenic bacteria that have no cell wall and are responsible for major crop losses throughout the world. Phytoplasma-infected plants show a variety of symptoms and the mechanisms they use to physiologically alter the host plants are of considerable interest, but poorly understood. In this study we undertook a detailed analysis of Paulownia infected by Paulownia witches’-broom (PaWB) Phytoplasma using high-throughput mRNA sequencing (RNA-Seq) and digital gene expression (DGE). RNA-Seq analysis identified 74,831 unigenes, which were subsequently used as reference sequences for DGE analysis of diseased and healthy Paulownia in field grown and tissue cultured plants. Our study revealed that dramatic changes occurred in the gene expression profile of Paulownia after PaWB Phytoplasma infection. Genes encoding key enzymes in cytokinin biosynthesis, such as isopentenyl diphosphate isomerase and isopentenyltransferase, were significantly induced in the infected Paulownia. Genes involved in cell wall biosynthesis and degradation were largely up-regulated and genes related to photosynthesis were down-regulated after PaWB Phytoplasma infection. Our systematic analysis provides comprehensive transcriptomic data about plants infected by Phytoplasma. This information will help further our understanding of the detailed interaction mechanisms between plants and Phytoplasma. PMID:24130859
Maternal Bias and Escape from X Chromosome Imprinting in the Midgestation Mouse Placenta

PubMed Central

Finn, Elizabeth H; Smith, Cheryl L; Rodriguez, Jesse; Sidow, Arend; Baker, Julie C

2014-01-01

To investigate the epigenetic landscape at the interface between mother and fetus, we provide a comprehensive analysis of parent-of-origin bias in the mouse placenta. Using F1 interspecies hybrids between Mus musculus (C57BL/6J) and Mus musculus castaneus, we sequenced RNA from 23 individual midgestation placentas, five late stage placentas, and two yolk sac samples and then used SNPs to determine whether transcripts were preferentially generated from the maternal or paternal allele. In the placenta, we find 103 genes that show significant and reproducible parent-of-origin bias, of which 78 are novel candidates. Most (96%) show a strong maternal bias which we demonstrate, via multiple mathematical models, pyrosequencing, and FISH, is not due to maternal decidual contamination. Analysis of the X chromosome also reveals paternal expression of Xist and several genes that escape inactivation, most significantly Alas2, Fhl1, and Slc38a5. Finally, sequencing individual placentas allowed us to reveal notable expression similarity between littermates. In all, we observe a striking preference for maternal transcription in the midgestation mouse placenta and a dynamic imprinting landscape in extraembryonic tissues, reflecting the complex nature of epigenetic pathways in the placenta. PMID:24594094
Phylogenomic detection and functional prediction of genes potentially important for plant meiosis.

PubMed

Zhang, Luoyan; Kong, Hongzhi; Ma, Hong; Yang, Ji

2018-02-15

Meiosis is a specialized type of cell division necessary for sexual reproduction in eukaryotes. A better understanding of the cytological procedures of meiosis has been achieved by comprehensive cytogenetic studies in plants, while the genetic mechanisms regulating meiotic progression remain incompletely understood. The increasing accumulation of complete genome sequences and large-scale gene expression datasets has provided a powerful resource for phylogenomic inference and unsupervised identification of genes involved in plant meiosis. By integrating sequence homology and expression data, 164, 131, 124 and 162 genes potentially important for meiosis were identified in the genomes of Arabidopsis thaliana, Oryza sativa, Selaginella moellendorffii and Pogonatum aloides, respectively. The predicted genes were assigned to 45 meiotic GO terms, and their functions were related to different processes occurring during meiosis in various organisms. Most of the predicted meiotic genes underwent lineage-specific duplication events during plant evolution, with about 30% of the predicted genes retaining only a single copy in higher plant genomes. The results of this study provided clues to design experiments for better functional characterization of meiotic genes in plants, promoting the phylogenomic approach to the evolutionary dynamics of the plant meiotic machineries. Copyright © 2017 Elsevier B.V. All rights reserved.
Using RNA Sequence and Structure for the Prediction of Riboswitch Aptamer: A Comprehensive Review of Available Software and Tools

PubMed Central

Antunes, Deborah; Jorge, Natasha A. N.; Caffarena, Ernesto R.; Passetti, Fabio

2018-01-01

RNA molecules are essential players in many fundamental biological processes. Prokaryotes and eukaryotes have distinct RNA classes with specific structural features and functional roles. Computational prediction of protein structures is a research field in which high confidence three-dimensional protein models can be proposed based on the sequence alignment between target and templates. However, to date, only a few approaches have been developed for the computational prediction of RNA structures. Similar to proteins, RNA structures may be altered due to the interaction with various ligands, including proteins, other RNAs, and metabolites. A riboswitch is a molecular mechanism, found in the three kingdoms of life, in which the RNA structure is modified by the binding of a metabolite. It can regulate multiple gene expression mechanisms, such as transcription, translation initiation, and mRNA splicing and processing. Due to their nature, these entities also act on the regulation of gene expression and detection of small metabolites and have the potential to help in the discovery of new classes of antimicrobial agents. In this review, we describe software and web servers currently available for riboswitch aptamer identification and secondary and tertiary structure prediction, including applications. PMID:29403526
The BIG Data Center: from deposition to integration to translation

PubMed Central

2017-01-01

Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. PMID:27899658
Absolute quantification of microbial taxon abundances.

PubMed

Props, Ruben; Kerckhof, Frederiek-Maarten; Rubbens, Peter; De Vrieze, Jo; Hernandez Sanabria, Emma; Waegeman, Willem; Monsieurs, Pieter; Hammes, Frederik; Boon, Nico

2017-02-01

High-throughput amplicon sequencing has become a well-established approach for microbial community profiling. Correlating shifts in the relative abundances of bacterial taxa with environmental gradients is the goal of many microbiome surveys. As the abundances generated by this technology are semi-quantitative by definition, the observed dynamics may not accurately reflect those of the actual taxon densities. We combined the sequencing approach (16S rRNA gene) with robust single-cell enumeration technologies (flow cytometry) to quantify the absolute taxon abundances. A detailed longitudinal analysis of the absolute abundances resulted in distinct abundance profiles that were less ambiguous and expressed in units that can be directly compared across studies. We further provide evidence that the enrichment of taxa (increase in relative abundance) does not necessarily relate to the outgrowth of taxa (increase in absolute abundance). Our results highlight that both relative and absolute abundances should be considered for a comprehensive biological interpretation of microbiome surveys.
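The distinction between enrichment and outgrowth can be made concrete with a small calculation. The sketch below assumes that absolute abundance is estimated as relative abundance (from amplicon read counts) multiplied by a total cell density (for example from flow cytometry). The function name `absolute_abundances`, the taxon labels, and all numbers are invented for illustration and are not taken from the study.

```python
def absolute_abundances(read_counts, total_cells_per_ml):
    """Scale relative abundances from read counts by a total cell density.

    Hypothetical helper: read_counts maps taxon name -> amplicon read count;
    total_cells_per_ml is an independent cell count for the same sample.
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("sample has no reads")
    return {
        taxon: (count / total_reads) * total_cells_per_ml
        for taxon, count in read_counts.items()
    }


# Invented example: taxon A doubles in relative abundance (20% -> 40%)
# while the community as a whole shrinks from 1e6 to 4e5 cells/mL.
before = absolute_abundances({"A": 20, "B": 80}, 1e6)
after = absolute_abundances({"A": 40, "B": 60}, 4e5)
taxon_a_enriched = (40 / 100) > (20 / 100)   # relative abundance went up
taxon_a_outgrew = after["A"] > before["A"]   # absolute abundance did not
```

In this made-up case taxon A is enriched but has not grown in absolute terms, which is the situation the abstract warns about.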
Genomic analyses of primitive, wild and cultivated citrus provide insights into asexual reproduction.

PubMed

Wang, Xia; Xu, Yuantao; Zhang, Siqi; Cao, Li; Huang, Yue; Cheng, Junfeng; Wu, Guizhi; Tian, Shilin; Chen, Chunli; Liu, Yan; Yu, Huiwen; Yang, Xiaoming; Lan, Hong; Wang, Nan; Wang, Lun; Xu, Jidi; Jiang, Xiaolin; Xie, Zongzhou; Tan, Meilian; Larkin, Robert M; Chen, Ling-Ling; Ma, Bin-Guang; Ruan, Yijun; Deng, Xiuxin; Xu, Qiang

2017-05-01

The emergence of apomixis, the transition from sexual to asexual reproduction, is a prominent feature of modern citrus. Here we de novo sequenced and comprehensively studied the genomes of four representative citrus species. Additionally, we sequenced 100 accessions of primitive, wild and cultivated citrus. Comparative population analysis suggested that genomic regions harboring energy- and reproduction-associated genes are probably under selection in cultivated citrus. We also narrowed the genetic locus responsible for citrus polyembryony, a form of apomixis, to an 80-kb region containing 11 candidate genes. One of these, CitRWP, is expressed at higher levels in ovules of polyembryonic cultivars. We found a miniature inverted-repeat transposable element insertion in the promoter region of CitRWP that cosegregated with polyembryony. This study provides new insights into citrus apomixis and constitutes a promising resource for the mining of agriculturally important genes.
Processes of Personality Development in Adulthood: The TESSERA Framework.

PubMed

Wrzus, Cornelia; Roberts, Brent W

2017-08-01

The current article presents a theoretical framework of the short- and long-term processes underlying personality development throughout adulthood. The newly developed TESSERA framework posits that long-term personality development occurs due to repeated short-term, situational processes. These short-term processes can be generalized as a recursive sequence of Triggering situations, Expectancy, States/State expressions, and Reactions (TESSERA). Reflective and associative processes on TESSERA sequences can lead to personality development (i.e., continuity and lasting changes in explicit and implicit personality characteristics and behavioral patterns). We illustrate how the TESSERA framework facilitates a more comprehensive understanding of normative and differential personality development at various ages during the life span. The TESSERA framework extends previous theories by explicitly linking short- and long-term processes of personality development, by addressing different manifestations of personality, and by being applicable to different personality characteristics, for example, behavioral traits, motivational orientations, or life narratives.
Development of self-compressing BLSOM for comprehensive analysis of big sequence data.

PubMed

Kikuchi, Akihito; Ikemura, Toshimichi; Abe, Takashi

2015-01-01

With the remarkable increase in genomic sequence data from various organisms, novel tools are needed for comprehensive analyses of available big sequence data. We previously developed a Batch-Learning Self-Organizing Map (BLSOM), which can cluster genomic fragment sequences according to phylotype solely dependent on oligonucleotide composition, and applied it to genome and metagenomic studies. BLSOM is suitable for high-performance parallel-computing and can analyze big data simultaneously, but a large-scale BLSOM needs a large computational resource. We have developed Self-Compressing BLSOM (SC-BLSOM) for reduction of computation time, which allows us to carry out comprehensive analysis of big sequence data without the use of high-performance supercomputers. The strategy of SC-BLSOM is to hierarchically construct BLSOMs according to data class, such as phylotype. The first-layer BLSOM was constructed with each of the divided input data pieces that represents the data subclass, such as phylotype division, resulting in compression of the number of data pieces. The second BLSOM was constructed with a total of weight vectors obtained in the first-layer BLSOMs. We compared SC-BLSOM with the conventional BLSOM by analyzing bacterial genome sequences. SC-BLSOM could be constructed faster than BLSOM and cluster the sequences according to phylotype with high accuracy, showing the method's suitability for efficient knowledge discovery from big sequence data.
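Clustering by oligonucleotide composition starts from a frequency vector per sequence fragment. The following is a rough sketch of how such an input vector might be computed. It is not the BLSOM or SC-BLSOM code: the function name `kmer_frequencies`, the default k of 4, and the choice to skip windows containing ambiguous bases are assumptions made for this example.

```python
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return the relative frequency of every A/C/G/T k-mer in a sequence.

    Illustrative helper only. Windows containing other characters
    (for example N) are skipped, which is an assumption of this sketch.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)

    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1

    total = sum(counts.values())
    if total == 0:
        # No valid windows: return an all-zero vector.
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: counts[kmer] / total for kmer in kmers}
```

Each fragment then becomes a fixed-length vector (256 values for k = 4) that a self-organizing map or any other clustering method can take as input.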
Identification, characterization, and gene expression analysis of nucleotide binding site (NB)-type resistance gene homologues in switchgrass

DOE PAGES

Frazier, Taylor P.; Palmer, Nathan A.; Xie, Fuliang; ...

2016-11-08

Switchgrass (Panicum virgatum L.) is a warm-season perennial grass that can be used as a second generation bioenergy crop. However, foliar fungal pathogens, like switchgrass rust, have the potential to significantly reduce switchgrass biomass yield. Despite its importance as a prominent bioenergy crop, a genome-wide comprehensive analysis of NB-LRR disease resistance genes has yet to be performed in switchgrass. In this study, we used a homology-based computational approach to identify 1011 potential NB-LRR resistance gene homologs (RGHs) in the switchgrass genome (v 1.1). In addition, we identified 40 RGHs that potentially contain unique domains including major sperm protein domain, jacalin-like binding domain, calmodulin-like binding, and thioredoxin. RNA-sequencing analysis of leaf tissue from ‘Alamo’, a rust-resistant switchgrass cultivar, and ‘Dacotah’, a rust-susceptible switchgrass cultivar, identified 2634 high quality variants in the RGHs between the two cultivars. RNA-sequencing data from field-grown cultivar ‘Summer’ plants indicated that the expression of some of these RGHs was developmentally regulated. Our results provide useful insight into the molecular structure, distribution, and expression patterns of members of the NB-LRR gene family in switchgrass. These results also provide a foundation for future work aimed at elucidating the molecular mechanisms underlying disease resistance in this important bioenergy crop.
Transcript expression profiling for adventitious roots of Panax ginseng Meyer.

PubMed

Subramaniyam, Sathiyamoorthy; Mathiyalagan, Ramya; Natarajan, Sathishkumar; Kim, Yu-Jin; Jang, Moon-Gi; Park, Jun-Hyung; Yang, Deok Chun

2014-08-01

Panax ginseng Meyer, a member of the Araliaceae family, is one of the major medicinal plants in oriental countries and the primary source of ginsenosides. However, very few genes have been characterized for the ginsenoside pathway because of the limited genome information. In this study, we obtained a comprehensive transcriptome from adventitious roots that were treated with methyl jasmonic acid for different time points (control, 2h, 6h, 12h, and 24h) and sequenced by RNA 454 pyrosequencing technology. A reference transcriptome of 39,304,529 bases (0.04GB) was obtained from 5,724,987,880 bases (5.7GB) of 22 libraries by de novo assembly, and 35,266 (58.5%) transcripts were annotated with biological schemas (GO and KEGG). Digital gene expression patterns were obtained by mapping sequences from in vitro grown adventitious roots to the reference; of these, 3813 (6.3%) unique transcripts showed ≥2-fold up- or downregulation. Finally, candidates for ginsenoside pathway genes were predicted from observed expression patterns. Among them, 30 transcription factors, 20 cytochromes, and 11 glycosyl transferases were predicted as ginsenoside candidates. These data can remarkably expand the existing transcriptome resources of Panax, especially to predict existence of gene networks in P. ginseng. The entirety of the data provides a valuable platform to reveal more on secondary metabolism and abiotic stresses from P. ginseng in vitro grown adventitious roots. Copyright © 2014 Elsevier B.V. All rights reserved.

Genome-wide analysis of the R2R3-MYB transcription factor gene family in sweet orange (Citrus sinensis).

PubMed

Liu, Chaoyang; Wang, Xia; Xu, Yuantao; Deng, Xiuxin; Xu, Qiang

2014-10-01

MYB transcription factors represent one of the largest gene families in plant genomes. Sweet orange (Citrus sinensis) is one of the most important fruit crops worldwide, and recently the genome has been sequenced. This provides an opportunity to investigate the organization and evolutionary characteristics of sweet orange MYB genes from a whole-genome view. In the present study, we identified 100 R2R3-MYB genes in the sweet orange genome. A comprehensive analysis of this gene family was performed, including the phylogeny, gene structure, chromosomal localization and expression pattern analyses. The 100 genes were divided into 29 subfamilies based on the sequence similarity and phylogeny, and the classification was also well supported by the highly conserved exon/intron structures and motif composition. The phylogenomic comparison of the MYB gene family among sweet orange and related plant species, Arabidopsis, cacao and papaya, suggested the existence of functional divergence during evolution. Expression profiling indicated that sweet orange R2R3-MYB genes exhibited distinct temporal and spatial expression patterns. Our analysis suggested that the sweet orange MYB genes may play important roles in different plant biological processes, some of which may be potentially involved in citrus fruit quality. These results will be useful for future functional analysis of the MYB gene family in sweet orange.
Gene therapy for prostate cancer: where are we now?

PubMed

Steiner, M S; Gingrich, J R

2000-10-01

The ability to recombine specifically and alter DNA sequences followed by techniques to transfer these sequences or even whole genes into normal and diseased cells has revolutionized medical research and ushered the clinicians of today into the age of gene therapy. We provide urologists a review of relevant background information, outline current treatment strategies and clinical trials, and delineate current challenges facing the field of gene therapy for advanced prostate cancer. We comprehensively reviewed the literature, including PubMed and recent abstract proceedings from national meetings, relevant to gene therapy and advanced prostate cancer. We selected for review literature representative of the principal scientific background for current gene therapy strategies and National Institutes of Health Recombinant DNA Advisory Committee approved clinical trials. Current prostate cancer gene therapy strategies include correcting aberrant gene expression, exploiting programmed cell death pathways, targeting critical cell biological functions, introducing toxic or cell lytic suicide genes, enhancing the immune system antitumor response and combining treatment with conventional cytotoxic chemotherapy or radiation therapy. Many challenges lie ahead for gene therapy, including improving DNA transfer efficiency to cells locally and at distant sites, enhancing levels of gene expression and overcoming immune responses that limit the time that genes are expressed. Nevertheless, despite these current challenges it is almost certain that gene therapy will be part of the urological armamentarium against prostate cancer in this century.
Zebrafish globin switching occurs in two developmental stages and is controlled by the LCR.

PubMed

Ganis, Jared J; Hsia, Nelson; Trompouki, Eirini; de Jong, Jill L O; DiBiase, Anthony; Lambert, Janelle S; Jia, Zhiying; Sabo, Peter J; Weaver, Molly; Sandstrom, Richard; Stamatoyannopoulos, John A; Zhou, Yi; Zon, Leonard I

2012-06-15

Globin gene switching is a complex, highly regulated process allowing expression of distinct globin genes at specific developmental stages. Here, for the first time, we have characterized all of the zebrafish globins based on the completed genomic sequence. Two distinct chromosomal loci, termed major (chromosome 3) and minor (chromosome 12), harbor the globin genes containing α/β pairs in a 5'-3' to 3'-5' orientation. Both these loci share synteny with the mammalian α-globin locus. Zebrafish globin expression was assayed during development and demonstrated two globin switches, similar to human development. A conserved regulatory element, the locus control region (LCR), was revealed by analyzing DNase I hypersensitive sites, H3K4 trimethylation marks and GATA1 binding sites. Surprisingly, the position of these sites with relation to the globin genes is evolutionarily conserved, despite a lack of overall sequence conservation. Motifs within the zebrafish LCR include CACCC, GATA, and NFE2 sites, suggesting functional interactions with known transcription factors but not the same LCR architecture. Functional homology to the mammalian α-LCR MCS-R2 region was confirmed by robust and specific reporter expression in erythrocytes of transgenic zebrafish. Our studies provide a comprehensive characterization of the zebrafish globin loci and clarify the regulation of globin switching. Copyright © 2012 Elsevier Inc. All rights reserved.
Diplosporous development in Boehmeria tricuspis: Insights from de novo transcriptome assembly and comprehensive expression profiling

PubMed Central

Tang, Qing; Zang, Gonggu; Cheng, Chaohua; Luan, Mingbao; Dai, Zhigang; Xu, Ying; Yang, Zemao; Zhao, Lining; Su, Jianguang

2017-01-01

Boehmeria tricuspis includes sexually reproducing diploid and apomictic triploid individuals. Previously, we established that triploid B. tricuspis reproduces through obligate diplospory. To understand the molecular basis of apomictic development in B. tricuspis, we sequenced and compared transcriptomic profiles of the flowers of sexual and apomictic plants at four key developmental stages. A total of 283,341 unique transcripts were obtained from 1,463 million high-quality paired-end reads. In total, 18,899 unigenes were differentially expressed between the reproductive types at the four stages. By classifying the transcripts into gene ontology categories of differentially expressed genes, we showed that differential plant hormone signal transduction, cell cycle regulation, and transcription factor regulation are possibly involved in apomictic development and/or a polyploidization response in B. tricuspis. Furthermore, we suggest that specific gene families are possibly related to apomixis and might have important effects on diplosporous floral development. These results make a notable contribution to our understanding of the molecular basis of diplosporous development in B. tricuspis. PMID:28382950
Distinct polyadenylation landscapes of diverse human tissues revealed by a modified PA-seq strategy

PubMed Central

2013-01-01

Background: Polyadenylation is a key regulatory step in eukaryotic gene expression and one of the major contributors of transcriptome diversity. Aberrant polyadenylation often associates with expression defects and leads to human diseases. Results: To better understand global polyadenylation regulation, we have developed a polyadenylation sequencing (PA-seq) approach. By profiling polyadenylation events in 13 human tissues, we found that alternative cleavage and polyadenylation (APA) is prevalent in both protein-coding and noncoding genes. In addition, APA usage, similar to gene expression profiling, exhibits tissue-specific signatures and is sufficient for determining tissue origin. A 3′ untranslated region shortening index (USI) was further developed for genes with tandem APA sites. Strikingly, the results showed that different tissues exhibit distinct patterns of shortening and/or lengthening of 3′ untranslated regions, suggesting the intimate involvement of APA in establishing tissue or cell identity. Conclusions: This study provides a comprehensive resource to uncover regulated polyadenylation events in human tissues and to characterize the underlying regulatory mechanism. PMID:24025092
Transcriptome analysis of Mastomys natalensis papillomavirus in productive lesions after natural infection.

PubMed

Salvermoser, Melanie; Chotewutmontri, Sasithorn; Braspenning-Wesch, Ilona; Hasche, Daniel; Rösl, Frank; Vinzón, Sabrina E

2016-07-01

Mastomys coucha, an African rodent, is a useful animal model of papillomavirus infection, as it develops both premalignant and malignant skin tumors as a consequence of a persistent infection with Mastomys natalensis papillomavirus (MnPV). In this study, we mapped the MnPV transcriptome in productive lesions by both classical molecular techniques and high-throughput RNA sequencing. Combination of these methods revealed a complex and comprehensive transcription map, with novel splicing events not described in other papillomaviruses. Furthermore, these splicing occurrences could potentially lead to the expression of novel E2, E1∧E4, E7 and L2 isoforms. Expression level estimation of each transcript showed that late-region mRNAs considerably outnumber early transcripts, with species coding for L1 and E1∧E4 being the most abundant. In summary, the full transcription map assembled in this study will allow us to further understand MnPV gene expression and the mechanisms that lead to natural tumour development.
The Zebrafish Model Organism Database: new support for human disease models, mutation details, gene expression phenotypes and searching

PubMed Central

Howe, Douglas G.; Bradford, Yvonne M.; Eagle, Anne; Fashena, David; Frazer, Ken; Kalita, Patrick; Mani, Prita; Martin, Ryan; Moxon, Sierra Taylor; Paddock, Holly; Pich, Christian; Ramachandran, Sridhar; Ruzicka, Leyla; Schaper, Kevin; Shao, Xiang; Singer, Amy; Toro, Sabrina; Van Slyke, Ceri; Westerfield, Monte

2017-01-01

The Zebrafish Model Organism Database (ZFIN; http://zfin.org) is the central resource for zebrafish (Danio rerio) genetic, genomic, phenotypic and developmental data. ZFIN curators provide expert manual curation and integration of comprehensive data involving zebrafish genes, mutants, transgenic constructs and lines, phenotypes, genotypes, gene expressions, morpholinos, TALENs, CRISPRs, antibodies, anatomical structures, models of human disease and publications. We integrate curated, directly submitted, and collaboratively generated data, making these available to the zebrafish research community. Among the vertebrate model organisms, zebrafish are superbly suited for rapid generation of sequence-targeted mutant lines, characterization of phenotypes including gene expression patterns, and generation of human disease models. The recent rapid adoption of zebrafish as human disease models is making management of these data particularly important to both the research and clinical communities. Here, we describe recent enhancements to ZFIN including use of the zebrafish experimental conditions ontology, ‘Fish’ records in the ZFIN database, support for gene expression phenotypes, models of human disease, mutation details at the DNA, RNA and protein levels, and updates to the ZFIN single box search. PMID:27899582
Identification and comparative analysis of the microRNA transcriptome in roots of two contrasting tobacco genotypes in response to cadmium stress

NASA Astrophysics Data System (ADS)

He, Xiaoyan; Zheng, Weite; Cao, Fangbin; Wu, Feibo

2016-09-01

Tobacco (Nicotiana tabacum L.) takes up cadmium (Cd) more readily than other crops and preferentially accumulates it in leaves. MicroRNAs (miRNAs) play crucial roles in regulating expression of various stress response genes in plants. However, genome-wide expression of miRNAs and their target genes in response to Cd stress in tobacco is still unknown. Here, high-throughput miRNA sequencing was performed on two contrasting tobacco genotypes, the Cd-sensitive Guiyan 1 and the Cd-tolerant Yunyan 2. Comprehensive analysis of miRNA expression profiles in control and Cd-treated plants identified 72 known (27 families) and 14 novel differentially expressed miRNAs in the two genotypes. Among them, 28 known (14 families) and 5 novel miRNAs were considered Cd tolerance-associated miRNAs, which are mainly involved in cell growth, ion homeostasis, stress defense, antioxidant and hormone signaling. Finally, a hypothetical model of the Cd tolerance mechanism in Yunyan 2 is presented. Our findings suggest that some miRNAs and their target genes and pathways may play critical roles in Cd tolerance.
Identification of Immunity-Related Genes in Ostrinia furnacalis against Entomopathogenic Fungi by RNA-Seq Analysis

PubMed Central

Zhou, Fan; Wang, Guirong; An, Chunju

2014-01-01

Background The Asian corn borer (Ostrinia furnacalis (Guenée)) is one of the most serious corn pests in Asia. Control of this pest with the entomopathogenic fungus Beauveria bassiana has been proposed. However, the molecular mechanisms involved in the interactions between O. furnacalis and B. bassiana are unclear, especially given that genomic information for O. furnacalis is currently unavailable. We therefore sequenced and characterized the transcriptome of O. furnacalis larvae infected by B. bassiana, with special emphasis on immunity-related genes. Methodology/Principal Findings Illumina Hiseq2000 was used to sequence 4.64 and 4.72 Gb of the transcriptome from water-injected and B. bassiana-injected O. furnacalis larvae, respectively. De novo assembly generated 62,382 unigenes with a mean length of 729 nt. All unigenes were searched against the Nt, Nr, Swiss-Prot, COG, and KEGG databases for annotations using the BLASTN or BLASTX algorithm with an E-value cut-off of 10⁻⁵. A total of 35,700 (57.2%) unigenes were annotated to at least one database. Pairwise comparisons resulted in 13,890 differentially expressed genes, with 5,843 up-regulated and 8,047 down-regulated. Based on sequence similarity to homologs known to participate in immune responses, we identified a total of 190 potential immunity-related unigenes. They encode 45 pattern recognition proteins, 33 modulation proteins involved in the prophenoloxidase activation cascade, 46 signal transduction molecules, and 66 immune responsive effectors, respectively. The obtained transcriptome contains putative orthologs for nearly all components of the Toll, Imd, and JAK/STAT pathways. We randomly selected 24 immunity-related unigenes and investigated their expression profiles using a quantitative RT-PCR assay. The results revealed varied expression patterns in response to infection by B. bassiana. Conclusions/Significance This study provides a comprehensive sequence resource and expression profiles of the immunity-related genes of O. furnacalis. The data obtained give insight into the molecular mechanisms of innate immune processes in O. furnacalis larvae against B. bassiana. PMID:24466095
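The counts quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the abstract; the variable names and the shortened category labels are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract above; names are illustrative only.
total_unigenes = 62_382
annotated_unigenes = 35_700
up_regulated, down_regulated = 5_843, 8_047

# Immunity-related unigenes by functional category, as listed in the abstract.
immunity_categories = {
    "pattern recognition proteins": 45,
    "prophenoloxidase cascade modulators": 33,
    "signal transduction molecules": 46,
    "immune responsive effectors": 66,
}

# Share of unigenes annotated in at least one database, in percent.
annotated_percent = round(100 * annotated_unigenes / total_unigenes, 1)
# Differentially expressed genes are the sum of both directions.
total_degs = up_regulated + down_regulated
# The four categories should add up to the reported immunity-related total.
total_immunity = sum(immunity_categories.values())

print(annotated_percent, total_degs, total_immunity)
```

The three derived values agree with the 57.2%, 13,890 and 190 reported in the abstract.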
Transcriptome Analysis of the Emerald Ash Borer (EAB), Agrilus planipennis: De Novo Assembly, Functional Annotation and Comparative Analysis.

PubMed

Duan, Jun; Ladd, Tim; Doucet, Daniel; Cusson, Michel; vanFrankenhuyzen, Kees; Mittapalli, Omprakash; Krell, Peter J; Quan, Guoxing

2015-01-01

The Emerald ash borer (EAB), Agrilus planipennis, is an invasive phloem-feeding insect pest of ash trees. Since its initial discovery near the Detroit, US-Windsor, Canada area in 2002, the spread of EAB has had strong negative economic, social and environmental impacts in both countries. Several transcriptomes from specific tissues including midgut, fat body and antenna have recently been generated. However, the relatively low sequence depth, gene coverage and completeness limited the usefulness of these EAB databases. High-throughput deep RNA-Sequencing (RNA-Seq) was used to obtain 473.9 million pairs of 100 bp length paired-end reads from various life stages and tissues. These reads were assembled into 88,907 contigs using the Trinity strategy and integrated into 38,160 unigenes after redundant sequences were removed. We annotated 11,229 unigenes by searching against the public nr, Swiss-Prot and COG. The EAB transcriptome assembly was compared with 13 other sequenced insect species, resulting in the prediction of 536 unigenes that are Coleoptera-specific. Differential gene expression revealed that 290 unigenes are expressed during larval molting and 3,911 unigenes during metamorphosis from larvae to pupae, respectively (FDR < 0.01 and log2 FC > 2). In addition, 1,167 differentially expressed unigenes were identified from larval and adult midguts; 435 unigenes were up-regulated in larval midgut and 732 unigenes were up-regulated in adult midgut. Most of the genes involved in RNA interference (RNAi) pathways were identified, which implies the existence of a system RNAi in EAB. This study provides one of the most fundamental and comprehensive transcriptome resources available for EAB to date. Identification of the tissue-, stage- or species-specific unigenes will benefit the further study of gene functions during growth and metamorphosis processes in EAB and other pest insects.
Transcriptome Analysis of the Emerald Ash Borer (EAB), Agrilus planipennis: De Novo Assembly, Functional Annotation and Comparative Analysis

PubMed Central

Duan, Jun; Ladd, Tim; Doucet, Daniel; Cusson, Michel; vanFrankenhuyzen, Kees; Mittapalli, Omprakash; Krell, Peter J.; Quan, Guoxing

2015-01-01

Background The Emerald ash borer (EAB), Agrilus planipennis, is an invasive phloem-feeding insect pest of ash trees. Since its initial discovery near the Detroit, US-Windsor, Canada area in 2002, the spread of EAB has had strong negative economic, social and environmental impacts in both countries. Several transcriptomes from specific tissues including midgut, fat body and antenna have recently been generated. However, the relatively low sequence depth, gene coverage and completeness limited the usefulness of these EAB databases. Methodology and Principal Findings High-throughput deep RNA-Sequencing (RNA-Seq) was used to obtain 473.9 million pairs of 100 bp length paired-end reads from various life stages and tissues. These reads were assembled into 88,907 contigs using the Trinity strategy and integrated into 38,160 unigenes after redundant sequences were removed. We annotated 11,229 unigenes by searching against the public nr, Swiss-Prot and COG. The EAB transcriptome assembly was compared with 13 other sequenced insect species, resulting in the prediction of 536 unigenes that are Coleoptera-specific. Differential gene expression revealed that 290 unigenes are expressed during larval molting and 3,911 unigenes during metamorphosis from larvae to pupae, respectively (FDR < 0.01 and log2 FC > 2). In addition, 1,167 differentially expressed unigenes were identified from larval and adult midguts; 435 unigenes were up-regulated in larval midgut and 732 unigenes were up-regulated in adult midgut. Most of the genes involved in RNA interference (RNAi) pathways were identified, which implies the existence of a system RNAi in EAB. Conclusions and Significance This study provides one of the most fundamental and comprehensive transcriptome resources available for EAB to date. Identification of the tissue-, stage- or species-specific unigenes will benefit the further study of gene functions during growth and metamorphosis processes in EAB and other pest insects. PMID:26244979
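The differential-expression thresholds quoted above (FDR < 0.01 and log2 FC > 2) amount to a simple two-condition filter. The following is a minimal sketch, not the authors' pipeline: the `Unigene` record, the `select_degs` helper and the example values are all hypothetical, and applying the fold-change cutoff to the absolute value (so that down-regulated unigenes are also kept) is an assumption.

```python
from typing import NamedTuple


class Unigene(NamedTuple):
    """Hypothetical record holding one unigene's test statistics."""
    name: str
    log2_fc: float  # log2 fold change between two conditions
    fdr: float      # false discovery rate (adjusted p-value)


def select_degs(records, fdr_cutoff=0.01, log2_fc_cutoff=2.0):
    """Keep unigenes with FDR below the cutoff and |log2 FC| above the cutoff.

    Using the absolute fold change is an assumption made for this sketch.
    """
    return [
        r for r in records
        if r.fdr < fdr_cutoff and abs(r.log2_fc) > log2_fc_cutoff
    ]


# Made-up values for illustration only.
records = [
    Unigene("unigene_A", 3.1, 0.001),   # passes both thresholds (up)
    Unigene("unigene_B", -2.6, 0.004),  # passes both thresholds (down)
    Unigene("unigene_C", 1.4, 0.0005),  # fold change too small
    Unigene("unigene_D", 4.0, 0.2),     # FDR too high
]
degs = select_degs(records)
print([d.name for d in degs])
```

Splitting the selected records by the sign of `log2_fc` would give up- and down-regulated sets of the kind the abstract reports for larval and adult midgut.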
Structure and expression of GSL1 and GSL2 genes encoding gibberellin stimulated-like proteins in diploid and highly heterozygous tetraploid potato reveals their highly conserved and essential status.

PubMed

Meiyalaghan, Sathiyamoorthy; Thomson, Susan J; Fiers, Mark W E J; Barrell, Philippa J; Latimer, Julie M; Mohan, Sara; Jones, E Eirian; Conner, Anthony J; Jacobs, Jeanne M E

2014-01-02

GSL1 and GSL2, Gibberellin Stimulated-Like proteins (also known as Snakin-1 and Snakin-2), are cysteine-rich peptides from potato (Solanum tuberosum L.) with antimicrobial properties. Similar peptides in other species have been implicated in diverse biological processes and are hypothesised to play a role in several aspects of plant development, plant responses to biotic or abiotic stress through their participation in hormone crosstalk, and redox homeostasis. To help resolve the biological roles of GSL1 and GSL2 peptides we have undertaken an in-depth analysis of the structure and expression of these genes in potato. We have characterised the full length genes for both GSL1 (chromosome 4) and GSL2 (chromosome 1) from diploid and tetraploid potato using the reference genome sequence of potato, coupled with further next generation sequencing of four highly heterozygous tetraploid cultivars. The frequency of SNPs in GSL1 and GSL2 was very low, with only one SNP every 67 and 53 nucleotides in exon regions of GSL1 and GSL2, respectively. Analysis of comprehensive RNA-seq data substantiated the role of specific promoter motifs in transcriptional control of gene expression. Expression analysis based on the frequency of next generation sequence reads established that GSL2 was expressed at a higher level than GSL1 in 30 out of 32 tissue and treatment libraries. Furthermore, both the GSL1 and GSL2 genes exhibited constitutive expression that was not up-regulated in response to biotic or abiotic stresses, hormone treatments or wounding. Potato transformation with antisense knock-down expression cassettes failed to recover viable plants. The potato GSL1 and GSL2 genes are very highly conserved, suggesting they contribute to an important biological function. The known antimicrobial activity of the GSL proteins, coupled with the FPKM analysis from RNA-seq data, implies that both genes contribute to the constitutive defence barriers in potatoes. The lethality of antisense knock-down expression of GSL1 and GSL2, coupled with the rare incidence of SNPs in these genes, suggests an essential role for this gene family. These features are consistent with the GSL protein family playing a role in several aspects of plant development in addition to plant defence against biotic stresses.
Comprehensive analysis of the T-cell receptor beta chain gene in rhesus monkey by high throughput sequencing

PubMed Central

Li, Zhoufang; Liu, Guangjie; Tong, Yin; Zhang, Meng; Xu, Ying; Qin, Li; Wang, Zhanhui; Chen, Xiaoping; He, Jiankui

2015-01-01

Profiling immune repertoires by high throughput sequencing enhances our understanding of immune system complexity and immune-related diseases in humans. Previously, cloning and Sanger sequencing identified limited numbers of T cell receptor (TCR) nucleotide sequences in rhesus monkeys, thus their full immune repertoire is unknown. We applied multiplex PCR and Illumina high throughput sequencing to study the TCRβ of rhesus monkeys. We identified 1.26 million TCRβ sequences corresponding to 643,570 unique TCRβ sequences and 270,557 unique complementarity-determining region 3 (CDR3) gene sequences. Precise measurements of CDR3 length distribution, CDR3 amino acid distribution, length distribution of N nucleotide of junctional region, and TCRV and TCRJ gene usage preferences were performed. A comprehensive profile of rhesus monkey immune repertoire might aid human infectious disease studies using rhesus monkeys. PMID:25961410
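A CDR3 length distribution of the kind measured above can be computed by tallying the lengths of the unique sequences. This is a minimal sketch under stated assumptions: the helper `cdr3_length_distribution` is hypothetical, lengths are counted in amino acids, and the example sequences are made up rather than taken from the study.

```python
from collections import Counter


def cdr3_length_distribution(cdr3_sequences):
    """Return {length: fraction of unique CDR3 sequences with that length}."""
    unique = set(cdr3_sequences)          # collapse repeated clonotypes
    counts = Counter(len(seq) for seq in unique)
    total = len(unique)
    return {length: counts[length] / total for length in sorted(counts)}


# Made-up amino acid sequences for illustration only; one is repeated.
example = [
    "CASSLGQGYEQYF",
    "CASSPGTGGYEQYF",
    "CASSLGQGYEQYF",
    "CASSQDRGNTEAFF",
    "CASRTGELFF",
]
dist = cdr3_length_distribution(example)
print(dist)
```

Counting over unique sequences rather than raw reads is a design choice here; weighting by read count would instead describe clonal abundance.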
Integrated Advanced Microwave Sounding Unit-A (AMSU-A). Performance Verification Report: Initial Comprehensive Performance Test Report, P/N 1331200-2-IT, S/N 105/A2

NASA Technical Reports Server (NTRS)

Platt, R.

1999-01-01

This is the Performance Verification Report, Initial Comprehensive Performance Test Report, P/N 1331200-2-IT, S/N 105/A2, for the Integrated Advanced Microwave Sounding Unit-A (AMSU-A). The specification establishes the requirements for the Comprehensive Performance Test (CPT) and Limited Performance Test (LPT) of the Advanced Microwave Sounding Unit-A2 (AMSU-A2), referred to herein as the unit. The unit is defined on Drawing 1331200. 1.2 Test procedure sequence. The sequence in which the several phases of this test procedure shall take place is shown in Figure 1, but the sequence can be in any order.
gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes.

PubMed

Nakagawa, So; Takahashi, Mahoko Ueda

2016-01-01

In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp. © The Author(s) 2016. Published by Oxford University Press.
First venom gland transcriptomic analysis of Iranian yellow scorpion "Odonthubuthus doriae" with some new findings.

PubMed

NaderiSoorki, Maryam; Galehdari, Hamid; Baradaran, Masomeh; Jalali, Amir

2016-09-15

Scorpion venom contains a mixture of biological molecules, including selective toxins with medical potential. Odonthubuthus doriae (O. doriae) belongs to the Buthidae family of scorpions and has attracted growing interest among dangerous Iranian scorpions since 2005. We constructed the first cDNA library to explore the transcriptomic composition of the telson of this Iranian scorpion. Each expressed sequence tag (EST) from the library was then analyzed with bioinformatic software to establish its identity. The analysis showed that toxins (42%) accounted for more venom transcripts than other components such as antimicrobial peptides, venom peptides and cell proteins. Over 16% of transcripts did not have any open reading frame (ORF); however, their sequences showed similarity to other scorpion sequences. One EST had no similarity to known scorpion peptides. We report, for the first time, a comprehensive study of an Iranian scorpion with interesting and novel findings. We characterized a new putative sodium channel modifier in scorpions using bioinformatics software, and then predicted its structure and function. Copyright © 2016. Published by Elsevier Ltd.
gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes

PubMed Central

Nakagawa, So; Takahashi, Mahoko Ueda

2016-01-01

In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp PMID:27242033
ATGC transcriptomics: a web-based application to integrate, explore and analyze de novo transcriptomic data.

PubMed

Gonzalez, Sergio; Clavijo, Bernardo; Rivarola, Máximo; Moreno, Patricio; Fernandez, Paula; Dopazo, Joaquín; Paniego, Norma

2017-02-22

In recent years, applications based on massively parallelized RNA sequencing (RNA-seq) have become valuable approaches for studying non-model species, e.g., those without a fully sequenced genome. RNA-seq is a useful tool for detecting novel transcripts and genetic variations and for evaluating differential gene expression by digital measurements. The large and complex datasets resulting from functional genomic experiments represent a challenge in data processing, management, and analysis. This problem is especially significant for small research groups working with non-model species. We developed a web-based application, called ATGC transcriptomics, with a flexible and adaptable interface that allows users to work with new generation sequencing (NGS) transcriptomic analysis results using an ontology-driven database. This new application simplifies data exploration, visualization, and integration for a better comprehension of the results. ATGC transcriptomics gives non-expert computer users and small research groups access to a scalable storage option and simple data integration, including database administration and management. The software is freely available under the terms of the GNU public license at http://atgcinta.sourceforge.net.
snoSeeker: an advanced computational package for screening of guide and orphan snoRNA genes in the human genome.

PubMed

Yang, Jian-Hua; Zhang, Xiao-Chen; Huang, Zhan-Peng; Zhou, Hui; Huang, Mian-Bo; Zhang, Shu; Chen, Yue-Qin; Qu, Liang-Hu

2006-01-01

Small nucleolar RNAs (snoRNAs) represent an abundant group of non-coding RNAs in eukaryotes. They can be divided into guide and orphan snoRNAs according to the presence or absence of antisense sequence to rRNAs or snRNAs. Current snoRNA-searching programs, which are essentially based on sequence complementarity to rRNAs or snRNAs, exist only for the screening of guide snoRNAs. In this study, we have developed an advanced computational package, snoSeeker, which includes CDseeker and ACAseeker programs, for the highly efficient and specific screening of both guide and orphan snoRNA genes in mammalian genomes. By using these programs, we have systematically scanned four human-mammal whole-genome alignment (WGA) sequences and identified 54 novel candidates including 26 orphan candidates as well as 266 known snoRNA genes. Eighteen novel snoRNAs were further experimentally confirmed, with four snoRNAs exhibiting a tissue-specific or restricted expression pattern. The results of this study provide the most comprehensive listing of two families of snoRNA genes in the human genome to date.

De novo transcriptome assembly of a fern, Lygodium japonicum, and a web resource database, Ljtrans DB.

PubMed

Aya, Koichiro; Kobayashi, Masaaki; Tanaka, Junmu; Ohyanagi, Hajime; Suzuki, Takayuki; Yano, Kenji; Takano, Tomoyuki; Yano, Kentaro; Matsuoka, Makoto

2015-01-01

During plant evolution, ferns originally evolved as a major vascular plant with a distinctive life cycle in which the haploid and diploid generations are completely separated. However, the low level of genetic resources has limited studies of their physiological events, as well as hindering research on the evolutionary history of land plants. In this study, to identify a comprehensive catalog of transcripts and characterize their expression traits in the fern Lygodium japonicum, nine different RNA samples isolated from prothalli, trophophylls, rhizomes and sporophylls were sequenced using Roche 454 GS-FLX and Illumina HiSeq sequencers. The hybrid assembly of the high-quality 454 GS-FLX and Illumina HiSeq reads generated a set of 37,830 isoforms with an average length of 1,444 bp. Using four open reading frame (ORF) predictors, 38,142 representative ORFs were identified from a total of 37,830 transcript isoforms and 95 contigs, which were annotated by searching against several public databases. Furthermore, an orthoMCL analysis using the protein sequences of L. japonicum and five model plants revealed various sets of lineage-specific genes, including those detected among land plant lineages and those detected in only L. japonicum. We have also examined the expression patterns of all contigs/isoforms, along with the life cycle of L. japonicum, and identified the tissue-specific transcripts using statistical expression analyses. Finally, we developed a public web resource, the L. japonicum transcriptome database at http://bioinf.mind.meiji.ac.jp/kanikusa/, which provides important opportunities to accelerate molecular research in ferns. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
A prior-based integrative framework for functional transcriptional regulatory network inference

PubMed Central

Siahpirani, Alireza F.

2017-01-01

Transcriptional regulatory networks specify regulatory proteins controlling the context-specific expression levels of genes. Inference of genome-wide regulatory networks is central to understanding gene regulation, but remains an open challenge. Expression-based network inference is among the most popular methods to infer regulatory networks; however, networks inferred from such methods have low overlap with experimentally derived (e.g. ChIP-chip and transcription factor (TF) knockouts) networks. Currently we have a limited understanding of this discrepancy. To address this gap, we first develop a regulatory network inference algorithm, based on probabilistic graphical models, to integrate expression with auxiliary datasets supporting a regulatory edge. Second, we comprehensively analyze our and other state-of-the-art methods on different expression perturbation datasets. Networks inferred by integrating sequence-specific motifs with expression have substantially greater agreement with experimentally derived networks, while remaining more predictive of expression than motif-based networks. Our analysis suggests natural genetic variation as the most informative perturbation for network inference, and identifies core TFs whose targets are predictable from expression. Multiple reasons make the identification of targets of other TFs difficult, including network architecture and insufficient variation of TF mRNA level. Finally, we demonstrate the utility of our inference algorithm to infer stress-specific regulatory networks and for regulator prioritization. PMID:27794550
Survey of 800+ data sets from human tissue and body fluid reveals xenomiRs are likely artifacts

PubMed Central

Kang, Wenjing; Bang-Berthelsen, Claus Heiner; Holm, Anja; Houben, Anna J.S.; Müller, Anne Holt; Thymann, Thomas; Pociot, Flemming; Estivill, Xavier; Friedländer, Marc R.

2017-01-01

miRNAs are small 22-nucleotide RNAs that can post-transcriptionally regulate gene expression. It has been proposed that dietary plant miRNAs can enter the human bloodstream and regulate host transcripts; however, these findings have been widely disputed. We here conduct the first comprehensive meta-study in the field, surveying the presence and abundances of cross-species miRNAs (xenomiRs) in 824 sequencing data sets from various human tissues and body fluids. We find that xenomiRs are commonly present in tissues (17%) and body fluids (69%); however, the abundances are low, comprising 0.001% of host human miRNA counts. Further, we do not detect a significant enrichment of xenomiRs in sequencing data originating from tissues and body fluids that are exposed to dietary intake (such as liver). Likewise, there is no significant depletion of xenomiRs in tissues and body fluids that are relatively separated from the main bloodstream (such as brain and cerebro-spinal fluids). Interestingly, the majority (81%) of body fluid xenomiRs stem from rodents, which are a rare human dietary contribution but common laboratory animals. Body fluid samples from the same studies tend to group together when clustered by xenomiR compositions, suggesting technical batch effects. Last, we performed carefully designed and controlled animal feeding studies, in which we detected no transfer of plant miRNAs into rat blood, or bovine milk sequences into piglet blood. In summary, our comprehensive computational and experimental results indicate that xenomiRs originate from technical artifacts rather than dietary intake. PMID:28062594
Generation of a foveomacular transcriptome

PubMed Central

Bernstein, Steven; Wong, Paul W.

2014-01-01

Purpose Organizing molecular biologic data is a growing challenge since the rate of data accumulation is steadily increasing. Information relevant to a particular biologic query can be difficult to extract from the comprehensive databases currently available. We present a data collection and organization model designed to ameliorate these problems and apply it to generate an expressed sequence tag (EST)–based foveomacular transcriptome. Methods Using Perl, MySQL, EST libraries, screening, and human foveomacular gene expression as a model system, we generated a foveomacular transcriptome database enriched for molecularly relevant data. Results Using foveomacula as a gene expression model tissue, we identified and organized 6,056 genes expressed in that tissue. Of those identified genes, 3,480 had not been previously described as expressed in the foveomacula. Internal experimental controls as well as comparison of our data set to published data sets suggest we do not yet have a complete description of the foveomacula transcriptome. Conclusions We present an organizational method designed to amplify the utility of data pertinent to a specific research interest. Our method is generic enough to be applicable to a variety of conditions yet focused enough to allow for specialized study. PMID:24991187
Interactions between genetic variation and cellular environment in skeletal muscle gene expression.

PubMed

Taylor, D Leland; Knowles, David A; Scott, Laura J; Ramirez, Andrea H; Casale, Francesco Paolo; Wolford, Brooke N; Guan, Li; Varshney, Arushi; Albanus, Ricardo D'Oliveira; Parker, Stephen C J; Narisu, Narisu; Chines, Peter S; Erdos, Michael R; Welch, Ryan P; Kinnunen, Leena; Saramies, Jouko; Sundvall, Jouko; Lakka, Timo A; Laakso, Markku; Tuomilehto, Jaakko; Koistinen, Heikki A; Stegle, Oliver; Boehnke, Michael; Birney, Ewan; Collins, Francis S

2018-01-01

From whole organisms to individual cells, responses to environmental conditions are influenced by genetic makeup, where the effect of genetic variation on a trait depends on the environmental context. RNA-sequencing quantifies gene expression as a molecular trait, and is capable of capturing both genetic and environmental effects. In this study, we explore opportunities of using allele-specific expression (ASE) to discover cis-acting genotype-environment interactions (GxE): genetic effects on gene expression that depend on an environmental condition. Treating 17 common, clinical traits as approximations of the cellular environment of 267 skeletal muscle biopsies, we identify 10 candidate environmental response expression quantitative trait loci (reQTLs) across 6 traits (12 unique gene-environment trait pairs; 10% FDR per trait) including sex, systolic blood pressure, and low-density lipoprotein cholesterol. Although using ASE is in principle a promising approach to detect GxE effects, replication of such signals can be challenging as validation requires harmonization of environmental traits across cohorts and a sufficient sampling of heterozygotes for a transcribed SNP. Comprehensive discovery and replication will require large human transcriptome datasets, or the integration of multiple transcribed SNPs, coupled with standardized clinical phenotyping.
Time-series RNA-seq analysis package (TRAP) and its application to the analysis of rice, Oryza sativa L. ssp. Japonica, upon drought stress.

PubMed

Jo, Kyuri; Kwon, Hawk-Bin; Kim, Sun

2014-06-01

Measuring expression levels of genes at the whole genome level can be useful for many purposes, especially for revealing biological pathways underlying specific phenotype conditions. When gene expression is measured over a time period, we have opportunities to understand how organisms react to stress conditions over time. Thus many biologists routinely measure whole genome level gene expressions at multiple time points. However, there are several technical difficulties for analyzing such whole genome expression data. In addition, these days gene expression data is often measured by using RNA-sequencing rather than microarray technologies and then analysis of expression data is much more complicated since the analysis process should start with mapping short reads and produce differentially activated pathways and also possibly interactions among pathways. In addition, many useful tools for analyzing microarray gene expression data are not applicable for the RNA-seq data. Thus a comprehensive package for analyzing time series transcriptome data is much needed. In this article, we present a comprehensive package, Time-series RNA-seq Analysis Package (TRAP), integrating all necessary tasks such as mapping short reads, measuring gene expression levels, finding differentially expressed genes (DEGs), clustering and pathway analysis for time-series data in a single environment. In addition to implementing useful algorithms that are not available for RNA-seq data, we extended existing pathway analysis methods, ORA and SPIA, for time series analysis and estimates statistical values for combined dataset by an advanced metric. TRAP also produces visual summary of pathway interactions. Gene expression change labeling, a practical clustering method used in TRAP, enables more accurate interpretation of the data when combined with pathway analysis. We applied our methods on a real dataset for the analysis of rice (Oryza sativa L. Japonica nipponbare) upon drought stress. 
The result showed that TRAP was able to detect pathways more accurately than several existing methods. TRAP is available at http://biohealth.snu.ac.kr/software/TRAP/. Copyright © 2014 Elsevier Inc. All rights reserved.
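The ORA step named in the abstract above (over-representation analysis) is commonly computed as an upper-tail hypergeometric test on the overlap between a DEG list and a pathway's gene set. The following is a minimal sketch of that generic test for a single time point, not TRAP's time-series extension or its combined metric, which the abstract does not specify. The function name `ora_p_value`, its parameters, and the example numbers are illustrative assumptions.

```python
from math import comb


def ora_p_value(n_universe: int, n_pathway: int, n_deg: int, n_overlap: int) -> float:
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_universe: number of annotated genes tested (hypothetical background)
    n_pathway:  number of those genes annotated to the pathway
    n_deg:      number of differentially expressed genes (DEGs)
    n_overlap:  number of DEGs that belong to the pathway
    """
    max_overlap = min(n_pathway, n_deg)
    total = comb(n_universe, n_deg)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_deg - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Illustrative numbers only; they are not taken from the rice drought dataset.
p = ora_p_value(n_universe=20000, n_pathway=100, n_deg=500, n_overlap=10)
print(f"ORA p-value: {p:.3g}")
```

In a time-series setting one such p-value would be obtained per time point and per pathway; how those values are combined is left open here because the abstract only refers to "an advanced metric".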
Identification and profiling of novel microRNAs in the Brassica rapa genome based on small RNA deep sequencing

PubMed Central

2012-01-01

Background MicroRNAs (miRNAs) are one of the functional non-coding small RNAs involved in the epigenetic control of the plant genome. Although plants contain both evolutionary conserved miRNAs and species-specific miRNAs within their genomes, computational methods often only identify evolutionary conserved miRNAs. The recent sequencing of the Brassica rapa genome enables us to identify miRNAs and their putative target genes. In this study, we sought to provide a more comprehensive prediction of B. rapa miRNAs based on high throughput small RNA deep sequencing. Results We sequenced small RNAs from five types of tissue: seedlings, roots, petioles, leaves, and flowers. By analyzing 2.75 million unique reads that mapped to the B. rapa genome, we identified 216 novel and 196 conserved miRNAs that were predicted to target approximately 20% of the genome’s protein coding genes. Quantitative analysis of miRNAs from the five types of tissue revealed that novel miRNAs were expressed in diverse tissues but their expression levels were lower than those of the conserved miRNAs. Comparative analysis of the miRNAs between the B. rapa and Arabidopsis thaliana genomes demonstrated that redundant copies of conserved miRNAs in the B. rapa genome may have been deleted after whole genome triplication. Novel miRNA members seemed to have spontaneously arisen from the B. rapa and A. thaliana genomes, suggesting the species-specific expansion of miRNAs. We have made this data publicly available in a miRNA database of B. rapa called BraMRs. The database allows the user to retrieve miRNA sequences, their expression profiles, and a description of their target genes from the five tissue types investigated here. Conclusions This is the first report to identify novel miRNAs from Brassica crops using genome-wide high throughput techniques. The combination of computational methods and small RNA deep sequencing provides robust predictions of miRNAs in the genome. 
The finding of numerous novel miRNAs, many with few target genes and low expression levels, suggests the rapid evolution of miRNA genes. The development of a miRNA database, BraMRs, enables us to integrate miRNA identification, target prediction, and functional annotation of target genes. BraMRs will represent a valuable public resource with which to study the epigenetic control of B. rapa and other closely related Brassica species. The database is available at the following link: http://bramrs.rna.kr [1]. PMID:23163954
Transcriptome Wide Identification and Validation of Calcium Sensor Gene Family in the Developing Spikes of Finger Millet Genotypes for Elucidating Its Role in Grain Calcium Accumulation

PubMed Central

Singh, Uma M.; Chandra, Muktesh; Shankhdhar, Shailesh C.; Kumar, Anil

2014-01-01

Background In finger millet, calcium is one of the important and abundant mineral elements. The molecular mechanisms involved in calcium accumulation in plants remain poorly understood. Transcriptome sequencing of genetically diverse genotypes of finger millet differing in grain calcium content will help in understanding the trait. Principal Finding In this study, transcriptome sequencing of spike tissues of two finger millet genotypes differing in grain calcium content was performed for the first time. Out of 109,218 contigs, 78 contigs in the case of GP-1 (low Ca genotype), and out of 120,130 contigs, 76 contigs in the case of GP-45 (high Ca genotype), were identified as calcium sensor genes. Through in silico analysis, all 82 unique calcium sensor genes were classified into eight calcium sensor gene families, viz., CaM & CaMLs, CBLs, CIPKs, CRKs, PEPRKs, CDPKs, CaMKs and CCaMK. Out of 82 genes, 12 were found to be divergent from the rice orthologs. The differential expression analysis on the basis of FPKM value resulted in 24 genes highly expressed in GP-45 and 11 genes highly expressed in GP-1. Ten of the 35 differentially expressed genes could be assigned to three documented pathways involved mainly in stress responses. Furthermore, validation of selected calcium sensor responder genes was also performed by qPCR in developing spikes of both genotypes grown on different concentrations of exogenous calcium. Conclusion Through de novo transcriptome data assembly and analysis, we reported the comprehensive identification and functional characterization of the calcium sensor gene family. The calcium sensor gene family identified and characterized in this study will facilitate understanding of the molecular basis of calcium accumulation and the development of calcium biofortified crops. 
Moreover, this study also supported that identification and characterization of gene family through Illumina paired-end sequencing is a potential tool for generating the genomic information of gene family in non-model species. PMID:25157851
Comparative Transcriptome Analysis of Genes Involved in Anthocyanin Biosynthesis in the Red and Yellow Fruits of Sweet Cherry (Prunus avium L.)

PubMed Central

Wei, Hairong; Chen, Xin; Zong, Xiaojuan; Shu, Huairui; Gao, Dongsheng; Liu, Qingzhong

2015-01-01

Background Fruit color is one of the most important economic traits of the sweet cherry (Prunus avium L.). The red coloration of sweet cherry fruit is mainly attributed to anthocyanins. However, limited information is available regarding the molecular mechanisms underlying anthocyanin biosynthesis and its regulation in sweet cherry. Methodology/Principal Findings In this study, a reference transcriptome of P. avium L. was sequenced and annotated to identify the transcriptional determinants of fruit color. Normalized cDNA libraries from red and yellow fruits were sequenced using the next-generation Illumina/Solexa sequencing platform and de novo assembly. Over 66 million high-quality reads were assembled into 43,128 unigenes using a combined assembly strategy. Then a total of 22,452 unigenes were compared to public databases using homology searches, and 20,095 of these unigenes were annotated in the Nr protein database. Furthermore, transcriptome differences between the four stages of fruit ripening were analyzed using Illumina digital gene expression (DGE) profiling. Biological pathway analysis revealed that 72 unigenes were involved in anthocyanin biosynthesis. The expression patterns of unigenes encoding phenylalanine ammonia-lyase (PAL), 4-coumarate-CoA ligase (4CL), chalcone synthase (CHS), chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H), flavanone 3’-hydroxylase (F3’H), dihydroflavonol 4-reductase (DFR), anthocyanidin synthase (ANS) and UDP glucose: flavonol 3-O-glucosyltransferase (UFGT) during fruit ripening differed between red and yellow fruit. In addition, we identified some transcription factor families (such as MYB, bHLH and WD40) that may control anthocyanin biosynthesis. We confirmed the altered expression levels of eighteen unigenes that encode anthocyanin biosynthetic enzymes and transcription factors using quantitative real-time PCR (qRT-PCR). 
Conclusions/Significance The obtained sweet cherry transcriptome and DGE profiling data provide comprehensive gene expression information that lends insights into the molecular mechanisms underlying anthocyanin biosynthesis. These results will provide a platform for further functional genomic research on this fruit crop. PMID:25799516
Population genomic scan for candidate signatures of balancing selection to guide antigen characterization in malaria parasites.

PubMed

Amambua-Ngwa, Alfred; Tetteh, Kevin K A; Manske, Magnus; Gomez-Escobar, Natalia; Stewart, Lindsay B; Deerhake, M Elizabeth; Cheeseman, Ian H; Newbold, Christopher I; Holder, Anthony A; Knuepfer, Ellen; Janha, Omar; Jallow, Muminatou; Campino, Susana; Macinnis, Bronwyn; Kwiatkowski, Dominic P; Conway, David J

2012-01-01

Acquired immunity in vertebrates maintains polymorphisms in endemic pathogens, leading to identifiable signatures of balancing selection. To comprehensively survey for genes under such selection in the human malaria parasite Plasmodium falciparum, we generated paired-end short-read sequences of parasites in clinical isolates from an endemic Gambian population, which were mapped to the 3D7 strain reference genome to yield high-quality genome-wide coding sequence data for 65 isolates. A minority of genes did not map reliably, including the hypervariable var, rifin, and stevor families, but 5,056 genes (90.9% of all in the genome) had >70% sequence coverage with minimum read depth of 5 for at least 50 isolates, of which 2,853 genes contained 3 or more single nucleotide polymorphisms (SNPs) for analysis of polymorphic site frequency spectra. Against an overall background of negatively skewed frequencies, as expected from historical population expansion combined with purifying selection, the outlying minority of genes with signatures indicating exceptionally intermediate frequencies were identified. Comparing genes with different stage-specificity, such signatures were most common in those with peak expression at the merozoite stage that invades erythrocytes. Members of clag, PfMC-2TM, surfin, and msp3-like gene families were highly represented, the strongest signature being in the msp3-like gene PF10_0355. Analysis of msp3-like transcripts in 45 clinical and 11 laboratory adapted isolates grown to merozoite-containing schizont stages revealed surprisingly low expression of PF10_0355. In diverse clonal parasite lines the protein product was expressed in a minority of mature schizonts (<1% in most lines and ∼10% in clone HB3), and eight sub-clones of HB3 cultured separately had an intermediate spectrum of positive frequencies (0.9 to 7.5%), indicating phase variable expression of this polymorphic antigen. 
This and other identified targets of balancing selection are now prioritized for functional study.
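The abstract above describes scanning per-gene site frequency spectra for an excess of intermediate-frequency alleles against a negatively skewed background. One widely used summary statistic for such skew is Tajima's D (negative when rare alleles dominate, positive when intermediate-frequency alleles dominate). The abstract does not say which statistic the authors used, so the sketch below is an assumption for illustration only; `tajimas_d` and the example allele counts are hypothetical.

```python
from math import sqrt


def tajimas_d(minor_allele_counts, n):
    """Tajima's D for one gene.

    minor_allele_counts: one count per biallelic SNP (copies of the minor allele)
    n: number of sampled sequences (e.g. isolates); needs n >= 4
    """
    s = len(minor_allele_counts)  # number of segregating sites
    if s == 0 or n < 4:
        raise ValueError("need at least one SNP and at least four sequences")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Mean pairwise differences, summed over sites.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in minor_allele_counts)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical genes with three SNPs each in a sample of 65 isolates.
rare_skewed = tajimas_d([1, 1, 2], n=65)       # mostly rare alleles
intermediate = tajimas_d([30, 32, 31], n=65)   # intermediate-frequency alleles
print(rare_skewed, intermediate)
```

Under this statistic, candidate targets of balancing selection would be the genes in the positive tail of the genome-wide distribution rather than genes above a fixed cutoff.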
Transcriptome wide identification and validation of calcium sensor gene family in the developing spikes of finger millet genotypes for elucidating its role in grain calcium accumulation.

PubMed

Singh, Uma M; Chandra, Muktesh; Shankhdhar, Shailesh C; Kumar, Anil

2014-01-01

In finger millet, calcium is one of the important and abundant mineral elements. The molecular mechanisms involved in calcium accumulation in plants remain poorly understood. Transcriptome sequencing of genetically diverse genotypes of finger millet differing in grain calcium content will help in understanding the trait. In this study, transcriptome sequencing of spike tissues of two finger millet genotypes differing in grain calcium content was performed for the first time. Out of 109,218 contigs, 78 contigs in the case of GP-1 (low Ca genotype), and out of 120,130 contigs, 76 contigs in the case of GP-45 (high Ca genotype), were identified as calcium sensor genes. Through in silico analysis, all 82 unique calcium sensor genes were classified into eight calcium sensor gene families, viz., CaM & CaMLs, CBLs, CIPKs, CRKs, PEPRKs, CDPKs, CaMKs and CCaMK. Out of 82 genes, 12 were found to be divergent from the rice orthologs. The differential expression analysis on the basis of FPKM value resulted in 24 genes highly expressed in GP-45 and 11 genes highly expressed in GP-1. Ten of the 35 differentially expressed genes could be assigned to three documented pathways involved mainly in stress responses. Furthermore, validation of selected calcium sensor responder genes was also performed by qPCR in developing spikes of both genotypes grown on different concentrations of exogenous calcium. Through de novo transcriptome data assembly and analysis, we reported the comprehensive identification and functional characterization of the calcium sensor gene family. The calcium sensor gene family identified and characterized in this study will facilitate understanding of the molecular basis of calcium accumulation and the development of calcium biofortified crops. Moreover, this study also supported that identification and characterization of a gene family through Illumina paired-end sequencing is a potential tool for generating genomic information on gene families in non-model species.
RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application

PubMed Central

2015-01-01

Background The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulative pathways makes RNA-Seq one of the most complex fields of NGS application, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). Moreover, the huge volume of data generated by NGS platforms introduces unprecedented computational and technological challenges to efficiently analyze and store sequence data and results. Methods In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), and detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and to call statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Results Through a user-friendly web interface, the RAP workflow can be suitably customized by the user and is automatically executed on our cloud computing environment. 
This strategy allows access to bioinformatics tools and computational resources without specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user's needs. PMID:26046471
The antenna transcriptome changes in mosquito Anopheles sinensis, pre- and post- blood meal.

PubMed

Chen, Qian; Pei, Di; Li, Jianyong; Jing, Chengyu; Wu, Wenjian; Man, Yahui

2017-01-01

The antenna is the main chemosensory organ in mosquitoes. Characterization of the transcriptional changes after a blood meal, especially those related to chemoreception, may help to explain mosquito blood-sucking behavior and to identify novel targets for mosquito control. Anopheles sinensis is an Asiatic mosquito species which transmits malaria and lymphatic filariasis. However, studies on chemosensory biology in female An. sinensis are lacking. Here we report a transcriptome analysis of An. sinensis female antennae pre- and post-blood meal. We created six An. sinensis antenna RNA-seq libraries, three from females without a blood meal and three from females five hours after a blood meal. Illumina sequencing was conducted to analyze the transcriptome differences between the two groups. In total, the sequenced fragments yielded 21,643 genes, of which 1,828 were novel. Of these genes, 12,861 were considered to be expressed (FPKM >1.0) in at least one of the two groups, with 12,159 genes expressed in both groups. In the blood-fed group, 548 genes were differentially expressed, with 331 genes up-regulated and 217 genes down-regulated. GO enrichment analysis of the differentially expressed genes suggested that there were no statistically overrepresented GO terms among down-regulated genes in blood-fed mosquitoes, while the enriched GO terms of the up-regulated genes occurred mainly in metabolic process. For the chemosensory gene families, a subtle distinction in the expression levels can be observed according to our statistical analysis. However, the first comprehensive identification of these chemosensory gene families in An. sinensis antennae will help to characterize the precise function of these proteins in odor recognition in mosquitoes. This study provides a first global view of the changes in transcript accumulation elicited by a blood meal in An. sinensis female antennae.
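The expression filter quoted above (FPKM >1.0 in at least one of the two groups) can be illustrated with the standard FPKM definition: fragments per kilobase of transcript per million mapped fragments. This is a minimal sketch under assumed inputs. The gene names, FPKM values, and helper names are hypothetical, and the abstract does not state how per-group values were aggregated across the three replicate libraries.

```python
def fpkm(fragments: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def expressed_in_any_group(group_fpkm: dict, threshold: float = 1.0) -> bool:
    """True if the gene exceeds the FPKM threshold in at least one group."""
    return any(value > threshold for value in group_fpkm.values())


# Hypothetical per-group FPKM values for three made-up genes.
genes = {
    "gene_a": {"unfed": 0.4, "blood_fed": 2.5},
    "gene_b": {"unfed": 0.2, "blood_fed": 0.9},
    "gene_c": {"unfed": 12.0, "blood_fed": 11.1},
}
expressed = sorted(name for name, values in genes.items() if expressed_in_any_group(values))
print(expressed)
```

A gene such as the hypothetical `gene_a` passes the filter although it is below the threshold in one group, which is how the count for "at least one group" (12,861) can exceed the count for "both groups" (12,159).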
Comprehensive Exploration of Novel Chimeric Transcripts in Clear Cell Renal Cell Carcinomas Using Whole Transcriptome Analysis

PubMed Central

Gotoh, Masahiro; Ichikawa, Hitoshi; Arai, Eri; Chiku, Suenori; Sakamoto, Hiromi; Fujimoto, Hiroyuki; Hiramoto, Masaki; Nammo, Takao; Yasuda, Kazuki; Yoshida, Teruhiko; Kanai, Yae

2014-01-01

The aim of this study was to clarify the participation of expression of chimeric transcripts in renal carcinogenesis. Whole transcriptome analysis (RNA sequencing) and exploration of candidate chimeric transcripts using the deFuse program were performed on 68 specimens of cancerous tissue (T) and 11 specimens of non-cancerous renal cortex tissue (N) obtained from 68 patients with clear cell renal cell carcinomas (RCCs) in an initial cohort. As positive controls, two RCCs associated with Xp11.2 translocation were analyzed. After verification by reverse transcription (RT)-PCR and Sanger sequencing, 26 novel chimeric transcripts were identified in 17 (25%) of the 68 clear cell RCCs. Genomic breakpoints were determined in five of the chimeric transcripts. Quantitative RT-PCR analysis revealed that the mRNA expression levels for the MMACHC, PTER, EPC2, ATXN7, FHIT, KIFAP3, CPEB1, MINPP1, TEX264, FAM107A, UPF3A, CDC16, MCCC1, CPSF3, and ASAP2 genes, being partner genes involved in the chimeric transcripts in the initial cohort, were significantly reduced in 26 T samples relative to the corresponding 26 N samples in the second cohort. Moreover, the mRNA expression levels for the above partner genes in T samples were significantly correlated with tumor aggressiveness and poorer patient outcome, indicating that reduced expression of these genes may participate in malignant progression of RCCs. As is the case when their levels of expression are reduced, these partner genes also may not fully function when involved in chimeric transcripts. These data suggest that generation of chimeric transcripts may participate in renal carcinogenesis by inducing dysfunction of tumor-related genes. PMID:25230976
Evolution of Enzyme Superfamilies: Comprehensive Exploration of Sequence-Function Relationships.

PubMed

Baier, F; Copp, J N; Tokuriki, N

2016-11-22

The sequence and functional diversity of enzyme superfamilies have expanded through billions of years of evolution from a common ancestor. Understanding how protein sequence and functional "space" have expanded, at both the evolutionary and molecular level, is central to biochemistry, molecular biology, and evolutionary biology. Integrative approaches that examine protein sequence, structure, and function have begun to provide comprehensive views of the functional diversity and evolutionary relationships within enzyme superfamilies. In this review, we outline the recent advances in our understanding of enzyme evolution and superfamily functional diversity. We describe the tools that have been used to comprehensively analyze sequence relationships and to characterize sequence and function relationships. We also highlight recent large-scale experimental approaches that systematically determine the activity profiles across enzyme superfamilies. We identify several intriguing insights from this recent body of work. First, promiscuous activities are prevalent among extant enzymes. Second, many divergent proteins retain "function connectivity" via enzyme promiscuity, which can be used to probe the evolutionary potential and history of enzyme superfamilies. Finally, we discuss open questions regarding the intricacies of enzyme divergence, as well as potential research directions that will deepen our understanding of enzyme superfamily evolution.
Immunohistochemical ATRX expression is not a surrogate for 1p19q codeletion.

PubMed

Yamamichi, Akane; Ohka, Fumiharu; Aoki, Kosuke; Suzuki, Hiromichi; Kato, Akira; Hirano, Masaki; Motomura, Kazuya; Tanahashi, Kuniaki; Chalise, Lushun; Maeda, Sachi; Wakabayashi, Toshihiko; Kato, Yukinari; Natsume, Atsushi

2018-04-01

The IDH-mutant and 1p/19q co-deletion (1p19q codel) provides significant diagnostic and prognostic value in lower-grade gliomas. As ATRX mutation and 1p19q codel are mutually exclusive, ATRX immunohistochemistry (IHC) may substitute for 1p19q codel, but this has not been comprehensively examined. In the current study, we performed ATRX-IHC in 78 gliomas whose ATRX statuses were comprehensively determined by whole exome sequencing. Among the 60 IHC-positive and 18 IHC-negative cases, 86.7 and 77.8% were ATRX-wildtype and ATRX-mutant, respectively. ATRX mutational patterns were not consistent with ATRX-IHC. If our cohort had only used IDH status and IHC-based ATRX expression for diagnosis, 78 tumors would have been subtyped as 48 oligodendroglial tumors, 16 IDH-mutant astrocytic tumors, and 14 IDH-wildtype astrocytic tumors. However, when the 1p19q codel test was performed following ATRX-IHC, 8 of 48 ATRX-IHC-positive tumors were classified as "1p19q non-codel" and 3 of 16 ATRX-IHC-negative tumors were classified as "1p19q codel"; a total of 11 tumors (14%) were incorrectly classified. In summary, we observed dissociation between ATRX-IHC and actual 1p19q codel in 11 of 64 IDH-mutant LGGs. In describing the complex IHC expression of ATRX somatic mutations, our results indicate the need for caution when using ATRX-IHC as a surrogate of 1p19q status.
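The percentages in the abstract above can be cross-checked with simple arithmetic. The sketch below recomputes them from the stated counts. The two case counts marked as back-calculated (52 and 14) are inferred from the reported percentages and are assumptions, not figures given in the abstract.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


ihc_positive, ihc_negative = 60, 18          # stated: 78 gliomas in total
wildtype_among_positive = 52                 # back-calculated from 86.7% of 60
mutant_among_negative = 14                   # back-calculated from 77.8% of 18

idh_mutant_ihc_positive = 48                 # stated: subtyped as oligodendroglial
idh_mutant_ihc_negative = 16                 # stated: IDH-mutant astrocytic
misclassified = 8 + 3                        # stated: non-codel among 48, codel among 16

total_cases = ihc_positive + ihc_negative
idh_mutant_cases = idh_mutant_ihc_positive + idh_mutant_ihc_negative

print(pct(wildtype_among_positive, ihc_positive))   # agreement among IHC-positive cases
print(pct(mutant_among_negative, ihc_negative))     # agreement among IHC-negative cases
print(pct(misclassified, total_cases))              # share of all 78 tumors
print(pct(misclassified, idh_mutant_cases))         # share of the 64 IDH-mutant tumors
```

The reported 14% uses all 78 tumors as the denominator. Restricting the denominator to the 64 IDH-mutant tumors, the group named in the summary sentence, gives a somewhat higher proportion.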
A comprehensive strategy for identifying long-distance mobile peptides in xylem sap.

PubMed

Okamoto, Satoru; Suzuki, Takamasa; Kawaguchi, Masayoshi; Higashiyama, Tetsuya; Matsubayashi, Yoshikatsu

2015-11-01

There is a growing awareness that secreted peptides mediate organ-to-organ communication in higher plants. Xylem sap peptidomics is an effective but challenging approach for identifying long-distance mobile peptides. In this study we developed a simple, gel-free purification system that combines o-chlorophenol extraction with HPLC separation. Using this system, we successfully identified seven oligopeptides from soybean xylem sap exudate that had one or more post-translational modifications: glycosylation, sulfation and/or hydroxylation. RNA sequencing and quantitative PCR analyses showed that the peptide-encoding genes are expressed in multiple tissues. We further analyzed the long-distance translocation of four of the seven peptides using gene-encoding peptides with single amino acid substitutions, and identified these four peptides as potential root-to-shoot mobile oligopeptides. Promoter-GUS analysis showed that all four peptide-encoding genes were expressed in the inner tissues of the root endodermis. Moreover, we found that some of these peptide-encoding genes responded to biotic and/or abiotic factors. These results indicate that our purification system provides a comprehensive approach for effectively identifying endogenous small peptides and reinforce the concept that higher plants employ various peptides in root-to-shoot signaling. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
The genomic landscape shaped by selection on transposable elements across 18 mouse strains.

PubMed

Nellåker, Christoffer; Keane, Thomas M; Yalcin, Binnaz; Wong, Kim; Agam, Avigail; Belgard, T Grant; Flint, Jonathan; Adams, David J; Frankel, Wayne N; Ponting, Chris P

2012-06-15

Transposable element (TE)-derived sequence dominates the landscape of mammalian genomes and can modulate gene function by dysregulating transcription and translation. Our current knowledge of TEs in laboratory mouse strains is limited primarily to those present in the C57BL/6J reference genome, with most mouse TEs being drawn from three distinct classes, namely short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs) and the endogenous retrovirus (ERV) superfamily. Despite their high prevalence, the different genomic and gene properties controlling whether TEs are preferentially purged from, or are retained by, genetic drift or positive selection in mammalian genomes remain poorly defined. Using whole genome sequencing data from 13 classical laboratory and 4 wild-derived mouse inbred strains, we developed a comprehensive catalogue of 103,798 polymorphic TE variants. We employ this extensive data set to characterize TE variants across the Mus lineage, and to infer neutral and selective processes that have acted over 2 million years. Our results indicate that the majority of TE variants are introduced through the male germline and that only a minority of TE variants exert detectable changes in gene expression. However, among genes with differential expression across the strains there are twice as many TE variants identified as being putative causal variants as expected. Most TE variants that cause gene expression changes appear to be purged rapidly by purifying selection. Our findings demonstrate that past TE insertions have often been highly deleterious, and help to prioritize TE variants according to their likely contribution to gene expression or phenotype variation.
Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis.

PubMed

Ma, Chuang; Wang, Xiangfeng

2012-09-01

One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Despite several correlation methods that are applied in biology to analyze microarray data, concerns regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology have been raised. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey's biweight correlation. The Gini correlation method, with the other four evaluated methods in this study, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses.
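For readers unfamiliar with the Gini correlation named above, its textbook definition is the covariance between one variable and the ranks of the other, normalized by the covariance between the first variable and its own ranks. The sketch below follows that general definition using only the Python standard library. It is not the rsgcc implementation (an R package), and the function names and example vectors are illustrative assumptions.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def gini_correlation(x, y):
    """cov(x, rank(y)) / cov(x, rank(x)): values of x, rank order of y."""
    denominator = covariance(x, average_ranks(x))
    if denominator == 0:
        raise ValueError("x must not be constant")
    return covariance(x, average_ranks(y)) / denominator


# Hypothetical expression profiles of a regulator and a target across four samples.
regulator = [1.0, 2.0, 3.0, 10.0]
target = [1.0, 3.0, 2.0, 4.0]
print(gini_correlation(regulator, target), gini_correlation(target, regulator))
```

The coefficient is asymmetric: swapping the arguments generally gives a different value, so each gene pair yields two coefficients. Because one side enters only through its ranks, the measure sits between the value-based Pearson and the rank-based Spearman correlation.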
Application of the Gini Correlation Coefficient to Infer Regulatory Relationships in Transcriptome Analysis

PubMed Central

Ma, Chuang; Wang, Xiangfeng

2012-01-01

One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Despite several correlation methods that are applied in biology to analyze microarray data, concerns regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology have been raised. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey’s biweight correlation. The Gini correlation method, with the other four evaluated methods in this study, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses. PMID:22797655

Differential gene expression in the siphonophore Nanomia bijuga (Cnidaria) assessed with multiple next-generation sequencing workflows.

PubMed

Siebert, Stefan; Robinson, Mark D; Tintori, Sophia C; Goetz, Freya; Helm, Rebecca R; Smith, Stephen A; Shaner, Nathan; Haddock, Steven H D; Dunn, Casey W

2011-01-01

We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. 
Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through workflow choice and deeper reference sequencing.
Differential Gene Expression in the Siphonophore Nanomia bijuga (Cnidaria) Assessed with Multiple Next-Generation Sequencing Workflows

PubMed Central

Siebert, Stefan; Robinson, Mark D.; Tintori, Sophia C.; Goetz, Freya; Helm, Rebecca R.; Smith, Stephen A.; Shaner, Nathan; Haddock, Steven H. D.; Dunn, Casey W.

2011-01-01

We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. 
Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through workflow choice and deeper reference sequencing. PMID:21829563
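The effect of missing tag sites described in this record can be illustrated with a small sketch. It assumes, purely for illustration, that a tag site is marked by the motif CATG (the NlaIII recognition site commonly used as the SAGE anchoring site); the abstract does not state which motif each workflow used, and the reference names and sequences below are hypothetical.

```python
# Sketch: split reference sequences by whether they contain a tag site.
# TAG_MOTIF is an assumption (NlaIII-style site), not taken from the abstract.
TAG_MOTIF = "CATG"


def has_tag_site(seq, motif=TAG_MOTIF):
    """Return True if the sequence contains at least one copy of the motif."""
    return motif in seq.upper()


def split_by_tag_site(references, motif=TAG_MOTIF):
    """Partition {name: sequence} into (with_site, without_site) dicts."""
    with_site = {n: s for n, s in references.items() if has_tag_site(s, motif)}
    without_site = {n: s for n, s in references.items() if n not in with_site}
    return with_site, without_site


# Hypothetical partial references: the second one lacks the motif, so a
# tag-based workflow could not assign reads to it (a potential false negative),
# while reads from random transcript positions could still map to it.
refs = {"ref_full": "ATGGCCATGTTTAA", "ref_partial": "ATGGCCGGTTTAA"}
with_site, without_site = split_by_tag_site(refs)
print(sorted(with_site), sorted(without_site))
```

Restricting a cross-workflow comparison to the `with_site` subset corresponds to the abstract's comparison of only those references that unambiguously have tag sites.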
Delimiting regulatory sequences of the Drosophila melanogaster Ddc gene.

PubMed Central

Hirsh, J; Morgan, B A; Scholnick, S B

1986-01-01

We delimited sequences necessary for in vivo expression of the Drosophila melanogaster dopa decarboxylase gene Ddc. The expression of in vitro-altered genes was assayed following germ line integration via P-element vectors. Sequences between -209 and -24 were necessary for normally regulated expression, although genes lacking these sequences could be expressed at 10 to 50% of wild-type levels at specific developmental times. These genes showed components of normal developmental expression, which suggests that they retain some regulatory elements. All Ddc genes lacking the normal immediate 5'-flanking sequences were grossly deficient in larval central nervous system expression. Thus, this upstream region must contain at least one element necessary for this expression. A mutated Ddc gene without a normal TATA box-like sequence used the normal RNA start points, indicating that this sequence is not required for start point specificity. PMID:3099170
A transcriptional blueprint for a spiral-cleaving embryo.

PubMed

Chou, Hsien-Chao; Pruitt, Margaret M; Bastin, Benjamin R; Schneider, Stephan Q

2016-08-05

The spiral cleavage mode of early development is utilized in over one-third of all animal phyla and generates embryonic cells of different size, position, and fate through a conserved set of stereotypic and invariant asymmetric cell divisions. Despite the widespread use of spiral cleavage, regulatory and molecular features for any spiral-cleaving embryo are largely uncharted. To address this gap we use RNA-sequencing on the spiralian model Platynereis dumerilii to capture and quantify the first complete genome-wide transcriptional landscape of early spiral cleavage. RNA-sequencing datasets from seven stages in early Platynereis development, from the zygote to the protrochophore, are described here including the de novo assembly and annotation of ~17,200 Platynereis genes. Depth and quality of the RNA-sequencing datasets allow the identification of the temporal onset and level of transcription for each annotated gene, even if the expression is restricted to a single cell. Over 4000 transcripts are maternally contributed and cleared by the end of the early spiral cleavage phase. Small early waves of zygotic expression are followed by major waves of thousands of genes, demarcating the maternal to zygotic transition shortly after the completion of spiral cleavages in this annelid species. Our comprehensive stage-specific transcriptional analysis of early embryonic stages in Platynereis elucidates the regulatory genome during early spiral embryogenesis and defines the maternal to zygotic transition in Platynereis embryos. This transcriptome assembly provides the first systems-level view of the transcriptional and regulatory landscape for a spiral-cleaving embryo.
Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree

PubMed Central

Rangiah, Kannan; Mahesh, HB; Rajamani, Anantharamanan; Shirke, Meghana D.; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, BN

2015-01-01

Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC). The neem tree is a rich source of limonoids, which have a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in the neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of the estimated size of the neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. The neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. The neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. An ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in the azadirachtin biosynthesis pathway. This study provides genomic and transcriptomic resources, together with quantification of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways. PMID:26290780
Differential Expression and Functional Analysis of High-Throughput -Omics Data Using Open Source Tools.

PubMed

Kebschull, Moritz; Fittler, Melanie Julia; Demmer, Ryan T; Papapanou, Panos N

2017-01-01

Today, -omics analyses, including the systematic cataloging of messenger RNA and microRNA sequences or DNA methylation patterns in a cell population, organ, or tissue sample, allow for an unbiased, comprehensive genome-level analysis of complex diseases, offering a large advantage over earlier "candidate" gene or pathway analyses. A primary goal in the analysis of these high-throughput assays is the detection of those features among several thousand that differ between different groups of samples. In the context of oral biology, our group has successfully utilized -omics technology to identify key molecules and pathways in different diagnostic entities of periodontal disease. A major issue when inferring biological information from high-throughput -omics studies is the fact that the sheer volume of high-dimensional data generated by contemporary technology is not appropriately analyzed using common statistical methods employed in the biomedical sciences. In this chapter, we outline a robust and well-accepted bioinformatics workflow for the initial analysis of -omics data generated using microarrays or next-generation sequencing technology using open-source tools. Starting with quality control measures and necessary preprocessing steps for data originating from different -omics technologies, we next outline a differential expression analysis pipeline that can be used for data from both microarray and sequencing experiments, and offers the possibility to account for random or fixed effects. Finally, we present an overview of the possibilities for a functional analysis of the obtained data.
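One reason ordinary per-feature tests are not appropriate when several thousand features are tested at once is multiple testing. The sketch below shows the Benjamini-Hochberg false discovery rate adjustment as one widely used remedy. This is an illustrative choice and not necessarily the method used in the chapter, whose tools are not named in the abstract; the p-values are made up.

```python
# Sketch: Benjamini-Hochberg adjustment of per-feature p-values.
# Illustrative only; the abstract does not say which correction the chapter uses.


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four features.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q <= 0.05]
print(adj, significant)
```

With thousands of features the same function applies unchanged; only the threshold on the adjusted values needs to be chosen.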
Circular RNA biogenesis can proceed through an exon-containing lariat precursor.

PubMed

Barrett, Steven P; Wang, Peter L; Salzman, Julia

2015-06-09

Pervasive expression of circular RNA is a recently discovered feature of eukaryotic gene expression programs, yet its function remains largely unknown. The presumed biogenesis of these RNAs involves a non-canonical 'backsplicing' event. Recent studies in mammalian cell culture posit that backsplicing is facilitated by inverted repeats flanking the circularized exon(s). Although such sequence elements are common in mammals, they are rare in lower eukaryotes, making current models insufficient to describe circularization. Through systematic splice site mutagenesis and the identification of splicing intermediates, we show that circular RNA in Schizosaccharomyces pombe is generated through an exon-containing lariat precursor. Furthermore, we have performed high-throughput and comprehensive mutagenesis of a circle-forming exon, which enabled us to discover a systematic effect of exon length on RNA circularization. Our results uncover a mechanism for circular RNA biogenesis that may account for circularization in genes that lack noticeable flanking intronic secondary structure.
Transcriptome analysis of woodland strawberry (Fragaria vesca) response to the infection by Strawberry vein banding virus (SVBV).

PubMed

Chen, Jing; Zhang, Hanping; Feng, Mingfeng; Zuo, Dengpan; Hu, Yahui; Jiang, Tong

2016-07-13

Woodland strawberry (Fragaria vesca) infected with Strawberry vein banding virus (SVBV) exhibits chlorotic symptoms along the leaf veins. However, little is known about the molecular mechanism of strawberry disease caused by SVBV. We performed a next-generation sequencing (RNA-Seq) study to identify gene expression changes induced by SVBV in woodland strawberry, using mock-inoculated plants as a control. Using RNA-Seq, we identified 36,850 unigenes, of which 517 were differentially expressed in the virus-infected plants (DEGs). The unigenes were annotated and classified with Gene Ontology (GO), Clusters of Orthologous Group (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. The KEGG pathway analysis of these genes suggested that strawberry disease caused by SVBV may affect multiple processes including pigment metabolism, photosynthesis and plant-pathogen interactions. Our research provides comprehensive transcriptome information regarding SVBV infection in strawberry.
Subunit Organisation of In Vitro Reconstituted HOPS and CORVET Multisubunit Membrane Tethering Complexes

PubMed Central

Guo, Zhong; Johnston, Wayne; Kovtun, Oleksiy; Mureev, Sergey; Bröcker, Cornelia; Ungermann, Christian; Alexandrov, Kirill

2013-01-01

Biochemical and structural analysis of macromolecular protein assemblies remains challenging due to technical difficulties in recombinant expression, engineering and reconstitution of multisubunit complexes. Here we use a recently developed cell-free protein expression system based on the protozoan Leishmania tarentolae to produce in vitro all six subunits of the 600 kDa HOPS and CORVET membrane tethering complexes. We demonstrate that both subcomplexes and the entire HOPS complex can be reconstituted in vitro resulting in a comprehensive subunit interaction map. To our knowledge this is the largest eukaryotic protein complex in vitro reconstituted to date. Using the truncation and interaction analysis, we demonstrate that the complex is assembled through short hydrophobic sequences located in the C-terminus of the individual Vps subunits. Based on this data we propose a model of the HOPS and CORVET complex assembly that reconciles the available biochemical and structural data. PMID:24312556
MiR-191 Regulates Primary Human Fibroblast Proliferation and Directly Targets Multiple Oncogenes

PubMed Central

Polioudakis, Damon; Abell, Nathan S.; Iyer, Vishwanath R.

2015-01-01

miRNAs play a central role in numerous pathologies including multiple cancer types. miR-191 has predominantly been studied as an oncogene, but the role of miR-191 in the proliferation of primary cells is not well characterized, and the miR-191 targetome has not been experimentally profiled. Here we utilized RNA induced silencing complex immunoprecipitations as well as gene expression profiling to construct a genome wide miR-191 target profile. We show that miR-191 represses proliferation in primary human fibroblasts, identify multiple proto-oncogenes as novel miR-191 targets, including CDK9, NOTCH2, and RPS6KA3, and present evidence that miR-191 extensively mediates target expression through coding sequence (CDS) pairing. Our results provide a comprehensive genome wide miR-191 target profile, and demonstrate miR-191’s regulation of primary human fibroblast proliferation. PMID:25992613
Application of epigenetic markers in molecular breeding of the swine.

PubMed

Zhang, Ke; Feng, Guang-de; Zhang, Bao-yun; Xiang, Wei; Chen, Long; Yang, Fang; Chu, Ming-xing; Wang, Ping-qing

2016-07-20

Livestock phenotypes are determined by the interaction of a variety of factors, including the genome, the epigenome and the environment. Epigenetics refers to gene expression changes without DNA sequence alterations. Epigenetic markers mainly include DNA methylation, histone modifications, non-coding RNAs, and imprinted genes. A growing body of research shows that epigenetic markers play an important role in the traits of pigs by modulating phenotype changes via gene expression. However, the role of epigenetic markers has received little attention in swine breeding. The mechanisms that influence important traits of swine have not been analyzed in detail, and an adequate scientific basis for practical applications is still lacking. From the aspects of nutrition, diseases, important economic traits and trans-generational inheritance, we summarize the research, application prospects and challenges in the field of utilizing epigenetic markers in molecular breeding of pigs, thus providing a more comprehensive theoretical basis to promote more rapid research development in this field.
Bridging epigenomics and complex disease: the basics.

PubMed

Teperino, Raffaele; Lempradl, Adelheid; Pospisilik, J Andrew

2013-05-01

The DNA sequence largely defines gene expression and phenotype. However, it is becoming increasingly clear that an additional chromatin-based regulatory network imparts both stability and plasticity to genome output, modifying phenotype independently of the genetic blueprint. Indeed, alterations in this "epigenetic" control layer underlie, at least in part, the reason for monozygotic twins being discordant for disease. Functionally, this regulatory layer comprises post-translational modifications of DNA and histones, as well as small and large noncoding RNAs. Together these regulate gene expression by changing chromatin organization and DNA accessibility. Successive technological advances over the past decade have enabled researchers to map the chromatin state with increasing accuracy and comprehensiveness, catapulting genetic research into a genome-wide era. Here, aiming particularly at the genomics/epigenomics newcomer, we review the epigenetic basis that has helped drive the technological shift and how this progress is shaping our understanding of complex disease.
The developmental transcriptome of the bamboo snout beetle Cyrtotrachelus buqueti and insights into candidate pheromone-binding proteins

PubMed Central

Yang, Wei; Yang, Chunping; Lu, Lin; Chen, Zhangming

2017-01-01

Cyrtotrachelus buqueti is an extremely harmful bamboo borer, and the larvae of this pest attack clumping bamboo shoots. Pheromone-binding proteins (PBPs) play an important role in identifying insect sex pheromones, but the C. buqueti genome is not readily available for PBP analysis. Developmental transcriptomes of eggs, larvae from the first instar to the prepupal stage, pupae, and adults (females and males) from emergence to mating were built by RNA sequencing (RNA-Seq) in the present study to establish a sequence background of C. buqueti to help understand PBPs. Approximately 164.8 million clean reads were obtained and annotated into 108,854 transcripts. These were assembled into 24,338, 21,597, 24,798, 21,886, 24,642, and 83,115 unigenes for eggs, larvae, pupae, females, males, and the combined datasets, respectively. Unigenes were annotated against NCBI non-redundant protein sequences, NCBI non-redundant nucleotide sequences, Gene Ontology (GO), Protein family, Clusters of Orthologous Groups of Proteins/ Clusters of Eukaryotic Orthologous Groups (KOG), Swiss-Prot, and KEGG Orthology databases. A total of 17,213 unigenes were annotated into 55 sub-categories belonging to three main GO categories; 10,672 unigenes were classified into 26 functional categories by KOG classification, and 8,063 unigenes were classified into five functional KEGG categories. RSEM software for RNA sequencing showed that 4,816, 3,176, 3,661, 2,898, 4,316, 8,019, 7,273, 5,922, 5,844, and 4,570 genes were differentially expressed between larvae and males, larvae and eggs, larvae and pupae, larvae and females, males and females, males and eggs, males and pupae, females and eggs, females and pupae, and eggs and pupae, respectively. Of these, three were confirmed to be significantly differentially expressed between larvae, females, and males. Furthermore, PBP Cbuq7577_g1 was highly expressed in the antenna of males. 
A comprehensive sequence resource of a desirable quality was constructed from developmental transcriptomes of C. buqueti eggs, larvae, pupae, and adults. This work enriches the genomic data of C. buqueti, and facilitates our understanding of its metamorphosis, development, and response to environmental change. The identified candidate PBP Cbuq7577_g1 might play a crucial role in identifying sex pheromones, and could be used as a targeted gene to control C. buqueti numbers by disrupting sex pheromone communication. PMID:28662071
Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shi, CY; Yang, H; Wei, CL

Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis.
Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

PubMed Central

2011-01-01

Background: Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Results: Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). Conclusions: An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis. PMID:21356090
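The assembly statistics quoted in this record (N50, unigene counts, sequencing depth) can be made concrete with a short sketch. The `n50` function follows the usual definition (the length L such that sequences of length at least L cover at least half of the total assembled bases); the toy length list is hypothetical, and the implied mean read length is only an inference from the abstract's totals, not a figure the authors report.

```python
# Sketch: N50 of an assembly plus a consistency check of the reported counts.


def n50(lengths):
    """Return the N50 of a list of sequence lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    # Accumulate from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Hypothetical toy assembly, not data from the study.
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)

# Figures taken from the abstract.
contig_clusters = 788
singletons = 126_306
unigenes = 127_094
total_bases = 2.59e9   # 2.59 gigabase pairs
reads = 34.5e6         # about 34.5 million reads

counts_consistent = contig_clusters + singletons == unigenes
implied_read_length = total_bases / reads  # inferred, not stated in the abstract

print(toy_n50, counts_consistent, round(implied_read_length, 1))
```

An N50 (506 bp) above the mean length (355 bp) is expected whenever the length distribution is skewed toward many short sequences, as with the large number of singletons reported here.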
Theory of Mind and social functioning in schizophrenia: correlation with figurative language abnormalities, clinical symptoms and general intelligence.

PubMed

Piovan, Cristiano; Gava, Laura; Campeol, Mara

2016-01-01

Over the past few decades, studies have characterized Theory of Mind (ToM) as a system, including cognitive and affective features, rather than a unitary process. Within the domains defining social cognition, ToM is the best predictor of poor social functioning in schizophrenia. The current study aimed to explore competence in ToM tasks, in metaphorical and idiomatic language identification tasks and in a conversational rules observance test, as well as their relationship with social functioning, in a group of outpatients suffering from schizophrenia. Methods: 30 outpatients diagnosed with schizophrenia and 24 healthy subjects were recruited. Both groups underwent TIB as premorbid IQ evaluation, PANSS, the Theory of Mind Picture Sequencing Task, a metaphors and idiomatic expressions comprehension test and a conversational test. Social functioning was assessed with PSP. Results: Mean values of premorbid IQ showed no significant difference between patients and the control group. In ToM and pragmatic competence tasks, differences between groups were highly significant, due to patients' lower performance. A correlation between metaphors and idiomatic expressions comprehension and second order false beliefs was detected. PSP showed a correlation with PANSS and cognitive-ToM, but not with affective-ToM. The results show that people affected by schizophrenia, in stable clinical condition, have clear impairments in ToM and figurative language comprehension tasks. In our theoretical framework, the correlation found between cognitive-ToM, pragmatic deficits, clinical status and social functioning level suggests the usefulness of rehabilitative interventions to recover metacognitive functions and pragmatic abilities, in order to reduce social disability in schizophrenia.
Genome-wide dynamics of alternative polyadenylation in rice

PubMed Central

Fu, Haihui; Yang, Dewei; Su, Wenyue; Ma, Liuyin; Shen, Yingjia; Ji, Guoli; Ye, Xinfu; Wu, Xiaohui

2016-01-01

Alternative polyadenylation (APA), in which a transcript uses one of the poly(A) sites to define its 3′-end, is a common regulatory mechanism in eukaryotic gene expression. However, the potential of APA in determining crop agronomic traits remains elusive. This study systematically tallied poly(A) sites of 14 different rice tissues and developmental stages using the poly(A) tag sequencing (PAT-seq) approach. The results indicate significant involvement of APA in developmental and quantitative trait loci (QTL) gene expression. About 48% of all expressed genes use APA to generate transcriptomic and proteomic diversity. Some genes switch APA sites, allowing differentially expressed genes to use alternate 3′ UTRs. Interestingly, APA in mature pollen is distinct where differential expression levels of a set of poly(A) factors and different distributions of APA sites are found, indicating a unique mRNA 3′-end formation regulation during gametophyte development. Equally interesting, statistical analyses showed that QTL tends to use APA for regulation of gene expression of many agronomic traits, suggesting a potential important role of APA in rice production. These results provide thus far the most comprehensive and high-resolution resource for advanced analysis of APA in crops and shed light on how APA is associated with trait formation in eukaryotes. PMID:27733415
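A figure such as "about 48% of all expressed genes use APA" comes down to counting genes with more than one poly(A) site. The sketch below shows that counting step on made-up data; the site and gene identifiers are hypothetical, and the study's actual pipeline (clustering of PAT-seq reads into sites, expression filters) is not described in the abstract and is not reproduced here.

```python
# Sketch: fraction of genes that use more than one poly(A) site.
from collections import Counter


def apa_fraction(site_to_gene):
    """Given {poly(A) site id: gene id}, return the share of genes with >= 2 sites."""
    sites_per_gene = Counter(site_to_gene.values())
    if not sites_per_gene:
        return 0.0
    apa_genes = sum(1 for n in sites_per_gene.values() if n >= 2)
    return apa_genes / len(sites_per_gene)


# Hypothetical toy annotation: geneA and geneC each have two sites.
toy_sites = {
    "pA1": "geneA",
    "pA2": "geneA",
    "pA3": "geneB",
    "pA4": "geneC",
    "pA5": "geneC",
    "pA6": "geneD",
}
fraction = apa_fraction(toy_sites)
print(f"{fraction:.0%} of genes use APA in this toy example")
```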
Anchoring 9,371 Maize Expressed Sequence Tagged Unigenes to the Bacterial Artificial Chromosome Contig Map by Two-Dimensional Overgo Hybridization

PubMed Central

Gardiner, Jack; Schroeder, Steven; Polacco, Mary L.; Sanchez-Villeda, Hector; Fang, Zhiwei; Morgante, Michele; Landewe, Tim; Fengler, Kevin; Useche, Francisco; Hanafey, Michael; Tingey, Scott; Chou, Hugh; Wing, Rod; Soderlund, Carol; Coe, Edward H.

2004-01-01

Our goal is to construct a robust physical map for maize (Zea mays) comprehensively integrated with the genetic map. We have used a two-dimensional 24 × 24 overgo pooling strategy to anchor maize expressed sequence tagged (EST) unigenes to 165,888 bacterial artificial chromosomes (BACs) on high-density filters. A set of 70,716 public maize ESTs seeded derivation of 10,723 EST unigene assemblies. From these assemblies, 10,642 overgo sequences of 40 bp were applied as hybridization probes. BAC addresses were obtained for 9,371 overgo probes, representing an 88% success rate. More than 96% of the successful overgo probes identified two or more BACs, while 5% identified more than 50 BACs. The majority of BACs identified (79%) were hybridized with one or two overgos. A small number of BACs hybridized with eight or more overgos, suggesting that these BACs must be gene rich. Approximately 5,670 overgos identified BACs assembled within one contig, indicating that these probes are highly locus specific. A total of 1,795 megabases (Mb; 87%) of the total 2,050 Mb in BAC contigs were associated with one or more overgos, which are serving as sequence-tagged sites for single nucleotide polymorphism development. Overgo density ranged from less than one overgo per megabase to greater than 20 overgos per megabase. The majority of contigs (52%) hit by overgos contained three to nine overgos per megabase. Analysis of approximately 1,022 Mb of genetically anchored BAC contigs indicates that 9,003 of the total 13,900 overgo-contig sites are genetically anchored. Our results indicate overgos are a powerful approach for generating gene-specific hybridization probes that are facilitating the assembly of an integrated genetic and physical map for maize. PMID:15020742
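The percentages quoted in this record can be rechecked directly from the counts given in the abstract. The sketch below does only that arithmetic; the variable names are illustrative, and all numbers are copied from the abstract.

```python
# Sketch: recompute the rates quoted in the abstract from its own counts.
overgos_applied = 10_642       # overgo probes used for hybridization
overgos_anchored = 9_371       # probes that yielded BAC addresses
contig_mb_tagged = 1_795       # Mb of BAC contigs hit by one or more overgos
contig_mb_total = 2_050        # total Mb in BAC contigs
sites_genetically_anchored = 9_003
sites_total = 13_900           # overgo-contig sites

success_rate = overgos_anchored / overgos_applied
contig_coverage = contig_mb_tagged / contig_mb_total
anchored_share = sites_genetically_anchored / sites_total

print(f"probe success rate:        {success_rate:.1%}")
print(f"contig Mb with an overgo:  {contig_coverage:.1%}")
print(f"genetically anchored sites: {anchored_share:.1%}")
```

The first two values agree with the abstract's rounded figures of 88% and 87% to within a percentage point; the third ratio is not quoted as a percentage in the abstract and is shown only for context.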
Genome-wide identification of sweet orange (Citrus sinensis) histone modification gene families and their expression analysis during the fruit development and fruit-blue mold infection process

PubMed Central

Xu, Jidi; Xu, Haidan; Liu, Yuanlong; Wang, Xia; Xu, Qiang; Deng, Xiuxin

2015-01-01

In eukaryotes, histone acetylation and methylation have been known to be involved in regulating diverse developmental processes and plant defense. These histone modification events are controlled by a series of histone modification gene families. To date, there is no study regarding genome-wide characterization of histone modification related genes in citrus species. Based on the two recently sequenced sweet orange genome databases, a total of 136 CsHMs (Citrus sinensis histone modification genes), including 47 CsHMTs (histone methyltransferase genes), 23 CsHDMs (histone demethylase genes), 50 CsHATs (histone acetyltransferase genes), and 16 CsHDACs (histone deacetylase genes) were identified. These genes were categorized into 11 gene families. A comprehensive analysis of these 11 gene families was performed with chromosome locations, phylogenetic comparison, gene structures, and conserved domain compositions of proteins. In order to gain an insight into the potential roles of these genes in citrus fruit development, 42 CsHMs with high mRNA abundance in fruit tissues were selected to further analyze their expression profiles at six stages of fruit development. Interestingly, a number of genes were highly expressed in the flesh of ripening fruit, and some of them showed increasing expression levels over the course of fruit development. Furthermore, we analyzed the expression patterns of all 136 CsHMs in response to infection by blue mold (Penicillium digitatum), which is the most devastating pathogen in the citrus post-harvest process. The results indicated that 20 of them showed strong alterations in their expression levels during the fruit-pathogen infection. In conclusion, this study presents a comprehensive analysis of the histone modification gene families in sweet orange and further elucidates their behaviors during fruit development and the blue mold infection responses. PMID:26300904
Comprehensive RNA-Seq profiling to evaluate lactating sheep mammary gland transcriptome

PubMed Central

Suárez-Vega, Aroa; Gutiérrez-Gil, Beatriz; Klopp, Christophe; Tosser-Klopp, Gwenola; Arranz, Juan-José

2016-01-01

RNA-Seq enables the generation of extensive transcriptome information providing the capability to characterize transcripts (including alternative isoforms and polymorphism), to quantify expression and to identify differential regulation in a single experiment. Our aim in this study was to take advantage of using RNA-Seq high-throughput technology to provide a comprehensive transcriptome profiling of the sheep lactating mammary gland. Eight ewes of two dairy sheep breeds with differences in milk production traits were used in this experiment (four Churra and four Assaf ewes). Milk samples from these animals were collected on days 10, 50, 120 and 150 after lambing to cover the various physiological stages of the mammary gland across the complete lactation. RNA samples were extracted from milk somatic cells. The RNA-Seq dataset was generated using an Illumina HiSeq 2000 sequencer. The information reported here will be useful to understand the biology of lactation in sheep, providing also an opportunity to characterize their different patterns on milk production aptitude. PMID:27377755

Comprehensive RNA-Seq profiling to evaluate lactating sheep mammary gland transcriptome.

PubMed

Suárez-Vega, Aroa; Gutiérrez-Gil, Beatriz; Klopp, Christophe; Tosser-Klopp, Gwenola; Arranz, Juan-José

2016-07-05

RNA-Seq enables the generation of extensive transcriptome information providing the capability to characterize transcripts (including alternative isoforms and polymorphism), to quantify expression and to identify differential regulation in a single experiment. Our aim in this study was to take advantage of using RNA-Seq high-throughput technology to provide a comprehensive transcriptome profiling of the sheep lactating mammary gland. Eight ewes of two dairy sheep breeds with differences in milk production traits were used in this experiment (four Churra and four Assaf ewes). Milk samples from these animals were collected on days 10, 50, 120 and 150 after lambing to cover the various physiological stages of the mammary gland across the complete lactation. RNA samples were extracted from milk somatic cells. The RNA-Seq dataset was generated using an Illumina HiSeq 2000 sequencer. The information reported here will be useful to understand the biology of lactation in sheep, providing also an opportunity to characterize their different patterns on milk production aptitude.
Young infants' generalization of emotional expressions: effects of familiarity.

PubMed

Walker-Andrews, Arlene S; Krogh-Jespersen, Sheila; Mayhew, Estelle M Y; Coffield, Caroline N

2011-08-01

From birth, infants are exposed to a wealth of emotional information in their interactions. Much research has been done to investigate the development of emotion perception, and factors influencing that development. The current study investigates the role of familiarity on 3.5-month-old infants' generalization of emotional expressions. Infants were assigned to one of two habituation sequences: in one sequence, infants were visually habituated to parental expressions of happy or sad. At test, infants viewed either a continuation of the habituation sequence, their mother depicting a novel expression, an unfamiliar female depicting the habituated expression, or an unfamiliar female depicting a novel expression. In the second sequence, a new sample of infants was matched to the infants in the first sequence. These infants viewed the same habituation and test sequences, but the actors were unfamiliar to them. Only those infants who viewed their own mothers and fathers during the habituation sequence increased looking. They dishabituated looking to maternal novel expressions, the unfamiliar female's novel expression, and the unfamiliar female depicting the habituated expression, especially when sad parental expressions were followed by an expression change to happy or to a change in person. Infants are guided in their recognition of emotional expressions by the familiarity of their parents, before generalizing to others. 2011 APA, all rights reserved
Next-generation transcriptome sequencing of the premenopausal breast epithelium using specimens from a normal human breast tissue bank.

PubMed

Pardo, Ivanesa; Lillemoe, Heather A; Blosser, Rachel J; Choi, MiRan; Sauder, Candice A M; Doxey, Diane K; Mathieson, Theresa; Hancock, Bradley A; Baptiste, Dadrie; Atale, Rutuja; Hickenbotham, Matthew; Zhu, Jin; Glasscock, Jarret; Storniolo, Anna Maria V; Zheng, Faye; Doerge, R W; Liu, Yunlong; Badve, Sunil; Radovich, Milan; Clare, Susan E

2014-03-17

Our efforts to prevent and treat breast cancer are significantly impeded by a lack of knowledge of the biology and developmental genetics of the normal mammary gland. In order to provide the specimens that will facilitate such an understanding, The Susan G. Komen for the Cure Tissue Bank at the IU Simon Cancer Center (KTB) was established. The KTB is, to our knowledge, the only biorepository in the world prospectively established to collect normal, healthy breast tissue from volunteer donors. As a first initiative toward a molecular understanding of the biology and developmental genetics of the normal mammary gland, the effect of the menstrual cycle and hormonal contraceptives on DNA expression in the normal breast epithelium was examined. Using normal breast tissue from 20 premenopausal donors to KTB, the changes in the mRNA of the normal breast epithelium as a function of phase of the menstrual cycle and hormonal contraception were assayed using next-generation whole transcriptome sequencing (RNA-Seq). In total, 255 genes representing 1.4% of all genes were deemed to have statistically significant differential expression between the two phases of the menstrual cycle. The overwhelming majority (221; 87%) of the genes have higher expression during the luteal phase. These data provide important insights into the processes occurring during each phase of the menstrual cycle. There was only a single gene significantly differentially expressed when comparing the epithelium of women using hormonal contraception to those in the luteal phase. We have taken advantage of a unique research resource, the KTB, to complete the first-ever next-generation transcriptome sequencing of the epithelial compartment of 20 normal human breast specimens. This work has produced a comprehensive catalog of the differences in the expression of protein-coding genes as a function of the phase of the menstrual cycle. 
These data constitute the beginning of a reference data set of the normal mammary gland, which can be consulted for comparison with data developed from malignant specimens, or to mine the effects of the hormonal flux that occurs during the menstrual cycle.
Next-generation transcriptome sequencing of the premenopausal breast epithelium using specimens from a normal human breast tissue bank

PubMed Central

2014-01-01

Introduction Our efforts to prevent and treat breast cancer are significantly impeded by a lack of knowledge of the biology and developmental genetics of the normal mammary gland. In order to provide the specimens that will facilitate such an understanding, The Susan G. Komen for the Cure Tissue Bank at the IU Simon Cancer Center (KTB) was established. The KTB is, to our knowledge, the only biorepository in the world prospectively established to collect normal, healthy breast tissue from volunteer donors. As a first initiative toward a molecular understanding of the biology and developmental genetics of the normal mammary gland, the effect of the menstrual cycle and hormonal contraceptives on DNA expression in the normal breast epithelium was examined. Methods Using normal breast tissue from 20 premenopausal donors to KTB, the changes in the mRNA of the normal breast epithelium as a function of phase of the menstrual cycle and hormonal contraception were assayed using next-generation whole transcriptome sequencing (RNA-Seq). Results In total, 255 genes representing 1.4% of all genes were deemed to have statistically significant differential expression between the two phases of the menstrual cycle. The overwhelming majority (221; 87%) of the genes have higher expression during the luteal phase. These data provide important insights into the processes occurring during each phase of the menstrual cycle. There was only a single gene significantly differentially expressed when comparing the epithelium of women using hormonal contraception to those in the luteal phase. Conclusions We have taken advantage of a unique research resource, the KTB, to complete the first-ever next-generation transcriptome sequencing of the epithelial compartment of 20 normal human breast specimens. This work has produced a comprehensive catalog of the differences in the expression of protein-coding genes as a function of the phase of the menstrual cycle. 
These data constitute the beginning of a reference data set of the normal mammary gland, which can be consulted for comparison with data developed from malignant specimens, or to mine the effects of the hormonal flux that occurs during the menstrual cycle. PMID:24636070
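
The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the implied total gene count is an inference from the rounded 1.4% figure, not a number reported in the record.

```python
# Figures taken from the abstract above.
de_genes = 255          # genes differentially expressed between cycle phases
luteal_higher = 221     # of those, genes with higher luteal-phase expression
fraction_of_all = 0.014 # "1.4% of all genes" (a rounded value)

# Share of differentially expressed genes that are higher in the luteal phase.
share_luteal = 100 * luteal_higher / de_genes

# Total number of genes implied by the rounded percentage (an inference only;
# rounding of 1.4% leaves an uncertainty of several hundred genes).
implied_total = de_genes / fraction_of_all

print(f"luteal-higher share: {share_luteal:.1f}%")
print(f"implied total genes: about {implied_total:.0f}")
```

The first value rounds to the 87% quoted in the abstract; the second is only a rough back-calculation.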
Comprehensive analysis of RNA-seq data reveals the complexity of the transcriptome in Brassica rapa.

PubMed

Tong, Chaobo; Wang, Xiaowu; Yu, Jingyin; Wu, Jian; Li, Wanshun; Huang, Junyan; Dong, Caihua; Hua, Wei; Liu, Shengyi

2013-10-07

The species Brassica rapa (2n=20, AA) is an important vegetable and oilseed crop, and serves as an excellent model for genomic and evolutionary research in Brassica species. With the availability of the whole genome sequence of B. rapa, it is essential to further determine the activity of all functional elements of the B. rapa genome and explore the transcriptome on a genome-wide scale. Here, RNA-seq data were employed to provide a genome-wide transcriptional landscape and a characterization of the annotated and novel transcripts and alternative splicing events across tissues. RNA-seq reads were generated using the Illumina platform from six different tissues (root, stem, leaf, flower, silique and callus) of the B. rapa accession Chiifu-401-42, the same line used for whole genome sequencing. First, these data detected widespread transcription of the B. rapa genome, leading to the identification of numerous novel transcripts and the definition of 5'/3' UTRs of known genes. Second, 78.8% of the total annotated genes were detected as expressed and 45.8% were constitutively expressed across all tissues. We further defined several groups of genes: housekeeping genes, tissue-specific expressed genes and co-expressed genes across tissues, which will serve as a valuable repository for future crop functional genomics research. Third, alternative splicing (AS) is estimated to occur in more than 29.4% of intron-containing B. rapa genes, and 65% of them were commonly detected in more than two tissues. Interestingly, genes with a high rate of AS were over-represented in GO categories relating to transcriptional regulation and signal transduction, suggesting that AS may play an important regulatory role in these genes. Further, we observed that intron retention (IR) is predominant among the AS events and seems to occur preferentially in genes with short introns. The high-resolution RNA-seq analysis provides a global transcriptional landscape as a complement to the B. rapa genome sequence, which will advance our understanding of the dynamics and complexity of the B. rapa transcriptome. The atlas of gene expression in different tissues will be useful for accelerating research on functional genomics and genome evolution in Brassica species.
Neural Network Processing of Natural Language: II. Towards a Unified Model of Corticostriatal Function in Learning Sentence Comprehension and Non-Linguistic Sequencing

ERIC Educational Resources Information Center

Dominey, Peter Ford; Inui, Toshio; Hoen, Michel

2009-01-01

A central issue in cognitive neuroscience today concerns how distributed neural networks in the brain that are used in language learning and processing can be involved in non-linguistic cognitive sequence learning. This issue is informed by a wealth of functional neurophysiology studies of sentence comprehension, along with a number of recent…
Ligand-mediated protein degradation reveals functional conservation among sequence variants of the CUL4-type E3 ligase substrate receptor cereblon.

PubMed

Akuffo, Afua A; Alontaga, Aileen Y; Metcalf, Rainer; Beatty, Matthew S; Becker, Andreas; McDaniel, Jessica M; Hesterberg, Rebecca S; Goodheart, William E; Gunawan, Steven; Ayaz, Muhammad; Yang, Yan; Karim, Md Rezaul; Orobello, Morgan E; Daniel, Kenyon; Guida, Wayne; Yoder, Jeffrey A; Rajadhyaksha, Anjali M; Schönbrunn, Ernst; Lawrence, Harshani R; Lawrence, Nicholas J; Epling-Burnette, Pearlie K

2018-04-20

Upon binding to thalidomide and other immunomodulatory drugs, the E3 ligase substrate receptor cereblon (CRBN) promotes proteasomal destruction by engaging the DDB1-CUL4A-Roc1-RBX1 E3 ubiquitin ligase in human cells but not in mouse cells, suggesting that sequence variations in CRBN may cause its inactivation. Therapeutically, CRBN engagers have the potential for broad applications in cancer and immune therapy by specifically reducing protein expression through targeted ubiquitin-mediated degradation. To examine the effects of defined sequence changes on CRBN's activity, we performed a comprehensive study using complementary theoretical, biophysical, and biological assays aimed at understanding CRBN's nonprimate sequence variations. With a series of recombinant thalidomide-binding domain (TBD) proteins, we show that CRBN sequence variants retain their drug-binding properties to both classical immunomodulatory drugs and dBET1, a chemical compound and targeting ligand designed to degrade bromodomain-containing 4 (BRD4) via a CRBN-dependent mechanism. We further show that dBET1 stimulates CRBN's E3 ubiquitin-conjugating function and degrades BRD4 in both mouse and human cells. This insight paves the way for studies of CRBN-dependent proteasome-targeting molecules in nonprimate models and provides a new understanding of CRBN's substrate-recruiting function. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
Pleurochrysome: A Web Database of Pleurochrysis Transcripts and Orthologs Among Heterogeneous Algae

PubMed Central

Fujiwara, Shoko; Takatsuka, Yukiko; Hirokawa, Yasutaka; Tsuzuki, Mikio; Takano, Tomoyuki; Kobayashi, Masaaki; Suda, Kunihiro; Asamizu, Erika; Yokoyama, Koji; Shibata, Daisuke; Tabata, Satoshi; Yano, Kentaro

2016-01-01

Pleurochrysis is a coccolithophorid genus, which belongs to the Coccolithales in the Haptophyta. The genus has been used extensively for biological research, together with Emiliania in the Isochrysidales, to understand distinctive features between the two coccolithophorid-including orders. However, molecular biological research on Pleurochrysis such as elucidation of the molecular mechanism behind coccolith formation has not made great progress at least in part because of lack of comprehensive gene information. To provide such information to the research community, we built an open web database, the Pleurochrysome (http://bioinf.mind.meiji.ac.jp/phapt/), which currently stores 9,023 unique gene sequences (designated as UNIGENEs) assembled from expressed sequence tag sequences of P. haptonemofera as core information. The UNIGENEs were annotated with gene sequences sharing significant homology, conserved domains, Gene Ontology, KEGG Orthology, predicted subcellular localization, open reading frames and orthologous relationship with genes of 10 other algal species, a cyanobacterium and the yeast Saccharomyces cerevisiae. This sequence and annotation information can be easily accessed via several search functions. Besides fundamental functions such as BLAST and keyword searches, this database also offers search functions to explore orthologous genes in the 12 organisms and to seek novel genes. The Pleurochrysome will promote molecular biological and phylogenetic research on coccolithophorids and other haptophytes by helping scientists mine data from the primary transcriptome of P. haptonemofera. PMID:26746174
Complex and dynamic landscape of RNA polyadenylation revealed by PAS-Seq

PubMed Central

Shepard, Peter J.; Choi, Eun-A; Lu, Jente; Flanagan, Lisa A.; Hertel, Klemens J.; Shi, Yongsheng

2011-01-01

Alternative polyadenylation (APA) of mRNAs has emerged as an important mechanism for post-transcriptional gene regulation in higher eukaryotes. Although microarrays have recently been used to characterize APA globally, they have a number of serious limitations that prevent comprehensive and highly quantitative analysis. To better characterize APA and its regulation, we have developed a deep sequencing-based method called Poly(A) Site Sequencing (PAS-Seq) for quantitatively profiling RNA polyadenylation at the transcriptome level. PAS-Seq not only accurately and comprehensively identifies poly(A) junctions in mRNAs and noncoding RNAs, but also provides quantitative information on the relative abundance of polyadenylated RNAs. PAS-Seq analyses of human and mouse transcriptomes showed that 40%–50% of all expressed genes produce alternatively polyadenylated mRNAs. Furthermore, our study detected evolutionarily conserved polyadenylation of histone mRNAs and revealed novel features of mitochondrial RNA polyadenylation. Finally, PAS-Seq analyses of mouse embryonic stem (ES) cells, neural stem/progenitor (NSP) cells, and neurons not only identified more poly(A) sites than were found in the entire mouse EST database, but also detected significant changes in the global APA profile that lead to lengthening of 3′ untranslated regions (UTR) in many mRNAs during stem cell differentiation. Together, our PAS-Seq analyses revealed a complex landscape of RNA polyadenylation in mammalian cells and the dynamic regulation of APA during stem cell differentiation. PMID:21343387
VIPER: Visualization Pipeline for RNA-seq, a Snakemake workflow for efficient and complete RNA-seq analysis.

PubMed

Cornwell, MacIntosh; Vangala, Mahesh; Taing, Len; Herbert, Zachary; Köster, Johannes; Li, Bo; Sun, Hanfei; Li, Taiwen; Zhang, Jian; Qiu, Xintao; Pun, Matthew; Jeselsohn, Rinath; Brown, Myles; Liu, X Shirley; Long, Henry W

2018-04-12

RNA sequencing has become a ubiquitous technology used throughout life sciences as an effective method of measuring RNA abundance quantitatively in tissues and cells. The increase in use of RNA-seq technology has led to the continuous development of new tools for every step of analysis from alignment to downstream pathway analysis. However, effectively using these analysis tools in a scalable and reproducible way can be challenging, especially for non-experts. Using the workflow management system Snakemake we have developed a user friendly, fast, efficient, and comprehensive pipeline for RNA-seq analysis. VIPER (Visualization Pipeline for RNA-seq analysis) is an analysis workflow that combines some of the most popular tools to take RNA-seq analysis from raw sequencing data, through alignment and quality control, into downstream differential expression and pathway analysis. VIPER has been created in a modular fashion to allow for the rapid incorporation of new tools to expand the capabilities. This capacity has already been exploited to include very recently developed tools that explore immune infiltrate and T-cell CDR (Complementarity-Determining Regions) reconstruction abilities. The pipeline has been conveniently packaged such that minimal computational skills are required to download and install the dozens of software packages that VIPER uses. VIPER is a comprehensive solution that performs most standard RNA-seq analyses quickly and effectively with a built-in capacity for customization and expansion.
A comprehensive transcriptome assembly of Pigeonpea (Cajanus cajan L.) using sanger and second-generation sequencing platforms.

PubMed

Kudapa, Himabindu; Bharti, Arvind K; Cannon, Steven B; Farmer, Andrew D; Mulaosmanovic, Benjamin; Kramer, Robin; Bohra, Abhishek; Weeks, Nathan T; Crow, John A; Tuteja, Reetu; Shah, Trushar; Dutta, Sutapa; Gupta, Deepak K; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R; May, Gregory D; Singh, Nagendra K; Varshney, Rajeev K

2012-09-01

A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18 353 Sanger expressed sequence tags from more than 16 genotypes. The resultant transcriptome assembly, referred to as CcTA v2, comprised 21 434 transcript assembly contigs (TACs) with an N50 of 1510 bp, the largest one being ~8 kb. Of the 21 434 TACs, 16 622 (77.5%) could be mapped on to the soybean genome build 1.0.9 under fairly stringent alignment parameters. Based on knowledge of intron junctions, 10 009 primer pairs were designed from 5033 TACs for amplifying intron spanning regions (ISRs). By using in silico mapping of BAC-end-derived SSR loci of pigeonpea on the soybean genome as a reference, putative mapping positions at the chromosome level were predicted for 6284 ISR markers, covering all 11 pigeonpea chromosomes. A subset of 128 ISR markers was analyzed on a set of eight genotypes. While 116 markers were validated, 70 markers showed one to three alleles, with an average polymorphism information content (PIC) value of 0.16. In summary, the CcTA v2 transcript assembly and ISR markers will serve as a useful resource to accelerate genetic research and breeding applications in pigeonpea.
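
The N50 statistic quoted for CcTA v2 is the contig length at which, going from the longest contig downward, the cumulative length first reaches half of the total assembly length. The sketch below illustrates that standard definition under stated assumptions: the function name and the toy contig lengths are hypothetical and are not the CcTA v2 data. The last line simply recomputes the mapped fraction from the two counts given in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy lengths (bp) for illustration only, not the CcTA v2 contigs.
toy_lengths = [4000, 3000, 2000, 1510, 1510, 1200, 900, 400]
toy_n50 = n50(toy_lengths)
print(toy_n50)

# Mapped fraction recomputed from the counts in the abstract (16 622 of 21 434).
mapped_percent = round(100 * 16622 / 21434, 1)
print(mapped_percent)
```

Because N50 depends only on the length distribution, it says nothing on its own about assembly correctness, which is presumably why the abstract also reports how many contigs map to the soybean genome.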
Genome projects and the functional-genomic era.

PubMed

Sauer, Sascha; Konthur, Zoltán; Lehrach, Hans

2005-12-01

The problems we face today in public health as a result of the -- fortunately -- increasing age of people and the requirements of developing countries create an urgent need for new and innovative approaches in medicine and in agronomics. Genomic and functional genomic approaches have a great potential to at least partially solve these problems in the future. Important progress has been made by procedures to decode genomic information of humans, but also of other key organisms. The basic comprehension of genomic information (and its transfer) should now give us the possibility to pursue the next important step in life science eventually leading to a basic understanding of biological information flow; the elucidation of the function of all genes and correlative products encoded in the genome, as well as the discovery of their interactions in a molecular context and the response to environmental factors. As a result of the sequencing projects, we are now able to ask important questions about sequence variation and can start to comprehensively study the function of expressed genes on different levels such as RNA, protein or the cell in a systematic context including underlying networks. In this article we review and comment on current trends in large-scale systematic biological research. A particular emphasis is put on technology developments that can provide means to accomplish the tasks of future lines of functional genomics.
Combined hairpin-antisense compositions and methods for modulating expression

DOEpatents

Shanklin, John; Nguyen, Tam

2014-08-05

A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.
Combined hairpin-antisense compositions and methods for modulating expression

DOEpatents

Shanklin, John; Nguyen, Tam Huu

2015-11-24

A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.
Capturing the 'ome': the expanding molecular toolbox for RNA and DNA library construction.

PubMed

Boone, Morgane; De Koker, Andries; Callewaert, Nico

2018-04-06

All sequencing experiments and most functional genomics screens rely on the generation of libraries to comprehensively capture pools of targeted sequences. In the past decade especially, driven by the progress in the field of massively parallel sequencing, numerous studies have comprehensively assessed the impact of particular manipulations on library complexity and quality, and characterized the activities and specificities of several key enzymes used in library construction. Fortunately, careful protocol design and reagent choice can substantially mitigate many of these biases, and enable reliable representation of sequences in libraries. This review aims to guide the reader through the vast expanse of literature on the subject to promote informed library generation, independent of the application.
Comprehensive Analysis of Human Endogenous Retrovirus Group HERV-W Locus Transcription in Multiple Sclerosis Brain Lesions by High-Throughput Amplicon Sequencing

PubMed Central

Schmitt, Katja; Richter, Christin; Backes, Christina; Meese, Eckart; Ruprecht, Klemens

2013-01-01

Human endogenous retroviruses (HERVs) of the HERV-W group comprise hundreds of loci in the human genome. Deregulated HERV-W expression and HERV-W locus ERVWE1-encoded Syncytin-1 protein have been implicated in the pathogenesis of multiple sclerosis (MS). However, the actual transcription of HERV-W loci in the MS context has not been comprehensively analyzed. We investigated transcription of HERV-W in MS brain lesions and white matter brain tissue from healthy controls by employing next-generation amplicon sequencing of HERV-W env-specific reverse transcriptase (RT) PCR products, thus revealing transcribed HERV-W loci and the relative transcript levels of those loci. We identified more than 100 HERV-W loci that were transcribed in the human brain, with a limited number of loci being predominantly transcribed. Importantly, relative transcript levels of HERV-W loci were very similar between MS and healthy brain tissue samples, refuting deregulated transcription of HERV-W env in MS brain lesions, including the high-level-transcribed ERVWE1 locus encoding Syncytin-1. Quantitative RT-PCR likewise did not reveal differences in MS regarding HERV-W env general transcript or ERVWE1- and ERVWE2-specific transcript levels. However, we obtained evidence for interindividual differences in HERV-W transcript levels. Reporter gene assays indicated promoter activity of many HERV-W long terminal repeats (LTRs), including structurally incomplete LTRs. Our comprehensive analysis of HERV-W transcription in the human brain thus provides important information on the biology of HERV-W in MS lesions and normal human brain, implications for study design, and mechanisms by which HERV-W may (or may not) be involved in MS. PMID:24109235
A comprehensive comparison of four species of Onchidiidae provides insights on the morphological and molecular adaptations of invertebrates from shallow seas to wetlands.

PubMed

Xu, Guolv; Yang, Tiezhu; Wang, Dongfeng; Li, Jie; Liu, Xin; Wu, Xin; Shen, Heding

2018-01-01

The Onchidiidae family is ideal for studying the evolution of marine invertebrate species from sea to wetland environments. However, comparative studies of Onchidiidae species are rare. A total of 40 samples were collected from four species (10 specimens per onchidiid), and their histological and molecular differences were systematically evaluated to elucidate the morphological foundations underlying the adaptations of these species. A histological analysis was performed to compare the structures of respiratory organs (gill, lung sac, dorsal skin) among onchidiids, and transcriptome sequencing of four representative onchidiids was performed to investigate the molecular mechanisms associated with their respective habitats. Twenty-six SNP markers of Onchidium reevesii revealed some DNA polymorphisms determining visible traits. Non-muscle myosin heavy chain II (NMHC II) and myosin heavy chain (MyHC), which play essential roles in amphibian developmental processes, were found to be differentially expressed in different onchidiids and tissues. The species with higher terrestrial ability showed increased integrated expression of Os-MHC (NMHC II gene) and the MyHC gene, illustrating that the expression levels of these genes were associated with the evolutionary degree. This study provides a comprehensive analysis of the adaptations of a diverse and widespread group of invertebrates, the Onchidiidae. Some onchidiids can breathe well through gills and skin when under seawater, and some can breathe well through lung sacs and skin when in wetlands. A histological comparison of respiratory organs and the relative expression levels of two genes provided insights into the adaptations of onchidiids that allowed their transition from shallow seas to wetlands. This work provides a valuable reference and might encourage further study.
A comprehensive comparison of four species of Onchidiidae provides insights on the morphological and molecular adaptations of invertebrates from shallow seas to wetlands

PubMed Central

Wang, Dongfeng; Li, Jie; Liu, Xin; Wu, Xin

2018-01-01

The Onchidiidae family is ideal for studying the evolution of marine invertebrate species from sea to wetland environments. However, comparative studies of Onchidiidae species are rare. A total of 40 samples were collected from four species (10 specimens per onchidiid), and their histological and molecular differences were systematically evaluated to elucidate the morphological foundations underlying the adaptations of these species. A histological analysis was performed to compare the structures of respiratory organs (gill, lung sac, dorsal skin) among onchidiids, and transcriptome sequencing of four representative onchidiids was performed to investigate the molecular mechanisms associated with their respective habitats. Twenty-six SNP markers of Onchidium reevesii revealed some DNA polymorphisms determining visible traits. Non-muscle myosin heavy chain II (NMHC II) and myosin heavy chain (MyHC), which play essential roles in amphibian developmental processes, were found to be differentially expressed in different onchidiids and tissues. The species with higher terrestrial ability showed increased integrated expression of Os-MHC (NMHC II gene) and the MyHC gene, illustrating that the expression levels of these genes were associated with the evolutionary degree. This study provides a comprehensive analysis of the adaptations of a diverse and widespread group of invertebrates, the Onchidiidae. Some onchidiids can breathe well through gills and skin when under seawater, and some can breathe well through lung sacs and skin when in wetlands. A histological comparison of respiratory organs and the relative expression levels of two genes provided insights into the adaptations of onchidiids that allowed their transition from shallow seas to wetlands. This work provides a valuable reference and might encourage further study. PMID:29698429
LISTA, LISTA-HOP and LISTA-HON: a comprehensive compilation of protein encoding sequences and its associated homology databases from the yeast Saccharomyces.

PubMed Central

Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P

1994-01-01

We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. In this database each sequence has been attributed a single genetic name. In the case of duplicated sequences a simple method has been applied to distinguish between sequences of one and the same gene and non-allelic sequences of duplicated genes. If necessary, synonyms are given in the case of allelic duplicated sequences. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, the reference for the publication of the sequence, the chromosomal location as far as known, and the Swiss-Prot and EMBL accession numbers. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections, resulting in LISTA-HON and LISTA-HOP. The LISTA database can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). PMID:7937046
MIPS: a database for genomes and protein sequences.

PubMed Central

Mewes, H W; Heumann, K; Kaps, A; Mayer, K; Pfeiffer, F; Stocker, S; Frishman, D

1999-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried near Munich, Germany, develops and maintains genome oriented databases. It is commonplace that the amount of sequence data available increases rapidly, but not the capacity of qualified manual annotation at the sequence databases. Therefore, our strategy aims to cope with the data stream by the comprehensive application of analysis tools to sequences of complete genomes, the systematic classification of protein sequences and the active support of sequence analysis and functional genomics projects. This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). MIPS provides access through its WWW server (http://www.mips.biochem.mpg.de) to a spectrum of generic databases, including the above mentioned as well as a database of protein families (PROTFAM), the MITOP database, and the all-against-all FASTA database. PMID:9847138

Identification of a novel MYO7A mutation in Usher syndrome type 1.

PubMed

Cheng, Ling; Yu, Hongsong; Jiang, Yan; He, Juan; Pu, Sisi; Li, Xin; Zhang, Li

2018-01-05

Usher syndrome (USH) is an autosomal recessive disease characterized by deafness and retinitis pigmentosa. In view of the high phenotypic and genetic heterogeneity in USH, performing genetic screening with traditional methods is impractical. In the present study, we carried out targeted next-generation sequencing (NGS) to uncover the underlying gene in an USH family (2 USH patients and 15 unaffected relatives). One hundred and thirty-five genes associated with inherited retinal degeneration were selected for deep exome sequencing. Subsequently, variant analysis, Sanger validation and segregation tests were utilized to identify the disease-causing mutations in this family. All affected individuals had a classic USH type I (USH1) phenotype which included deafness, vestibular dysfunction and retinitis pigmentosa. Targeted NGS and Sanger sequencing validation suggested that USH1 patients carried an unreported splice site mutation, c.5168+1G>A, as a compound heterozygous mutation with c.6070C>T (p.R2024X) in the MYO7A gene. A functional study revealed decreased expression of the MYO7A gene in the individuals carrying heterozygous mutations. In conclusion, targeted next-generation sequencing provided a comprehensive and efficient diagnosis for USH1. This study revealed the genetic defects in the MYO7A gene and expanded the spectrum of clinical phenotypes associated with USH1 mutations.
Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data

PubMed Central

Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P

2018-01-01

Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets. PMID:29618048
Evaluation of the impact of RNA preservation methods of spiders for de novo transcriptome assembly.

PubMed

Kono, Nobuaki; Nakamura, Hiroyuki; Ito, Yusuke; Tomita, Masaru; Arakawa, Kazuharu

2016-05-01

With advances in high-throughput sequencing technologies, de novo transcriptome sequencing and assembly has become a cost-effective method to obtain comprehensive genetic information of a species of interest, especially in nonmodel species with large genomes such as spiders. However, high-quality RNA is essential for successful sequencing, and sample preservation conditions require careful consideration for the effective storage of field-collected samples. To this end, we report a streamlined feasibility study of various storage conditions and their effects on de novo transcriptome assembly results. The storage parameters considered include temperatures ranging from room temperature to -80°C; preservatives, including ethanol, RNAlater, TRIzol and RNAlater-ICE; and sample submersion states. As a result, intact RNA was extracted and assembly was successful when samples were preserved at low temperatures regardless of the type of preservative used. The assemblies as well as the gene expression profiles were shown to be robust to RNA degradation, when 30 million 150-bp paired-end reads are obtained. The parameters for sample storage, RNA extraction, library preparation, sequencing and in silico assembly considered in this work provide a guideline for the study of field-collected samples of spiders. © 2015 John Wiley & Sons Ltd.
Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing.

PubMed

Legendre, Matthieu; Santini, Sébastien; Rico, Alain; Abergel, Chantal; Claverie, Jean-Michel

2011-03-04

Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb-genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. We now report a much deeper analysis using the SOLiD™ technology combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 Million reads), and a complete genome re-sequencing (45.3 Million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.
Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data.

PubMed

Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P

2018-03-01

Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets.
[Genome-scale sequence data processing and epigenetic analysis of DNA methylation].

PubMed

Wang, Ting-Zhang; Shan, Gao; Xu, Jian-Hong; Xue, Qing-Zhong

2013-06-01

A recently developed approach for detecting cytosine DNA methylation (mC) and profiling DNA methylation at the genome scale is BS-Seq, which is based on bisulfite conversion of genomic DNA combined with next-generation sequencing. The method can not only provide insight into the differences in genome-scale DNA methylation among different organisms, but also reveal the conservation of DNA methylation in all contexts and the nucleotide preference for different genomic regions, including genes, exons, and repetitive DNA sequences. It will be helpful for understanding the epigenetic impacts of cytosine DNA methylation on the regulation of gene expression and on maintaining the silence of repetitive sequences, such as transposable elements. In this paper, we introduce the preprocessing steps for DNA methylation data, by which cytosine (C) and guanine (G) in the reference sequence are converted to thymine (T) and adenine (A), respectively, and cytosine in reads is converted to thymine. We also comprehensively review the main content of DNA methylation analysis on the genomic scale: (1) cytosine methylation in different sequence contexts; (2) the distribution of genomic methylcytosine; (3) DNA methylation context and the preference for nucleotides; (4) DNA-protein interaction sites of DNA methylation; (5) the degree of cytosine methylation in the different structural elements of genes. DNA methylation analysis provides a powerful tool for studying the epigenome in humans and other species and for studying gene-environment interactions, and lays the theoretical basis for further development of disease diagnostics and therapeutics in humans.
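The preprocessing step described in this abstract (in silico conversion of the reference and of the reads) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the exact-match lookup that stands in for a real aligner are all assumptions made for illustration.

```python
def convert_reference(seq: str) -> tuple[str, str]:
    """Return two converted copies of a reference sequence.

    First copy: every C replaced by T (forward-strand conversion).
    Second copy: every G replaced by A (the same conversion seen from
    the opposite strand, written in forward-strand coordinates).
    """
    seq = seq.upper()
    return seq.replace("C", "T"), seq.replace("G", "A")


def convert_read(read: str) -> str:
    """Replace every C in a read by T so it can match the C-to-T reference."""
    return read.upper().replace("C", "T")


def call_methylation(reference: str, read: str, offset: int) -> list[tuple[int, bool]]:
    """Compare the ORIGINAL read with the ORIGINAL reference at `offset`.

    For each reference C covered by the read, report (position, methylated):
    a retained C means the cytosine was protected from bisulfite conversion
    (methylated); a T means it was converted (unmethylated).
    """
    calls = []
    for i, base in enumerate(read.upper()):
        pos = offset + i
        if reference[pos].upper() == "C" and base in "CT":
            calls.append((pos, base == "C"))
    return calls


# Hypothetical toy data: an 8-base reference and one bisulfite-treated read.
reference = "ACGTCCGA"
read = "GTTCG"

c_to_t_ref, g_to_a_ref = convert_reference(reference)
# Stand-in for alignment: exact match of the converted read in the converted reference.
offset = c_to_t_ref.find(convert_read(read))
calls = call_methylation(reference, read, offset)
print(offset, calls)
```

Converting both the reference and the reads removes the C/T mismatches that bisulfite treatment introduces, so the read can be placed first; methylation is then called by going back to the unconverted sequences at the placed position.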
RISC RNA sequencing for context-specific identification of in vivo microRNA targets.

PubMed

Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W

2011-01-07

MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. To develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1645 mRNAs consistently targeted to mouse cardiac RISCs. We used this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing "seed" sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context and is applicable to any tissue and any disease state.
Personalized Oncology Through Integrative High-Throughput Sequencing: A Pilot Study

PubMed Central

Roychowdhury, Sameek; Iyer, Matthew K.; Robinson, Dan R.; Lonigro, Robert J.; Wu, Yi-Mi; Cao, Xuhong; Kalyana-Sundaram, Shanker; Sam, Lee; Balbin, O. Alejandro; Quist, Michael J.; Barrette, Terrence; Everett, Jessica; Siddiqui, Javed; Kunju, Lakshmi P.; Navone, Nora; Araujo, John C.; Troncoso, Patricia; Logothetis, Christopher J.; Innis, Jeffrey W.; Smith, David C.; Lao, Christopher D.; Kim, Scott Y.; Roberts, J. Scott; Gruber, Stephen B.; Pienta, Kenneth J.; Talpaz, Moshe; Chinnaiyan, Arul M.

2012-01-01

Individual cancers harbor a set of genetic aberrations that can be informative for identifying rational therapies currently available or in clinical trials. We implemented a pilot study to explore the practical challenges of applying high-throughput sequencing in clinical oncology. We enrolled patients with advanced or refractory cancer who were eligible for clinical trials. For each patient, we performed whole-genome sequencing of the tumor, targeted whole-exome sequencing of tumor and normal DNA, and transcriptome sequencing (RNA-Seq) of the tumor to identify potentially informative mutations in a clinically relevant time frame of 3 to 4 weeks. With this approach, we detected several classes of cancer mutations including structural rearrangements, copy number alterations, point mutations, and gene expression alterations. A multidisciplinary Sequencing Tumor Board (STB) deliberated on the clinical interpretation of the sequencing results obtained. We tested our sequencing strategy on human prostate cancer xenografts. Next, we enrolled two patients into the clinical protocol and were able to review the results at our STB within 24 days of biopsy. The first patient had metastatic colorectal cancer in which we identified somatic point mutations in NRAS, TP53, AURKA, FAS, and MYH11, plus amplification and overexpression of cyclin-dependent kinase 8 (CDK8). The second patient had malignant melanoma, in which we identified a somatic point mutation in HRAS and a structural rearrangement affecting CDKN2C. The STB identified the CDK8 amplification and Ras mutation as providing a rationale for clinical trials with CDK inhibitors or MEK (mitogen-activated or extracellular signal–regulated protein kinase kinase) and PI3K (phosphatidylinositol 3-kinase) inhibitors, respectively. Integrative high-throughput sequencing of patients with advanced cancer generates a comprehensive, individual mutational landscape to facilitate biomarker-driven clinical trials in oncology. PMID:22133722
Transcriptome sequence analysis of an ornamental plant, Ananas comosus var. bracteatus, revealed the potential unigenes involved in terpenoid and phenylpropanoid biosynthesis.

PubMed

Ma, Jun; Kanakala, S; He, Yehua; Zhang, Junli; Zhong, Xiaolan

2015-01-01

Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in its growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies. The Ananas comosus var. bracteatus transcriptome was sequenced by the Illumina paired-end sequencing technology. We obtained a total of 23.5 million high quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. Of the 41,052 unigenes of Ananas comosus var. bracteatus, 23,275 unigenes were annotated in the NCBI non-redundant protein database and 23,134 unigenes were annotated in the Swiss-Prot database. Out of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered to be unique to Ananas comosus var. bracteatus. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis. The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge, this is the first report on the de novo transcriptome sequencing of Ananas comosus var. bracteatus. Unigenes obtained in this study may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus.
Transcriptome Sequence Analysis of an Ornamental Plant, Ananas comosus var. bracteatus, Revealed the Potential Unigenes Involved in Terpenoid and Phenylpropanoid Biosynthesis

PubMed Central

Ma, Jun; Kanakala, S.; He, Yehua; Zhang, Junli; Zhong, Xiaolan

2015-01-01

Background: Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in its growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies. Results: The Ananas comosus var. bracteatus transcriptome was sequenced by the Illumina paired-end sequencing technology. We obtained a total of 23.5 million high quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. Of the 41,052 unigenes of Ananas comosus var. bracteatus, 23,275 unigenes were annotated in the NCBI non-redundant protein database and 23,134 unigenes were annotated in the Swiss-Prot database. Out of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered to be unique to Ananas comosus var. bracteatus. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis. Conclusion: The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge, this is the first report on the de novo transcriptome sequencing of Ananas comosus var. bracteatus. Unigenes obtained in this study may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus. PMID:25769053
The Physcomitrella patens gene atlas project: large-scale RNA-seq based expression data.

PubMed

Perroud, Pierre-François; Haas, Fabian B; Hiss, Manuel; Ullrich, Kristian K; Alboresi, Alessandro; Amirebrahimi, Mojgan; Barry, Kerrie; Bassi, Roberto; Bonhomme, Sandrine; Chen, Haodong; Coates, Juliet C; Fujita, Tomomichi; Guyon-Debast, Anouchka; Lang, Daniel; Lin, Junyan; Lipzen, Anna; Nogué, Fabien; Oliver, Melvin J; Ponce de León, Inés; Quatrano, Ralph S; Rameau, Catherine; Reiss, Bernd; Reski, Ralf; Ricca, Mariana; Saidi, Younousse; Sun, Ning; Szövényi, Péter; Sreedasyam, Avinash; Grimwood, Jane; Stacey, Gary; Schmutz, Jeremy; Rensing, Stefan A

2018-07-01

High-throughput RNA sequencing (RNA-seq) has recently become the method of choice to define and analyze transcriptomes. For the model moss Physcomitrella patens, although this method has been used to help analyze specific perturbations, no overall reference dataset has yet been established. In the framework of the Gene Atlas project, the Joint Genome Institute selected P. patens as a flagship genome, opening the way to generate the first comprehensive transcriptome dataset for this moss. The first round of sequencing described here is composed of 99 independent libraries spanning 34 different developmental stages and conditions. Upon dataset quality control and processing through read mapping, 28 509 of the 34 361 v3.3 gene models (83%) were detected to be expressed across the samples. Differentially expressed genes (DEGs) were calculated across the dataset to permit perturbation comparisons between conditions. The analysis of the three most distinct and abundant P. patens growth stages - protonema, gametophore and sporophyte - allowed us to define both general transcriptional patterns and stage-specific transcripts. As an example of variation of physico-chemical growth conditions, we detail here the impact of ammonium supplementation under standard growth conditions on the protonemal transcriptome. Finally, the cooperative nature of this project allowed us to analyze inter-laboratory variation, as 13 different laboratories around the world provided samples. We compare differences in the replication of experiments in a single laboratory and between different laboratories. © 2018 The Authors The Plant Journal © 2018 John Wiley & Sons Ltd.
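The reported detection rate (28 509 of 34 361 gene models, 83%) can be reproduced with a short Python sketch. The detection rule used here, at least `min_count` mapped reads in at least one library, is an assumption made for illustration, since the abstract does not state the criterion; the toy gene names and counts are likewise hypothetical.

```python
def detected_genes(counts: dict[str, list[int]], min_count: int = 1) -> set[str]:
    """Return genes with at least `min_count` reads in at least one library.

    counts maps a gene identifier to its per-library read counts.
    The threshold rule is an illustrative assumption, not the project's criterion.
    """
    return {gene for gene, per_lib in counts.items()
            if any(c >= min_count for c in per_lib)}


# Hypothetical toy count table: three genes, three libraries.
toy_counts = {
    "geneA": [0, 12, 3],
    "geneB": [0, 0, 0],
    "geneC": [5, 0, 0],
}
expressed = detected_genes(toy_counts)
print(sorted(expressed), f"{len(expressed) / len(toy_counts):.0%}")

# Figures taken from the abstract: detected gene models out of all v3.3 gene models.
reported_fraction = 28_509 / 34_361
print(f"{reported_fraction:.1%}")  # 83.0%
```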
Multiple hot-deck imputation for network inference from RNA sequencing data.

PubMed

Imbert, Alyssa; Valsesia, Armand; Le Gall, Caroline; Armenise, Claudia; Lefebvre, Gregory; Gourraud, Pierre-Antoine; Viguerie, Nathalie; Villa-Vialaneix, Nathalie

2018-05-15

Network inference provides a global view of the relations existing between gene expression in a given transcriptomic experiment (often only for a restricted list of chosen genes). However, it is still a challenging problem: even if the cost of sequencing techniques has decreased over the last years, the number of samples in a given experiment is still (very) small compared to the number of genes. We propose a method to increase the reliability of the inference when RNA-seq expression data have been measured together with an auxiliary dataset that can provide external information on gene expression similarity between samples. Our statistical approach, hd-MI, is based on imputation for samples without available RNA-seq data that are considered as missing data but are observed on the secondary dataset. hd-MI can improve the reliability of the inference for missing rates up to 30% and provides more stable networks with a smaller number of false positive edges. From a biological point of view, hd-MI was also found relevant to infer networks from RNA-seq data acquired in adipose tissue during a nutritional intervention in obese individuals. In these networks, novel links between genes were highlighted, as well as an improved comparability between the two steps of the nutritional intervention. Software and sample data are available as an R package, RNAseqNet, that can be downloaded from the Comprehensive R Archive Network (CRAN). Contact: alyssa.imbert@inra.fr or nathalie.villa-vialaneix@inra.fr. Supplementary data are available at Bioinformatics online.
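The general idea of multiple hot-deck imputation described above can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the hd-MI algorithm or the RNAseqNet API: the donor pool is taken as the `k` nearest samples by Euclidean distance on the auxiliary data, and all function names, parameters, and toy values are hypothetical.

```python
import math
import random


def nearest_donors(aux, recipient, donors, k):
    """Return the k donor samples closest to `recipient` in the auxiliary data."""
    return sorted(donors, key=lambda d: math.dist(aux[recipient], aux[d]))[:k]


def multiple_hot_deck(expr, aux, k=2, n_imputations=5, seed=0):
    """Build several completed copies of an expression table.

    expr: sample -> expression values, only for samples with RNA-seq data
    aux:  sample -> auxiliary measurements, available for every sample
    Each missing sample receives the expression profile of a donor drawn
    at random from its k nearest donors (assumed similarity rule).
    """
    rng = random.Random(seed)
    donors = sorted(expr)
    missing = sorted(s for s in aux if s not in expr)
    completed = []
    for _ in range(n_imputations):
        filled = {s: list(v) for s, v in expr.items()}
        for s in missing:
            pool = nearest_donors(aux, s, donors, k)
            filled[s] = list(expr[rng.choice(pool)])
        completed.append(filled)
    return completed


# Hypothetical toy data: sample "s4" has auxiliary data but no RNA-seq data.
expr = {"s1": [10, 0], "s2": [12, 1], "s3": [0, 50]}
aux = {"s1": [1.0], "s2": [1.1], "s3": [9.0], "s4": [1.05]}

datasets = multiple_hot_deck(expr, aux)
print([d["s4"] for d in datasets])
```

In a multiple-imputation workflow of this kind, a network would typically be inferred on each completed dataset and the results combined, for example by keeping edges that recur across imputations; that combination step is not shown here.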
Analysis of de novo sequencing and transcriptome assembly and lignocellulolytic enzymes gene expression of Coriolopsis gallica HTC.

PubMed

Chen, Yuehong; Cao, Qinghua; Tao, Xiang; Shao, Huanhuan; Zhang, Kun; Zhang, Yizheng; Tan, Xuemei

2017-03-01

White-rot basidiomycete Coriolopsis gallica HTC is one of the main biodegraders of poplar. In our previous study, we showed the strong capacity of C. gallica HTC to degrade lignocellulose. In this study, equal amounts of total RNA from C. gallica HTC cultures grown in different conditions were pooled together. Illumina paired-end RNA sequencing was performed, and 13.2 million 90-bp paired-end reads were generated. We chose the Merged Assembly of Oases (MAO) data set for the subsequent BLAST searches and gene ontology analyses. The reads were assembled de novo into 28,034 transcripts (≥100 bp) using the combined assembly strategy MAO. The transcripts were annotated using Blast2GO. In all, 18,810 transcripts (≥100 bp) achieved BLASTX hits, of which 7048 transcripts had GO terms and 2074 had ECs. The expression levels of 11 lignocellulolytic enzyme genes from the assembled C. gallica HTC transcriptome were detected by real-time quantitative polymerase chain reaction. The results showed that expression levels of these genes were affected by carbon source and nitrogen source at the level of transcription. The current abundant transcriptome data allowed the identification of many new transcripts in C. gallica HTC. Data provided here represent the most comprehensive and integrated genomic resources for cloning and identifying genes of interest from C. gallica HTC. Characterization of the C. gallica HTC transcriptome provides an effective tool to understand mechanisms underlying cellular and molecular functions of C. gallica HTC.
Transcriptome Analysis of Flower Sex Differentiation in Jatropha curcas L. Using RNA Sequencing.

PubMed

Xu, Gang; Huang, Jian; Yang, Yong; Yao, Yin-an

2016-01-01

Jatropha curcas is thought to be a promising biofuel material, but its yield is restricted by a low ratio of instaminate/staminate flowers (1/10-1/30). Furthermore, valuable information about flower sex differentiation in this plant is scarce. To explore the mechanism of this process in J. curcas, transcriptome profiling of flower development was carried out, and certain genes related to sex differentiation were obtained through digital gene expression analysis of flower buds from different phases of floral development. After Illumina sequencing and clustering, 57,962 unigenes were identified. A total of 47,423 unigenes were annotated, with 85 being related to carpel and stamen differentiation, 126 involved in carpel and stamen development, and 592 functioning in the later development stage for the maturation of staminate or instaminate flowers. Annotation of these genes provided comprehensive information regarding the sex differentiation of flowers, including the signaling system, hormone biosynthesis and regulation, transcription regulation and ubiquitin-mediated proteolysis. A further expression pattern analysis of 15 sex-related genes using quantitative real-time PCR revealed that gibberellin-regulated protein 4-like protein and AMP-activated protein kinase are associated with stamen differentiation, whereas auxin response factor 6-like protein, AGAMOUS-like 20 protein, CLAVATA1, RING-H2 finger protein ATL3J, auxin-induced protein 22D, and r2r3-myb transcription factor contribute to embryo sac development in the instaminate flower. Cytokinin oxidase, Unigene28, auxin repressed-like protein ARP1, gibberellin receptor protein GID1 and auxin-induced protein X10A are involved in both stages mentioned above. In addition to its function in the differentiation and development of the stamens, the gibberellin signaling pathway also functions in embryo sac development for the instaminate flower. The auxin signaling pathway also participates in both stamen development and embryo sac development. Our transcriptome data provide a comprehensive gene expression profile for flower sex differentiation in Jatropha curcas, as well as new clues and information for further study in this field.
Transcriptome Analysis of Flower Sex Differentiation in Jatropha curcas L. Using RNA Sequencing

PubMed Central

Xu, Gang; Huang, Jian; Yang, Yong; Yao, Yin-an

2016-01-01

Background: Jatropha curcas is thought to be a promising biofuel material, but its yield is restricted by a low ratio of instaminate/staminate flowers (1/10-1/30). Furthermore, valuable information about flower sex differentiation in this plant is scarce. To explore the mechanism of this process in J. curcas, transcriptome profiling of flower development was carried out, and certain genes related to sex differentiation were obtained through digital gene expression analysis of flower buds from different phases of floral development. Results: After Illumina sequencing and clustering, 57,962 unigenes were identified. A total of 47,423 unigenes were annotated, with 85 being related to carpel and stamen differentiation, 126 involved in carpel and stamen development, and 592 functioning in the later development stage for the maturation of staminate or instaminate flowers. Annotation of these genes provided comprehensive information regarding the sex differentiation of flowers, including the signaling system, hormone biosynthesis and regulation, transcription regulation and ubiquitin-mediated proteolysis. A further expression pattern analysis of 15 sex-related genes using quantitative real-time PCR revealed that gibberellin-regulated protein 4-like protein and AMP-activated protein kinase are associated with stamen differentiation, whereas auxin response factor 6-like protein, AGAMOUS-like 20 protein, CLAVATA1, RING-H2 finger protein ATL3J, auxin-induced protein 22D, and r2r3-myb transcription factor contribute to embryo sac development in the instaminate flower. Cytokinin oxidase, Unigene28, auxin repressed-like protein ARP1, gibberellin receptor protein GID1 and auxin-induced protein X10A are involved in both stages mentioned above. In addition to its function in the differentiation and development of the stamens, the gibberellin signaling pathway also functions in embryo sac development for the instaminate flower. The auxin signaling pathway also participates in both stamen development and embryo sac development. Conclusions: Our transcriptome data provide a comprehensive gene expression profile for flower sex differentiation in Jatropha curcas, as well as new clues and information for further study in this field. PMID:26848843
Long Non-Coding RNA and Alternative Splicing Modulations in Parkinson's Leukocytes Identified by RNA Sequencing

PubMed Central

Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona

2014-01-01

The continuously prolonged human lifespan is accompanied by an increase in the incidence of neurodegenerative diseases, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers that is rapidly becoming commonplace. However, there is a lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domain and microRNA binding annotations. We applied the novel workflow to RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post-Deep Brain Stimulation (DBS) treatment and compared them to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD-regulated FUS and HNRNP A/B included prion-like domain regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected control brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia nigra, compared to controls. This novel workflow allows deep multi-level inspection of RNA-Seq datasets and provides a comprehensive new resource for understanding disease transcriptome modifications in PD and other neurodegenerative diseases. PMID:24651478
Comprehensive Identification of Long Non-coding RNAs in Purified Cell Types from the Brain Reveals Functional LncRNA in OPC Fate Determination.

PubMed

Dong, Xiaomin; Chen, Kenian; Cuevas-Diaz Duran, Raquel; You, Yanan; Sloan, Steven A; Zhang, Ye; Zong, Shan; Cao, Qilin; Barres, Ben A; Wu, Jia Qian

2015-12-01

Long non-coding RNAs (lncRNAs) (> 200 bp) play crucial roles in transcriptional regulation during numerous biological processes. However, it is challenging to comprehensively identify lncRNAs, because they are often expressed at low levels and with more cell-type specificity than are protein-coding genes. In the present study, we performed ab initio transcriptome reconstruction using eight purified cell populations from mouse cortex and detected more than 5000 lncRNAs. Predicting the functions of lncRNAs using cell-type specific data revealed their potential functional roles in Central Nervous System (CNS) development. We performed motif searches in ENCODE DNase I digital footprint data and Mouse ENCODE promoters to infer transcription factor (TF) occupancy. By integrating TF binding and cell-type specific transcriptomic data, we constructed a novel framework that is useful for systematically identifying lncRNAs that are potentially essential for brain cell fate determination. Based on this integrative analysis, we identified lncRNAs that are regulated during Oligodendrocyte Precursor Cell (OPC) differentiation from Neural Stem Cells (NSCs) and that are likely to be involved in oligodendrogenesis. The top candidate, lnc-OPC, shows highly specific expression in OPCs and remarkable sequence conservation among placental mammals. Interestingly, lnc-OPC is significantly up-regulated in glial progenitors from experimental autoimmune encephalomyelitis (EAE) mouse models compared to wild-type mice. OLIG2-binding sites in the upstream regulatory region of lnc-OPC were identified by ChIP (chromatin immunoprecipitation)-Sequencing and validated by luciferase assays. Loss-of-function experiments confirmed that lnc-OPC plays a functional role in OPC genesis. Overall, our results substantiated the role of lncRNA in OPC fate determination and provided an unprecedented data source for future functional investigations in CNS cell types. We present our datasets and analysis results via the interactive genome browser at our laboratory website that is freely accessible to the research community. This is the first lncRNA expression database of collective populations of glia, vascular cells, and neurons. We anticipate that these studies will advance the knowledge of this major class of non-coding genes and their potential roles in neurological development and diseases.
σI from Bacillus subtilis: Impact on Gene Expression and Characterization of σI-dependent Transcription that Requires New Types of Promoters with Extended -35 and -10 Elements.

PubMed

Ramaniuk, Olga; Převorovský, Martin; Pospíšil, Jiří; Vítovská, Dragana; Kofroňová, Olga; Benada, Oldřich; Schwarz, Marek; Šanderová, Hana; Hnilicová, Jarmila; Krásný, Libor

2018-06-18

σI from Bacillus subtilis is a σ factor associating with RNA polymerase (RNAP) that was previously implicated in adaptation of the cell to elevated temperature. Here we provide a comprehensive characterization of this transcriptional regulator. By RNA-seq of wt and σI-null strains at 37°C and 52°C we identified ∼130 genes affected by the absence of σI. Further analysis revealed that the majority of these genes were affected by σI indirectly. The σI regulon, i.e., the genes directly regulated by σI, consists of 16 genes of which eight (the dhb and yku operons) are involved in iron metabolism. The involvement of σI in iron metabolism was confirmed phenotypically. Next, we set up an in vitro transcription system and defined and experimentally validated the promoter sequence logo that, in addition to -35 and -10 regions, also contains extended -35 and -10 motifs. Thus, σI-dependent promoters are relatively information-rich in comparison with most other promoters. In summary, this study supplies information about the least explored σ factor from the industrially important model organism B. subtilis. Importance: In bacteria, σ factors are essential for transcription initiation. Knowledge about their regulons (i.e., genes transcribed from promoters dependent on these σ factors) is the key for understanding how bacteria cope with the changing environment and could be instrumental for biotechnologically motivated rewiring of gene expression. Here, we characterize the σI regulon from the industrially important model Gram-positive bacterium Bacillus subtilis. We reveal that σI affects expression of ∼130 genes, of which 16 are directly regulated by σI, including genes encoding proteins involved in iron homeostasis. Detailed analysis of promoter elements then identifies unique sequences important for σI-dependent transcription. This study thus provides a comprehensive view on this underexplored component of the B. subtilis transcription machinery. Copyright © 2018 American Society for Microbiology.
Genome-wide proteomics analysis on longissimus muscles in Qinchuan beef cattle.

PubMed

He, Hua; Chen, Si; Liang, Wei; Liu, Xiaolin

2017-04-01

To gain further insight into the molecular mechanism of bovine muscle development, we combined mass spectrometry characterization of proteins with Illumina deep sequencing of RNAs obtained from bovine longissimus muscle (LD) at prenatal and postnatal stages. For the proteomic study, each group of LD proteins was extracted and labeled using isobaric tags for relative and absolute quantitation (iTRAQ) method. Among the 1321 proteins identified from six samples, 390 proteins were differentially expressed in embryos at day 135 post-fertilization (Emb135d) vs. 30-month-old adult cattle (Emb135d vs. 30M) samples. Gene Ontology, Cluster of Orthologous Groups and Kyoto Encyclopedia of Genes and Genomes analyses were further conducted to better understand the different functions. Furthermore, we analyzed the relationship between transcript and protein regulation between samples by direct comparison of expression levels from transcriptomic and iTRAQ-based proteomics. Association results indicated that 1295 of 1321 proteins could be mapped to transcriptome sequencing data. This study provides the most comprehensive, targeted survey of bovine LD proteins to date and has shown the power of combining transcriptomic and proteomic approaches to provide molecular insights for understanding the developmental characteristics in bovine muscle, and even in other mammals. © 2016 Stichting International Foundation for Animal Genetics.
Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses

PubMed Central

Liu, Ruijie; Holik, Aliaksei Z.; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E.; Asselin-Labat, Marie-Liesse; Smyth, Gordon K.; Ritchie, Matthew E.

2015-01-01

Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean–variance relationship of the log-counts-per-million using ‘voom’. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source ‘limma’ package. PMID:25925576
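The weighting idea in the abstract above can be illustrated with a minimal Python sketch. It is not the limma implementation: it assumes that a sample's weight is the reciprocal of its estimated variance factor and that sample-level and observation-level weights are combined by elementwise multiplication. All function names and numbers are hypothetical.

```python
def sample_weights_from_variance_factors(variance_factors):
    """Convert per-sample variance factors to weights.

    Assumption for this sketch: weight = 1 / variance factor, so a sample
    that is four times as variable as a typical one gets weight 0.25.
    """
    return [1.0 / v for v in variance_factors]


def combine_weights(observation_weights, sample_weights):
    """Multiply each observation-level weight (rows = genes, columns = samples)
    by the weight of the sample it came from."""
    return [
        [w * s for w, s in zip(row, sample_weights)]
        for row in observation_weights
    ]


def weighted_mean(values, weights):
    """Weighted average; down-weighted samples pull the estimate less."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


if __name__ == "__main__":
    # Hypothetical data: one gene, three samples, the third sample is noisy.
    sample_w = sample_weights_from_variance_factors([1.0, 1.0, 4.0])
    combined = combine_weights([[2.0, 2.0, 2.0]], sample_w)
    print(combined)
    print(weighted_mean([5.0, 5.0, 9.0], combined[0]))
```

The noisy sample stays in the analysis but contributes less to the estimate, which is the compromise the abstract describes between discarding a sample and treating it like the others.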

NCBI Epigenomics: a new public resource for exploring epigenomic data sets

PubMed Central

Fingerman, Ian M.; McDaniel, Lee; Zhang, Xuan; Ratzat, Walter; Hassan, Tarek; Jiang, Zhifang; Cohen, Robert F.; Schuler, Gregory D.

2011-01-01

The Epigenomics database at the National Center for Biotechnology Information (NCBI) is a new resource that has been created to serve as a comprehensive public resource for whole-genome epigenetic data sets (www.ncbi.nlm.nih.gov/epigenomics). Epigenetics is the study of stable and heritable changes in gene expression that occur independently of the primary DNA sequence. Epigenetic mechanisms include post-translational modifications of histones, DNA methylation, chromatin conformation and non-coding RNAs. It has been observed that misregulation of epigenetic processes has been associated with human disease. We have constructed the new resource by selecting the subset of epigenetics-specific data from general-purpose archives, such as the Gene Expression Omnibus, and Sequence Read Archives, and then subjecting them to further review, annotation and reorganization. Raw data is processed and mapped to genomic coordinates to generate ‘tracks’ that are a visual representation of the data. These data tracks can be viewed using popular genome browsers or downloaded for local analysis. The Epigenomics resource also provides the user with a unique interface that allows for intuitive browsing and searching of data sets based on biological attributes. Currently, there are 69 studies, 337 samples and over 1100 data tracks from five well-studied species that are viewable and downloadable in Epigenomics. PMID:21075792
Identification of the group IIa WRKY subfamily and the functional analysis of GhWRKY17 in upland cotton (Gossypium hirsutum L.).

PubMed

Gu, Lijiao; Li, Libei; Wei, Hengling; Wang, Hantao; Su, Junji; Guo, Yaning; Yu, Shuxun

2018-01-01

WRKY transcription factors play important roles in plant defense, stress response, leaf senescence, and plant growth and development. Previous studies have revealed the important roles of the group IIa GhWRKY genes in cotton. To comprehensively analyze the group IIa GhWRKY genes in upland cotton, we identified 15 candidate group IIa GhWRKY genes in the Gossypium hirsutum genome. The phylogenetic tree, intron-exon structure, motif prediction and Ka/Ks analyses indicated that most group IIa GhWRKY genes shared high similarity and conservation and underwent purifying selection during evolution. In addition, we detected the expression patterns of several group IIa GhWRKY genes in individual tissues as well as during leaf senescence using public RNA sequencing data and real-time quantitative PCR. To better understand the functions of group IIa GhWRKYs in cotton, GhWRKY17 (KF669857) was isolated from upland cotton, and its sequence alignment, promoter cis-acting elements and subcellular localization were characterized. Moreover, the over-expression of GhWRKY17 in Arabidopsis up-regulated the senescence-associated genes AtWRKY53, AtSAG12 and AtSAG13, enhancing the plant's susceptibility to leaf senescence. These findings lay the foundation for further analysis and study of the functions of WRKY genes in cotton.
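The Ka/Ks reasoning in the abstract above follows the usual convention that a ratio below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. A minimal Python sketch of that classification follows; the `neutral_band` tolerance and the example values are assumptions made for illustration, not figures from the study.

```python
def ka_ks_ratio(ka, ks):
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    return ka / ks


def classify_selection(ka, ks, neutral_band=0.05):
    """Label a gene pair by its Ka/Ks ratio.

    neutral_band is a hypothetical tolerance around 1.0 used only in this
    sketch; real analyses typically rely on statistical tests instead.
    """
    ratio = ka_ks_ratio(ka, ks)
    if ratio < 1.0 - neutral_band:
        return "purifying"
    if ratio > 1.0 + neutral_band:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical Ka and Ks values for three gene pairs.
    for ka, ks in [(0.02, 0.20), (0.20, 0.20), (0.30, 0.20)]:
        print(ka, ks, classify_selection(ka, ks))
```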
An integrated global regulatory network of hematopoietic precursor cell self-renewal and differentiation.

PubMed

You, Yanan; Cuevas-Diaz Duran, Raquel; Jiang, Lihua; Dong, Xiaomin; Zong, Shan; Snyder, Michael; Wu, Jia Qian

2018-06-12

Systematic study of the regulatory mechanisms of Hematopoietic Stem Cell and Progenitor Cell (HSPC) self-renewal is fundamentally important for understanding hematopoiesis and for manipulating HSPCs for therapeutic purposes. Previously, we have characterized gene expression and identified important transcription factors (TFs) regulating the switch between self-renewal and differentiation in a multipotent Hematopoietic Progenitor Cell (HPC) line, EML (Erythroid, Myeloid, and Lymphoid) cells. Herein, we report binding maps for additional TFs (SOX4 and STAT3) by using chromatin immunoprecipitation (ChIP)-Sequencing, to address the underlying mechanisms regulating self-renewal properties of lineage-CD34+ subpopulation (Lin-CD34+ EML cells). Furthermore, we applied the Assay for Transposase Accessible Chromatin (ATAC)-Sequencing to globally identify the open chromatin regions associated with TF binding in the self-renewing Lin-CD34+ EML cells. Mass spectrometry (MS) was also used to quantify protein relative expression levels. Finally, by integrating the protein-protein interaction database, we built an expanded transcriptional regulatory and interaction network. We found that MAPK (Mitogen-activated protein kinase) pathway and TGF-β/SMAD signaling pathway components were highly enriched among the binding targets of these TFs in Lin-CD34+ EML cells. The present study integrates regulatory information at multiple levels to paint a more comprehensive picture of the HSPC self-renewal mechanisms.
The rubber tree genome shows expansion of gene family associated with rubber biosynthesis.

PubMed

Lau, Nyok-Sean; Makita, Yuko; Kawashima, Mika; Taylor, Todd D; Kondo, Shinji; Othman, Ahmad Sofiman; Shu-Chien, Alexander Chong; Matsui, Minami

2016-06-24

Hevea brasiliensis Muell. Arg, a member of the family Euphorbiaceae, is the sole natural resource exploited for commercial production of high-quality natural rubber. The properties of natural rubber latex are almost irreplaceable by synthetic counterparts for many industrial applications. A paucity of knowledge on the molecular mechanisms of rubber biosynthesis in high yield traits still persists. Here we report the comprehensive genome-wide analysis of the widely planted H. brasiliensis clone, RRIM 600. The genome was assembled based on ~155-fold combined coverage with Illumina and PacBio sequence data and has a total length of 1.55 Gb with 72.5% comprising repetitive DNA sequences. A total of 84,440 high-confidence protein-coding genes were predicted. Comparative genomic analysis revealed strong synteny between H. brasiliensis and other Euphorbiaceae genomes. Our data suggest that H. brasiliensis's capacity to produce high levels of latex can be attributed to the expansion of rubber biosynthesis-related genes in its genome and the high expression of these genes in latex. Using cap analysis gene expression data, we illustrate the tissue-specific transcription profiles of rubber biosynthesis-related genes, revealing alternative means of transcriptional regulation. Our study adds to the understanding of H. brasiliensis biology and provides valuable genomic resources for future agronomic-related improvement of the rubber tree.
The BIG Data Center: from deposition to integration to translation.

PubMed

2017-01-04

Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Analysis of expressed sequence tags from the Ulva prolifera (Chlorophyta)

NASA Astrophysics Data System (ADS)

Niu, Jianfeng; Hu, Haiyan; Hu, Songnian; Wang, Guangce; Peng, Guang; Sun, Song

2010-01-01

In 2008, a green tide broke out before the sailing competition of the 29th Olympic Games in Qingdao. The causative species was determined to be Enteromorpha prolifera (Ulva prolifera O. F. Müller), a familiar green macroalga along the coastline of China. Rapid accumulation of a large biomass of floating U. prolifera prompted research on different aspects of this species. In this study, we constructed a nonnormalized cDNA library from the thalli of U. prolifera and acquired 10 072 high-quality expressed sequence tags (ESTs). These ESTs were assembled into 3 519 nonredundant gene groups, including 1 446 clusters and 2 073 singletons. After annotation with the nr database, a large number of genes were found to be related to chloroplast and ribosomal proteins. GO functional classification showed that 1 418 ESTs participated in photosynthesis and 1 359 ESTs were responsible for the generation of precursor metabolites and energy. In addition, rather comprehensive carbon fixation pathways were found in U. prolifera using KEGG. Some stress-related and signal transduction-related genes were also found in this study. All the evidence indicated that U. prolifera has the substance and energy foundation for intense photosynthesis and rapid proliferation. Phylogenetic analysis of cytochrome c oxidase subunit I revealed that this green-tide causative species is most closely affiliated to Pseudendoclonium akinetum (Ulvophyceae).
Transcriptome analysis using next generation sequencing reveals molecular signatures of diabetic retinopathy and efficacy of candidate drugs.

PubMed

Kandpal, Raj P; Rajasimha, Harsha K; Brooks, Matthew J; Nellissery, Jacob; Wan, Jun; Qian, Jiang; Kern, Timothy S; Swaroop, Anand

2012-01-01

To define gene expression changes associated with diabetic retinopathy in a mouse model using next generation sequencing, and to utilize transcriptome signatures to assess molecular pathways by which pharmacological agents inhibit diabetic retinopathy. We applied a high throughput RNA sequencing (RNA-seq) strategy using Illumina GAIIx to characterize the entire retinal transcriptome from nondiabetic and from streptozotocin-treated mice 32 weeks after induction of diabetes. Some of the diabetic mice were treated with inhibitors of receptor for advanced glycation endproducts (RAGE) and p38 mitogen activated protein (MAP) kinase, which have previously been shown to inhibit diabetic retinopathy in rodent models. The transcripts and alternatively spliced variants were determined in all experimental groups. Next generation sequencing-based RNA-seq profiles provided comprehensive signatures of transcripts that are altered in early stages of diabetic retinopathy. These transcripts encoded proteins involved in distinct yet physiologically relevant disease-associated pathways such as inflammation, microvasculature formation, apoptosis, glucose metabolism, Wnt signaling, xenobiotic metabolism, and photoreceptor biology. Significant upregulation of crystallin transcripts was observed in diabetic animals, and the diabetes-induced upregulation of these transcripts was inhibited in diabetic animals treated with inhibitors of either RAGE or p38 MAP kinase. These two therapies also showed dissimilar regulation of some subsets of transcripts that included alternatively spliced versions of arrestin, neutral sphingomyelinase activation associated factor (Nsmaf), SH3-domain GRB2-like interacting protein 1 (Sgip1), and axin. Diabetes alters many transcripts in the retina, and two therapies that inhibit the vascular pathology similarly inhibit a portion of these changes, pointing to possible molecular mechanisms for their beneficial effects. These therapies also changed the abundance of various alternatively spliced versions of signaling transcripts, suggesting a possible role of alternative splicing in disease etiology. Our studies clearly demonstrate RNA-seq as a comprehensive strategy for identifying disease-specific transcripts, and for determining comparative profiles of molecular changes mediated by candidate drugs.
RoBuST: an integrated genomics resource for the root and bulb crop families Apiaceae and Alliaceae

PubMed Central

2010-01-01

Background: Root and bulb vegetables (RBV) include carrots, celeriac (root celery), parsnips (Apiaceae), onions, garlic, and leek (Alliaceae), all food crops grown globally and consumed worldwide. Few data analysis platforms are currently available where data collection, annotation and integration initiatives are focused on RBV plant groups. Scientists working on RBV include breeders, geneticists, taxonomists, plant pathologists, and plant physiologists who use genomic data for a wide range of activities including the development of molecular genetic maps, delineation of taxonomic relationships, and investigation of molecular aspects of gene expression in biochemical pathways and disease responses. With genomic data coming from such diverse areas of plant science, availability of a community resource focused on these RBV data types would be of great interest to this scientific community. Description: The RoBuST database has been developed to initiate a platform for collecting and organizing genomic information useful for RBV researchers. The current release of RoBuST contains genomics data for 294 Alliaceae and 816 Apiaceae plant species and has the following features: (1) comprehensive sequence annotations of 3663 genes, 5959 RNAs, 22,723 ESTs and 11,438 regulatory sequence elements from Apiaceae and Alliaceae plant families; (2) graphical tools for visualization and analysis of sequence data; (3) access to traits, biosynthetic pathways, genetic linkage maps and molecular taxonomy data associated with Alliaceae and Apiaceae plants; and (4) comprehensive plant splice signal repository of 659,369 splice signals collected from 6015 plant species for comparative analysis of plant splicing patterns. Conclusions: RoBuST, available at http://robust.genome.com, provides an integrated platform for researchers to effortlessly explore and analyze genomic data associated with root and bulb vegetables. PMID:20691054
Survey of 800+ data sets from human tissue and body fluid reveals xenomiRs are likely artifacts.

PubMed

Kang, Wenjing; Bang-Berthelsen, Claus Heiner; Holm, Anja; Houben, Anna J S; Müller, Anne Holt; Thymann, Thomas; Pociot, Flemming; Estivill, Xavier; Friedländer, Marc R

2017-04-01

miRNAs are small 22-nucleotide RNAs that can post-transcriptionally regulate gene expression. It has been proposed that dietary plant miRNAs can enter the human bloodstream and regulate host transcripts; however, these findings have been widely disputed. We here conduct the first comprehensive meta-study in the field, surveying the presence and abundances of cross-species miRNAs (xenomiRs) in 824 sequencing data sets from various human tissues and body fluids. We find that xenomiRs are commonly present in tissues (17%) and body fluids (69%); however, the abundances are low, comprising 0.001% of host human miRNA counts. Further, we do not detect a significant enrichment of xenomiRs in sequencing data originating from tissues and body fluids that are exposed to dietary intake (such as liver). Likewise, there is no significant depletion of xenomiRs in tissues and body fluids that are relatively separated from the main bloodstream (such as brain and cerebro-spinal fluids). Interestingly, the majority (81%) of body fluid xenomiRs stem from rodents, which are a rare human dietary contribution but common laboratory animals. Body fluid samples from the same studies tend to group together when clustered by xenomiR compositions, suggesting technical batch effects. Last, we performed carefully designed and controlled animal feeding studies, in which we detected no transfer of plant miRNAs into rat blood, or bovine milk sequences into piglet blood. In summary, our comprehensive computational and experimental results indicate that xenomiRs originate from technical artifacts rather than dietary intake. © 2017 Kang et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
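The two summary quantities reported in the abstract above (how often xenomiRs are present, and how abundant they are relative to host miRNAs) can be computed as in the following minimal Python sketch. The function names and read counts are hypothetical and are not taken from the study's data.

```python
def xenomir_percent(xeno_counts, host_counts):
    """xenomiR reads as a percentage of host miRNA reads in one data set."""
    if host_counts <= 0:
        raise ValueError("host miRNA count must be positive")
    return 100.0 * xeno_counts / host_counts


def prevalence_percent(datasets):
    """Percentage of data sets with at least one xenomiR read.

    datasets is a list of (xenomiR reads, host miRNA reads) pairs.
    """
    if not datasets:
        raise ValueError("at least one data set is required")
    with_xeno = sum(1 for xeno, _host in datasets if xeno > 0)
    return 100.0 * with_xeno / len(datasets)


if __name__ == "__main__":
    # Hypothetical counts for four data sets.
    datasets = [(0, 100000), (1, 100000), (3, 250000), (0, 80000)]
    print(prevalence_percent(datasets))
    for xeno, host in datasets:
        print(xenomir_percent(xeno, host))
```

Keeping the two numbers separate reflects the abstract's point: xenomiRs can be present in many data sets while still making up a tiny share of total miRNA reads.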
Toward the human cellular microRNAome.

PubMed

McCall, Matthew N; Kim, Min-Sik; Adil, Mohammed; Patil, Arun H; Lu, Yin; Mitchell, Christopher J; Leal-Rojas, Pamela; Xu, Jinchong; Kumar, Manoj; Dawson, Valina L; Dawson, Ted M; Baras, Alexander S; Rosenberg, Avi Z; Arking, Dan E; Burns, Kathleen H; Pandey, Akhilesh; Halushka, Marc K

2017-10-01

MicroRNAs are short RNAs that serve as regulators of gene expression and are essential components of normal development as well as modulators of disease. MicroRNAs generally act cell-autonomously, and thus their localization to specific cell types is needed to guide our understanding of microRNA activity. Current tissue-level data have caused considerable confusion, and comprehensive cell-level data do not yet exist. Here, we establish the landscape of human cell-specific microRNA expression. This project evaluated 8 billion small RNA-seq reads from 46 primary cell types, 42 cancer or immortalized cell lines, and 26 tissues. It identified both specific and ubiquitous patterns of expression that strongly correlate with adjacent superenhancer activity. Analysis of unaligned RNA reads uncovered 207 unknown minor strand (passenger) microRNAs of known microRNA loci and 495 novel putative microRNA loci. Although cancer cell lines generally recapitulated the expression patterns of matched primary cells, their isomiR sequence families exhibited increased disorder, suggesting DROSHA- and DICER1-dependent microRNA processing variability. Cell-specific patterns of microRNA expression were used to de-convolute variable cellular composition of colon and adipose tissue samples, highlighting one use of these cell-specific microRNA expression data. Characterization of cellular microRNA expression across a wide variety of cell types provides a new understanding of this critical regulatory RNA species. © 2017 McCall et al.; Published by Cold Spring Harbor Laboratory Press.
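The deconvolution step mentioned in the abstract above can be illustrated with a deliberately simplified Python sketch. It assumes a tissue is a linear mixture of only two cell types with known expression profiles and estimates the mixing fraction by least squares. The study's actual method is not described in the abstract, and every name and value below is hypothetical.

```python
def estimate_fraction(mixture, profile_a, profile_b):
    """Estimate the fraction p of cell type A in a two-cell-type mixture.

    Model assumed for this sketch: mixture = p * profile_a + (1 - p) * profile_b.
    The least-squares solution is
        p = sum((m - b) * (a - b)) / sum((a - b) ** 2),
    clipped to the interval [0, 1].
    """
    diffs = [a - b for a, b in zip(profile_a, profile_b)]
    denom = sum(d * d for d in diffs)
    if denom == 0:
        raise ValueError("profiles are identical; the fraction is not identifiable")
    numer = sum((m - b) * d for m, b, d in zip(mixture, profile_b, diffs))
    return min(1.0, max(0.0, numer / denom))


if __name__ == "__main__":
    # Hypothetical expression of three microRNAs in two pure cell types.
    cell_a = [10.0, 0.0, 5.0]
    cell_b = [0.0, 10.0, 5.0]
    # Simulated tissue sample: 25% cell type A, 75% cell type B.
    tissue = [0.25 * a + 0.75 * b for a, b in zip(cell_a, cell_b)]
    print(estimate_fraction(tissue, cell_a, cell_b))
```

With more than two cell types, the same idea is usually posed as a constrained (for example non-negative) least-squares problem over all reference profiles.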
The RNA-Seq-based high resolution gene expression atlas of chickpea (Cicer arietinum L.) reveals dynamic spatio-temporal changes associated with growth and development.

PubMed

Kudapa, Himabindu; Garg, Vanika; Chitikineni, Annapurna; Varshney, Rajeev K

2018-04-10

Chickpea is one of the world's largest cultivated food legumes and is an excellent source of high-quality protein for the human diet. Plant growth and development are controlled by programmed expression of a suite of genes at the given time, stage, and tissue. To understand how the underlying genome sequence translates into specific plant phenotypes at key developmental stages, information on gene expression patterns is crucial. Here, we present a comprehensive Cicer arietinum Gene Expression Atlas (CaGEA) across different plant developmental stages and organs covering the entire life cycle of chickpea. ICC 4958, one of the widely used drought-tolerant cultivars, was used to generate RNA-Seq data from 27 samples at 5 major developmental stages of the plant. A total of 816 million raw reads were generated and of these, 794 million filtered reads after quality control (QC) were subjected to downstream analysis. A total of 15,947 unique differentially expressed genes across different pairwise tissue combinations were identified. Significant differences in gene expression patterns contributing to the processes of flowering, nodulation, and seed and root development were inferred in this study. Furthermore, differentially expressed candidate genes from the "QTL-hotspot" region associated with drought stress response in chickpea were validated. © 2018 The Authors. Plant, Cell & Environment Published by John Wiley & Sons Ltd.
Transcriptome analysis of Petunia axillaris flowers reveals genes involved in morphological differentiation and metabolite transport

PubMed Central

Amano, Ikuko; Kitajima, Sakihito; Suzuki, Hideyuki; Koeduka, Takao

2018-01-01

The biosynthesis of plant secondary metabolites is associated with morphological and metabolic differentiation. As a consequence, gene expression profiles can change drastically, and primary and secondary metabolites, including intermediate and end-products, move dynamically within and between cells. However, little is known about the molecular mechanisms underlying differentiation and transport mechanisms. In this study, we performed a transcriptome analysis of Petunia axillaris subsp. parodii, which produces various volatiles in its corolla limbs and emits metabolites to attract pollinators. RNA-sequencing from leaves, buds, and limbs identified 53,243 unigenes. Analysis of differentially expressed genes, combined with gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses, showed that many biological processes were highly enriched in limbs. These included catabolic processes and signaling pathways of hormones, such as gibberellins, and metabolic pathways, including phenylpropanoids and fatty acids. Moreover, we identified five transporter genes that showed high expression in limbs, and we performed spatiotemporal expression analyses and homology searches to infer their putative functions. Our systematic analysis provides comprehensive transcriptomic information regarding morphological differentiation and metabolite transport in the Petunia flower and lays the foundation for establishing the specific mechanisms that control secondary metabolite biosynthesis in plants. PMID:29902274
Functional analysis of regulatory single-nucleotide polymorphisms.

PubMed

Pampín, Sandra; Rodríguez-Rey, José C

2007-04-01

The identification of regulatory polymorphisms has become a key problem in human genetics. In the past few years there has been a conceptual change in the way in which regulatory single-nucleotide polymorphisms are studied. We review the new approaches and discuss how gene expression studies can contribute to a better knowledge of the genetics of common diseases. New techniques for the association of single-nucleotide polymorphisms with changes in gene expression have been recently developed. This, together with a more comprehensive use of the old in-vitro methods, has produced a great amount of genetic information. When added to current databases, it will help to design better tools for the detection of regulatory single-nucleotide polymorphisms. The identification of functional regulatory single-nucleotide polymorphisms cannot be done by the simple inspection of DNA sequence. In-vivo techniques, based on primer-extension, and the more recently developed 'haploChIP' allow the association of gene variants to changes in gene expression. Gene expression analysis by conventional in-vitro techniques is the only way to identify the functional consequences of regulatory single-nucleotide polymorphisms. The amount of information produced in the last few years will help to refine the tools for the future analysis of regulatory gene variants.
Transcriptome and selected metabolite analyses reveal points of sugar metabolism in jackfruit (Artocarpus heterophyllus Lam.).

PubMed

Hu, Lisong; Wu, Gang; Hao, Chaoyun; Yu, Huan; Tan, Lehe

2016-07-01

Artocarpus heterophyllus Lam., commonly known as jackfruit, produces the largest tree-borne fruit known thus far. The edible part of the fruit develops from the perianths, and contains many sugar-derived compounds. However, its sugar metabolism is poorly understood. A fruit perianth transcriptome was sequenced on an Illumina HiSeq 2500 platform, producing 32,459 unigenes with an average length of 1345 nt. Sugar metabolism was characterized by comparing expression patterns of genes related to sugar metabolism and evaluating correlations with enzyme activity and sugar accumulation during fruit perianth development. During early development, high expression levels of acid invertases and corresponding enzyme activities were responsible for the rapid utilization of imported sucrose for fruit growth. The differential expression of starch metabolism-related genes and the corresponding enzyme activities accounted for the accumulation of starch before fruit ripening and its decrease during ripening. Sucrose accumulated during ripening, when the expression levels of genes for sucrose synthesis were elevated and high enzyme activity was observed. The comprehensive transcriptome analysis presents fundamental information on sugar metabolism and will be a useful reference for further research on fruit perianth development in jackfruit. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Adipocyte Long-Noncoding RNA Transcriptome Analysis of Obese Mice Identified Lnc-Leptin, Which Regulates Leptin.

PubMed

Lo, Kinyui Alice; Huang, Shiqi; Walet, Arcinas Camille Esther; Zhang, Zhi-Chun; Leow, Melvin Khee-Shing; Liu, Meihui; Sun, Lei

2018-06-01

Obesity induces profound transcriptome changes in adipocytes, and recent evidence suggests that long-noncoding RNAs (lncRNAs) play key roles in this process. We performed a comprehensive transcriptome study by RNA sequencing in adipocytes isolated from interscapular brown, inguinal, and epididymal white adipose tissue in diet-induced obese mice. The analysis revealed a set of obesity-dysregulated lncRNAs, many of which exhibit dynamic changes in the fed versus fasted state, potentially serving as novel molecular markers of adipose energy status. Among the most prominent lncRNAs is Lnc-leptin, which is transcribed from an enhancer region upstream of leptin (Lep). Expression of Lnc-leptin is sensitive to insulin and closely correlates to Lep expression across diverse pathophysiological conditions. Functionally, induction of Lnc-leptin is essential for adipogenesis, and its presence is required for the maintenance of Lep expression in vitro and in vivo. Direct interaction was detected between DNA loci of Lnc-leptin and Lep in mature adipocytes, which diminished upon Lnc-leptin knockdown. Our study establishes Lnc-leptin as a new regulator of Lep. © 2018 by the American Diabetes Association.
Genomic organization, evolution, and expression of photoprotein and opsin genes in Mnemiopsis leidyi: a new view of ctenophore photocytes.

PubMed

Schnitzler, Christine E; Pang, Kevin; Powers, Meghan L; Reitzel, Adam M; Ryan, Joseph F; Simmons, David; Tada, Takashi; Park, Morgan; Gupta, Jyoti; Brooks, Shelise Y; Blakesley, Robert W; Yokoyama, Shozo; Haddock, Steven Hd; Martindale, Mark Q; Baxevanis, Andreas D

2012-12-21

Calcium-activated photoproteins are luciferase variants found in photocyte cells of bioluminescent jellyfish (Phylum Cnidaria) and comb jellies (Phylum Ctenophora). The complete genomic sequence from the ctenophore Mnemiopsis leidyi, a representative of the earliest branch of animals that emit light, provided an opportunity to examine the genome of an organism that uses this class of luciferase for bioluminescence and to look for genes involved in light reception. To determine when photoprotein genes first arose, we examined the genomic sequence from other early-branching taxa. We combined our genomic survey with gene trees, developmental expression patterns, and functional protein assays of photoproteins and opsins to provide a comprehensive view of light production and light reception in Mnemiopsis. The Mnemiopsis genome has 10 full-length photoprotein genes situated within two genomic clusters with high sequence conservation that are maintained due to strong purifying selection and concerted evolution. Photoprotein-like genes were also identified in the genomes of the non-luminescent sponge Amphimedon queenslandica and the non-luminescent cnidarian Nematostella vectensis, and phylogenomic analysis demonstrated that photoprotein genes arose at the base of all animals. Photoprotein gene expression in Mnemiopsis embryos begins during gastrulation in migrating precursors to photocytes and persists throughout development in the canals where photocytes reside. We identified three putative opsin genes in the Mnemiopsis genome and show that they do not group with well-known bilaterian opsin subfamilies. Interestingly, photoprotein transcripts are co-expressed with two of the putative opsins in developing photocytes. Opsin expression is also seen in the apical sensory organ. 
We present evidence that one opsin functions as a photopigment in vitro, absorbing light at wavelengths that overlap with peak photoprotein light emission, raising the hypothesis that light production and light reception may be functionally connected in ctenophore photocytes. We also present genomic evidence of a complete ciliary phototransduction cascade in Mnemiopsis. This study elucidates the genomic organization, evolutionary history, and developmental expression of photoprotein and opsin genes in the ctenophore Mnemiopsis leidyi, introduces a novel dual role for ctenophore photocytes in both bioluminescence and phototransduction, and raises the possibility that light production and light reception are linked in this early-branching non-bilaterian animal.
Genomic organization, evolution, and expression of photoprotein and opsin genes in Mnemiopsis leidyi: a new view of ctenophore photocytes

PubMed Central

2012-01-01

Background: Calcium-activated photoproteins are luciferase variants found in photocyte cells of bioluminescent jellyfish (Phylum Cnidaria) and comb jellies (Phylum Ctenophora). The complete genomic sequence from the ctenophore Mnemiopsis leidyi, a representative of the earliest branch of animals that emit light, provided an opportunity to examine the genome of an organism that uses this class of luciferase for bioluminescence and to look for genes involved in light reception. To determine when photoprotein genes first arose, we examined the genomic sequence from other early-branching taxa. We combined our genomic survey with gene trees, developmental expression patterns, and functional protein assays of photoproteins and opsins to provide a comprehensive view of light production and light reception in Mnemiopsis. Results: The Mnemiopsis genome has 10 full-length photoprotein genes situated within two genomic clusters with high sequence conservation that are maintained due to strong purifying selection and concerted evolution. Photoprotein-like genes were also identified in the genomes of the non-luminescent sponge Amphimedon queenslandica and the non-luminescent cnidarian Nematostella vectensis, and phylogenomic analysis demonstrated that photoprotein genes arose at the base of all animals. Photoprotein gene expression in Mnemiopsis embryos begins during gastrulation in migrating precursors to photocytes and persists throughout development in the canals where photocytes reside. We identified three putative opsin genes in the Mnemiopsis genome and show that they do not group with well-known bilaterian opsin subfamilies. Interestingly, photoprotein transcripts are co-expressed with two of the putative opsins in developing photocytes. Opsin expression is also seen in the apical sensory organ.
We present evidence that one opsin functions as a photopigment in vitro, absorbing light at wavelengths that overlap with peak photoprotein light emission, raising the hypothesis that light production and light reception may be functionally connected in ctenophore photocytes. We also present genomic evidence of a complete ciliary phototransduction cascade in Mnemiopsis. Conclusions: This study elucidates the genomic organization, evolutionary history, and developmental expression of photoprotein and opsin genes in the ctenophore Mnemiopsis leidyi, introduces a novel dual role for ctenophore photocytes in both bioluminescence and phototransduction, and raises the possibility that light production and light reception are linked in this early-branching non-bilaterian animal. PMID:23259493
Comparable contributions of structural-functional constraints and expression level to the rate of protein sequence evolution

PubMed Central

Wolf, Maxim Y; Wolf, Yuri I; Koonin, Eugene V

2008-01-01

Background: Proteins show a broad range of evolutionary rates. Understanding the factors that are responsible for the characteristic rate of evolution of a given protein arguably is one of the major goals of evolutionary biology. A long-standing general assumption used to be that the evolution rate is, primarily, determined by the specific functional constraints that affect the given protein. These constraints were traditionally thought to depend both on the specific features of the protein's structure and its biological role. The advent of systems biology brought about new types of data, such as expression level and protein-protein interactions, and unexpectedly, a variety of correlations between protein evolution rate and these variables have been observed. The strongest connections by far were repeatedly seen between protein sequence evolution rate and the expression level of the respective gene. It has been hypothesized that this link is due to the selection for the robustness of the protein structure to mistranslation-induced misfolding that is particularly important for highly expressed proteins and is the dominant determinant of the sequence evolution rate. Results: This work is an attempt to assess the relative contributions of protein domain structure and function, on the one hand, and expression level on the other hand, to the rate of sequence evolution. To this end, we performed a genome-wide analysis of the effect of the fusion of a pair of domains in multidomain proteins on the difference in the domain-specific evolutionary rates. The mistranslation-induced misfolding hypothesis would predict that, within multidomain proteins, fused domains, on average, should evolve at substantially closer rates than the same domains in different proteins because, within a multidomain protein, all domains are translated at the same rate.
We performed a comprehensive comparison of the evolutionary rates of mammalian and plant protein domains that are either joined in multidomain proteins or contained in distinct proteins. Substantial homogenization of evolutionary rates in multidomain proteins was, indeed, observed in both animals and plants, although highly significant differences between domain-specific rates remained. The contributions of the translation rate, as determined by the effect of the fusion of a pair of domains within a multidomain protein, and intrinsic, domain-specific structural-functional constraints appear to be comparable in magnitude. Conclusion: Fusion of domains in a multidomain protein results in substantial homogenization of the domain-specific evolutionary rates but significant differences between domain-specific evolution rates remain. Thus, the rate of translation and intrinsic structural-functional constraints both exert sizable and comparable effects on sequence evolution. Reviewers: This article was reviewed by Sergei Maslov, Dennis Vitkup, Claus Wilke (nominated by Orly Alter), and Allan Drummond (nominated by Joel Bader). For the full reviews, please go to the Reviewers' Reports section. PMID:18840284
PomBase: a comprehensive online resource for fission yeast

PubMed Central

Wood, Valerie; Harris, Midori A.; McDowall, Mark D.; Rutherford, Kim; Vaughan, Brendan W.; Staines, Daniel M.; Aslett, Martin; Lock, Antonia; Bähler, Jürg; Kersey, Paul J.; Oliver, Stephen G.

2012-01-01

PomBase (www.pombase.org) is a new model organism database established to provide access to comprehensive, accurate, and up-to-date molecular data and biological information for the fission yeast Schizosaccharomyces pombe to effectively support both exploratory and hypothesis-driven research. PomBase encompasses annotation of genomic sequence and features, comprehensive manual literature curation and genome-wide data sets, and supports sophisticated user-defined queries. The implementation of PomBase integrates a Chado relational database that houses manually curated data with Ensembl software that supports sequence-based annotation and web access. PomBase will provide user-friendly tools to promote curation by experts within the fission yeast community. This will make a key contribution to shaping its content and ensuring its comprehensiveness and long-term relevance. PMID:22039153

Epigenetics of prostate cancer.

PubMed

McKee, Tawnya C; Tricoli, James V

2015-01-01

The introduction of novel technologies that can be applied to the investigation of the molecular underpinnings of human cancer has allowed for new insights into the mechanisms associated with tumor development and progression. They have also advanced the diagnosis, prognosis and treatment of cancer. These technologies include microarray and other analysis methods for the generation of large-scale gene expression data on both mRNA and miRNA, next-generation DNA sequencing technologies utilizing a number of platforms to perform whole genome, whole exome, or targeted DNA sequencing to determine somatic mutational differences and gene rearrangements, and a variety of proteomic analysis platforms including liquid chromatography/mass spectrometry (LC/MS) analysis to survey alterations in protein profiles in tumors. One other important advancement has been our current ability to survey the methylome of human tumors in a comprehensive fashion through the use of sequence-based and array-based methylation analysis (Bock et al., Nat Biotechnol 28:1106-1114, 2010; Harris et al., Nat Biotechnol 28:1097-1105, 2010). The focus of this chapter is to present and discuss the evidence for key genes involved in prostate tumor development, progression, or resistance to therapy that are regulated by methylation-induced silencing.
Age-Related Gene Expression Differences in Monocytes from Human Neonates, Young Adults, and Older Adults

PubMed Central

Tong, Ann-Jay; Kollmann, Tobias R.; Smale, Stephen T.

2015-01-01

A variety of age-related differences in the innate and adaptive immune systems have been proposed to contribute to the increased susceptibility to infection of human neonates and older adults. The emergence of RNA sequencing (RNA-seq) provides an opportunity to obtain an unbiased, comprehensive, and quantitative view of gene expression differences in defined cell types from different age groups. An examination of ex vivo human monocyte responses to lipopolysaccharide stimulation or Listeria monocytogenes infection by RNA-seq revealed extensive similarities between neonates, young adults, and older adults, with an unexpectedly small number of genes exhibiting statistically significant age-dependent differences. By examining the differentially induced genes in the context of transcription factor binding motifs and RNA-seq data sets from mutant mouse strains, a previously described deficiency in interferon response factor-3 activity could be implicated in most of the differences between newborns and young adults. Contrary to these observations, older adults exhibited elevated expression of inflammatory genes at baseline, yet the responses following stimulation correlated more closely with those observed in younger adults. Notably, major differences in the expression of constitutively expressed genes were not observed, suggesting that the age-related differences are driven by environmental influences rather than cell-autonomous differences in monocyte development. PMID:26147648
Comprehensive analysis of fibroblast growth factor receptor expression patterns during chick forelimb development.

PubMed

Sheeba, Caroline J; Andrade, Raquel P; Duprez, Delphine; Palmeirim, Isabel

2010-01-01

Specific interactions between fibroblast growth factors (Fgf1-22) and their tyrosine kinase receptors (FgfR1-4) activate different signalling pathways that are responsible for the biological processes in which Fgf signalling is implicated during embryonic development. In the chick, several Fgf ligands (Fgf2, 4, 8, 9, 10, 12, 13 and 18) and the four FgfRs (FgfR 1, 2, 3 and 4) have been reported to be expressed in the developing limb. The precise spatial and temporal expression of these transcripts is important to guide the limb bud to develop into a wing/leg. In this paper, we present a detailed and systematic analysis of the expression patterns of FgfR1, 2, 3 and 4 throughout chick wing development, by in situ hybridisation on whole mounts and sections. Moreover, we characterize for the first time the different isoforms of FGFR1-3 by analysing their differential expression in limb ectoderm and mesodermal tissues, using RT-PCR and in situ hybridisation on sections. Finally, isoform-specific sequences for FgfR1IIIb, FgfR1IIIc, FgfR3IIIb and FgfR3IIIc were determined and deposited in GenBank with the following accession numbers: GU053725, GU065444, GU053726, GU065445, respectively.
RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application.

PubMed

D'Antonio, Mattia; D'Onorio De Meo, Paolo; Pallocca, Matteo; Picardi, Ernesto; D'Erchia, Anna Maria; Calogero, Raffaele A; Castrignanò, Tiziana; Pesole, Graziano

2015-01-01

The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulatory pathways makes RNA-Seq one of the most complex fields of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Through a user-friendly web interface, the RAP workflow can be suitably customized by the user and it is automatically executed on our cloud computing environment. This strategy allows access to bioinformatics tools and computational resources without specific bioinformatics and IT skills.
RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user's needs.
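The "complete but modular" workflow described in this abstract can be pictured as an ordered list of optional stages, each backed by the tools the abstract names. The sketch below is only an illustration of that idea: the stage labels and the `plan` function are hypothetical and do not reflect RAP's actual interface or configuration format, which the abstract does not describe.

```python
# Stage labels are hypothetical; tool names are those listed in the abstract.
WORKFLOW = [
    ("quality_check", ["FastQC", "NGS QC Toolkit"]),
    ("quantification", ["Tophat", "Cufflinks", "HTSeq"]),
    ("alternative_splicing", ["SpliceTrap"]),
    ("chimeric_transcripts", ["ChimeraScan"]),
    ("differential_analysis", ["Cuffdiff2", "DESeq"]),
]


def plan(selected):
    """Return the selected stages in pipeline order, rejecting unknown names."""
    known = {name for name, _ in WORKFLOW}
    unknown = set(selected) - known
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    return [(name, tools) for name, tools in WORKFLOW if name in selected]


# A user who only wants QC and differential expression:
for stage, tools in plan({"differential_analysis", "quality_check"}):
    print(stage, "->", ", ".join(tools))
```

Keeping the stages in one ordered list means a customized run always executes in a valid order, regardless of the order in which the user selects modules.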
TCL1A, a Novel Transcription Factor and a Coregulator of Nuclear Factor κB p65: Single Nucleotide Polymorphism and Estrogen Dependence.

PubMed

Ho, Ming-Fen; Lummertz da Rocha, Edroaldo; Zhang, Cheng; Ingle, James N; Goss, Paul E; Shepherd, Lois E; Kubo, Michiaki; Wang, Liewei; Li, Hu; Weinshilboum, Richard M

2018-06-01

T-cell leukemia 1A (TCL1A) single-nucleotide polymorphisms (SNPs) have been associated with aromatase inhibitor-induced musculoskeletal adverse events. We previously demonstrated that TCL1A is inducible by estradiol (E2) and plays a critical role in the regulation of cytokines, chemokines, and Toll-like receptors in a TCL1A SNP genotype and estrogen-dependent fashion. Furthermore, TCL1A SNP-dependent expression phenotypes can be "reversed" by exposure to selective estrogen receptor modulators such as 4-hydroxytamoxifen (4OH-TAM). The present study was designed to comprehensively characterize the role of TCL1A in transcriptional regulation across the genome by performing RNA sequencing (RNA-seq) and chromatin immunoprecipitation sequencing (ChIP-seq) assays with lymphoblastoid cell lines. RNA-seq identified 357 genes that were regulated in a TCL1A SNP- and E2-dependent fashion with expression patterns that were 4OH-TAM reversible. ChIP-seq for the same cells identified 57 TCL1A binding sites that could be regulated by E2 in a SNP-dependent fashion. Even more striking, nuclear factor-κB (NF-κB) p65 bound to those same DNA regions. In summary, TCL1A is a novel transcription factor with expression that is regulated in a SNP- and E2-dependent fashion, a pattern of expression that can be reversed by 4OH-TAM. Integrated RNA-seq and ChIP-seq results suggest that TCL1A also acts as a transcriptional coregulator with NF-κB p65, an important immune system transcription factor. Copyright © 2018 by The American Society for Pharmacology and Experimental Therapeutics.
The Listening and Reading Comprehension (LARC) Program....Experiential Based Sequential Training.

ERIC Educational Resources Information Center

Blumenstyk, Holly; And Others

The LARC (Listening and Reading Comprehension) Program, an experiential based story grammar approach to listening and reading comprehension, is described, and a pilot study of its effectiveness with communication handicapped children is reviewed. The LARC framework translates children's own recent experiences into sequenced story episodes which are…
Comprehensive molecular diagnosis of Epstein-Barr virus-associated lymphoproliferative diseases using next-generation sequencing.

PubMed

Ono, Shintaro; Nakayama, Manabu; Kanegane, Hirokazu; Hoshino, Akihiro; Shimodera, Saeko; Shibata, Hirofumi; Fujino, Hisanori; Fujino, Takahiro; Yunomae, Yuta; Okano, Tsubasa; Yamashita, Motoi; Yasumi, Takahiro; Izawa, Kazushi; Takagi, Masatoshi; Imai, Kohsuke; Zhang, Kejian; Marsh, Rebecca; Picard, Capucine; Latour, Sylvain; Ohara, Osamu; Morio, Tomohiro

2018-05-18

Epstein-Barr virus (EBV) is associated with several life-threatening diseases, such as lymphoproliferative disease (LPD), particularly in immunocompromised hosts. Some categories of primary immunodeficiency diseases (PIDs) including X-linked lymphoproliferative syndrome (XLP), are characterized by susceptibility and vulnerability to EBV infection. The number of genetically defined PIDs is rapidly increasing, and clinical genetic testing plays an important role in establishing a definitive diagnosis. Whole-exome sequencing is performed for diagnosing rare genetic diseases, but is both expensive and time-consuming. Low-cost, high-throughput gene analysis systems are thus necessary. We developed a comprehensive molecular diagnostic method using a two-step tailed polymerase chain reaction (PCR) and a next-generation sequencing (NGS) platform to detect mutations in 23 candidate genes responsible for XLP or XLP-like diseases. Samples from 19 patients suspected of having EBV-associated LPD were used in this comprehensive molecular diagnosis. Causative gene mutations (involving PRF1 and SH2D1A) were detected in two of the 19 patients studied. This comprehensive diagnosis method effectively detected mutations in all coding exons of 23 genes with sufficient read numbers for each amplicon. This comprehensive molecular diagnostic method using PCR and NGS provides a rapid, accurate, low-cost diagnosis for patients with XLP or XLP-like diseases.
Comprehensive gene expression analysis of canine invasive urothelial bladder carcinoma by RNA-Seq.

PubMed

Maeda, Shingo; Tomiyasu, Hirotaka; Tsuboi, Masaya; Inoue, Akiko; Ishihara, Genki; Uchikai, Takao; Chambers, James K; Uchida, Kazuyuki; Yonezawa, Tomohiro; Matsuki, Naoaki

2018-04-27

Invasive urothelial carcinoma (iUC) is a major cause of death in humans, and approximately 165,000 individuals succumb to this cancer annually worldwide. Comparative oncology using relevant animal models is necessary to improve our understanding of progression, diagnosis, and treatment of iUC. Companion canines are a preferred animal model of iUC due to spontaneous tumor development and similarity to human disease in terms of histopathology, metastatic behavior, and treatment response. However, the comprehensive molecular characterization of canine iUC is not well documented. In this study, we performed transcriptome analysis of tissue samples from canine iUC and normal bladders using an RNA sequencing (RNA-Seq) approach to identify key molecular pathways in canine iUC. Total RNA was extracted from bladder tissues of 11 dogs with iUC and five healthy dogs, and RNA-Seq was conducted. Ingenuity Pathway Analysis (IPA) was used to assign differentially expressed genes to known upstream regulators and functional networks. Differential gene expression analysis of the RNA-Seq data revealed 2531 differentially expressed genes, comprising 1007 upregulated and 1524 downregulated genes, in canine iUC. IPA revealed that the most activated upstream regulator was PTGER2 (encoding the prostaglandin E2 receptor EP2), which is consistent with the therapeutic efficiency of cyclooxygenase inhibitors in canine iUC. Similar to human iUC, canine iUC exhibited upregulated ERBB2 and downregulated TP53 pathways. Biological functions associated with cancer, cell proliferation, and leukocyte migration were predicted to be activated, while muscle functions were predicted to be inhibited, indicating muscle-invasive tumor property.
Our data confirmed similarities in gene expression patterns between canine and human iUC and identified potential therapeutic targets (PTGER2, ERBB2, CCND1, Vegf, and EGFR), suggesting the value of naturally occurring canine iUC as a relevant animal model for human iUC.
Improved serial analysis of V1 ribosomal sequence tags (SARST-V1) provides a rapid, comprehensive, sequence-based characterization of bacterial diversity and community composition.

PubMed

Yu, Zhongtang; Yu, Marie; Morrison, Mark

2006-04-01

Serial analysis of ribosomal sequence tags (SARST) is a recently developed technology that can generate large 16S rRNA gene (rrs) sequence data sets from microbiomes, but there are numerous enzymatic and purification steps required to construct the ribosomal sequence tag (RST) clone libraries. We report here an improved SARST method, which still targets the V1 hypervariable region of rrs genes, but reduces the number of enzymes, oligonucleotides, reagents, and technical steps needed to produce the RST clone libraries. The new method, hereafter referred to as SARST-V1, was used to examine the eubacterial diversity present in community DNA recovered from the microbiome resident in the ovine rumen. The 190 sequenced clones contained 1055 RSTs and no fewer than 236 unique phylotypes (based on ≥95% sequence identity) that were assigned to eight different eubacterial phyla. Rarefaction and monomolecular curve analyses predicted that the complete RST clone library contains 99% of the 353 unique phylotypes predicted to exist in this microbiome. When compared with ribosomal intergenic spacer analysis (RISA) of the same community DNA sample, as well as a compilation of nine previously published conventional rrs clone libraries prepared from the same type of samples, the RST clone library provided a more comprehensive characterization of the eubacterial diversity present in rumen microbiomes. As such, SARST-V1 should be a useful tool applicable to comprehensive examination of diversity and composition in microbiomes and offers an affordable, sequence-based method for diversity analysis.
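To make the phylotype definition in this abstract concrete, the sketch below groups sequence tags greedily at a 95% identity threshold and then works through the counts reported above. It is a deliberately simplified illustration: it compares equal-length tags position by position without alignment, the example tags are invented, and the abstract does not state which clustering procedure the authors used.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length tags."""
    if len(a) != len(b):
        raise ValueError("this simplified sketch compares equal-length tags only")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_phylotypes(tags, threshold=0.95):
    """Greedy clustering: a tag opens a new phylotype unless it matches
    an existing representative at or above the identity threshold."""
    representatives = []
    for tag in tags:
        if not any(identity(tag, rep) >= threshold for rep in representatives):
            representatives.append(tag)
    return representatives


# Invented 20-nt tags: the second differs from the first at one position
# (95% identity), the third at two positions (90% identity).
tags = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",
    "ACGTACGTACGTACGTACAA",
]
print(len(cluster_phylotypes(tags)))  # 2 phylotypes

# Counts reported in the abstract.
rsts_per_clone = 1055 / 190
observed_share = 100 * 236 / 353
print(f"{rsts_per_clone:.2f} RSTs per sequenced clone")
print(f"{observed_share:.1f}% of the predicted phylotypes observed in the sequenced clones")
```

The last figure compares the 236 phylotypes seen in the 190 sequenced clones with the 353 predicted for the microbiome; the 99% coverage quoted in the abstract refers to the complete clone library, not to the sequenced subset.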
Transcriptomic Analysis of the Underground Renewal Buds during Dormancy Transition and Release in ‘Hangbaishao’ Peony (Paeonia lactiflora)

PubMed Central

Zhang, Jiaping; Wang, Guanqun; Li, Xin; Xia, Yiping

2015-01-01

Paeonia lactiflora is one of the most famous species of herbaceous peonies with gorgeous flowers. Bud dormancy is a crucial developmental process that allows P. lactiflora to survive unfavorable environmental conditions. However, little information is available on the molecular mechanism of the bud dormancy in P. lactiflora. We performed de novo transcriptome sequencing using the Illumina RNA sequencing platform for the underground renewal buds of P. lactiflora ‘Hangbaishao’ to study the molecular mechanism underlying its bud dormancy transition (the period from endodormancy to ecodormancy) and release (the period from ecodormancy to bud elongation and sprouting). Approximately 300 million high-quality clean reads were generated and assembled into 207,827 (mean length = 828 bp) and 51,481 (mean length = 1250 bp) unigenes using two assembly methods named “Trinity” and “Trinity+PRICE”, respectively. Based on the data obtained by the latter method, 32,316 unigenes were annotated by BLAST against various databases. Approximately 1,251 putative transcription factors were obtained, of which the largest number of unique transcripts belonged to the basic helix-loop-helix protein (bHLH) transcription factor family, and five of the top ten highly expressed transcripts were annotated as dehydrin (DHN). A total of 17,705 simple sequence repeat (SSR) motifs distributed in 13,797 sequences were obtained. The budbreak morphology, levels of indole-3-acetic acid (IAA) and abscisic acid (ABA), and activities of guaiacol peroxidase (POD) and catalase (CAT) were observed. The expression of 20 unigenes of interest, annotated as DHN, heat shock protein (HSP), histone, late elongated hypocotyl (LHY), phytochrome (PHY), and so on, was also analyzed. These studies were based on morphological, physiological, biochemical, and molecular levels and provide comprehensive insight into the mechanism of dormancy transition and release in P. lactiflora. The transcriptome dataset can be highly valuable for future investigation of gene expression networks in P. lactiflora as well as research on dormancy in other non-model perennial horticultural crops of commercial significance. PMID:25790307
Comprehensive transcriptome analysis reveals distinct regulatory programs during vernalization and floral bud development of orchardgrass (Dactylis glomerata L.).

PubMed

Feng, Guangyan; Huang, Linkai; Li, Ji; Wang, Jianping; Xu, Lei; Pan, Ling; Zhao, Xinxin; Wang, Xia; Huang, Ting; Zhang, Xinquan

2017-11-22

Vernalization and the transition from vegetative to reproductive growth involve multiple pathways, vital for controlling floral organ formation and flowering time. However, little transcription information is available about the mechanisms behind environmental adaptation and growth regulation. Here, we used high-throughput sequencing to analyze the comprehensive transcriptome of Dactylis glomerata L. during six different growth periods. During vernalization, 4689 differentially expressed genes (DEGs) significantly increased in abundance, while 3841 decreased. Furthermore, 12,967 DEGs were identified during booting stage and flowering stage, including 7750 up-regulated and 5219 down-regulated DEGs. Pathway analysis indicated that transcripts related to circadian rhythm, photoperiod, photosynthesis, flavonoid biosynthesis, starch, and sucrose metabolism changed significantly at different stages. Coexpression analysis and weighted correlation network analysis (WGCNA) linked different stages to transcriptional changes and provided evidence of inner relation modules associated with signal transduction, stress responses, cell division, and hormonal transport. We found enrichment in transcription factors (TFs) related to WRKY, NAC, AP2/EREBP, AUX/IAA, MADS-BOX, ABI3/VP1, bHLH, and the CCAAT family during vernalization and floral bud development. TF expression patterns revealed intricate temporal variations, suggesting relatively separate regulatory programs of TF modules. Further study will unlock insights into the ability of the circadian rhythm and photoperiod to regulate vernalization and flowering time in perennial grass.
Transcriptome Dynamics during Maize Endosperm Development

PubMed Central

Feng, Jiaojiao; Xu, Shutu; Wang, Lei; Li, Feifei; Li, Yibo; Zhang, Renhe; Zhang, Xinghua; Xue, Jiquan; Guo, Dongwei

2016-01-01

The endosperm is a major organ of the seed that plays vital roles in determining seed weight and quality. However, genome-wide transcriptome patterns throughout maize endosperm development have not been comprehensively investigated to date. Accordingly, we performed a high-throughput RNA sequencing (RNA-seq) analysis of the maize endosperm transcriptome at 5, 10, 15 and 20 days after pollination (DAP). We found that more than 11,000 protein-coding genes underwent alternative splicing (AS) events during the four developmental stages studied. These genes were mainly involved in intracellular protein transport, signal transmission, cellular carbohydrate metabolism, cellular lipid metabolism, lipid biosynthesis, protein modification, histone modification, cellular amino acid metabolism, and DNA repair. Additionally, 7,633 genes, including 473 transcription factors (TFs), were differentially expressed among the four developmental stages. The differentially expressed TFs were from 50 families, including the bZIP, WRKY, GeBP and ARF families. Further analysis of the stage-specific TFs showed that binding, nucleus and ligand-dependent nuclear receptor activities might be important at 5 DAP, that immune responses, signalling, binding and lumen development are involved at 10 DAP, that protein metabolic processes and the cytoplasm might be important at 15 DAP, and that the responses to various stimuli are different at 20 DAP compared with the other developmental stages. This RNA-seq analysis provides novel, comprehensive insights into the transcriptome dynamics during early endosperm development in maize. PMID:27695101
HIV-1 Infection of Primary CD4+ T Cells Regulates the Expression of Specific Human Endogenous Retrovirus HERV-K (HML-2) Elements.

PubMed

Young, George R; Terry, Sandra N; Manganaro, Lara; Cuesta-Dominguez, Alvaro; Deikus, Gintaras; Bernal-Rubio, Dabeiba; Campisi, Laura; Fernandez-Sesma, Ana; Sebra, Robert; Simon, Viviana; Mulder, Lubbertus C F

2018-01-01

Endogenous retroviruses (ERVs) occupy extensive regions of the human genome. Although many of these retroviral elements have lost their ability to replicate, those whose insertion took place more recently, such as the HML-2 group of HERV-K elements, still retain intact open reading frames and the capacity to produce certain viral RNA and/or proteins. Transcription of these ERVs is, however, tightly regulated by dedicated epigenetic control mechanisms. Nonetheless, it has been reported that some pathological states, such as viral infections and certain cancers, coincide with ERV expression, suggesting that transcriptional reawakening is possible. HML-2 elements are reportedly induced during HIV-1 infection, but the conserved nature of these elements has, until recently, rendered their expression profiling problematic. Here, we provide comprehensive HERV-K HML-2 expression profiles specific for productively HIV-1-infected primary human CD4+ T cells. We combined enrichment of HIV-1 infected cells using a reporter virus expressing a surface reporter for gentle and efficient purification with long-read single-molecule real-time sequencing. We show that three HML-2 proviruses (6q25.1, 8q24.3, and 19q13.42) are upregulated on average between 3- and 5-fold in HIV-1-infected CD4+ T cells. One provirus, HML-2 12q24.33, in contrast, was repressed in the presence of active HIV replication. In conclusion, this report identifies the HERV-K HML-2 loci whose expression profiles differ upon HIV-1 infection in primary human CD4+ T cells. These data will help pave the way for further studies on the influence of endogenous retroviruses on HIV-1 replication. IMPORTANCE Endogenous retroviruses inhabit big portions of our genome. Moreover, although they are mainly inert, some of the evolutionarily younger members maintain the ability to express both RNA and proteins. We have developed an approach using long-read single-molecule real-time (SMRT) sequencing that produces long reads that allow us to obtain detailed and accurate HERV-K HML-2 expression profiles. We applied this approach to study HERV-K expression in the presence or absence of productive HIV-1 infection of primary human CD4+ T cells. In addition to using SMRT sequencing, our strategy also includes the magnetic selection of the infected cells so that levels of background expression due to uninfected cells are kept at a minimum. The results presented here provide a blueprint for in-depth studies of the interactions of the authentic upregulated HERV-K HML-2 elements and HIV-1. Copyright © 2017 American Society for Microbiology.
Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates

PubMed Central

Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

2009-01-01

Background: Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. Results: We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. Conclusion: These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of the Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes. PMID:19138430
Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates.

PubMed

Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

2009-01-12

Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of the Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes.
[The ENCODE project and functional genomics studies].

PubMed

Ding, Nan; Qu, Hongzhu; Fang, Xiangdong

2014-03-01

Upon the completion of the Human Genome Project, scientists have been trying to interpret the underlying genomic code for human biology. Since 2003, the National Human Genome Research Institute (NHGRI) has invested nearly $0.3 billion and gathered over 440 scientists from more than 32 institutions in the United States, China, United Kingdom, Japan, Spain and Singapore to initiate the Encyclopedia of DNA Elements (ENCODE) project, aiming to identify and analyze all regulatory elements in the human genome. Taking advantage of the development of next-generation sequencing technologies and continuous improvement of experimental methods, ENCODE has made remarkable achievements: it identified methylation and histone modification of DNA sequences and their regulatory effects on gene expression through altering chromatin structures, categorized binding sites of various transcription factors and constructed their regulatory networks, further revised and updated databases for pseudogenes and non-coding RNA, and identified SNPs in regulatory sequences associated with diseases. These findings help to comprehensively understand information embedded in gene and genome sequences, the function of regulatory elements as well as the molecular mechanism underlying the transcriptional regulation by noncoding regions, and provide an extensive data resource for life sciences, particularly for translational medicine. We reviewed the contributions of high-throughput sequencing platform development and bioinformatic technology improvement to the ENCODE project, the association between epigenetics studies and the ENCODE project, and the major achievements of the ENCODE project. We also provided our perspective on the role of the ENCODE project in promoting the development of basic and clinical medicine.
Sialome of a Generalist Lepidopteran Herbivore: Identification of Transcripts and Proteins from Helicoverpa armigera Labial Salivary Glands

PubMed Central

Celorio-Mancera, Maria de la Paz; Courtiade, Juliette; Muck, Alexander; Heckel, David G.; Musser, Richard O.; Vogel, Heiko

2011-01-01

Although the importance of insect saliva in insect-host plant interactions has been acknowledged, there is very limited information on the nature and complexity of the salivary proteome in lepidopteran herbivores. We inspected the labial salivary transcriptome and proteome of Helicoverpa armigera, an important polyphagous pest species. To identify the majority of the salivary proteins we have randomly sequenced 19,389 expressed sequence tags (ESTs) from a normalized cDNA library of salivary glands. In parallel, a non-cytosolic enriched protein fraction was obtained from labial salivary glands and subjected to two-dimensional gel electrophoresis (2-DE) and de novo peptide sequencing. This procedure allowed comparison of peptides and EST sequences and enabled us to identify 65 protein spots from the secreted labial saliva 2DE proteome. The mass spectrometry analysis revealed ecdysone, glucose oxidase, fructosidase, carboxyl/cholinesterase and an uncharacterized protein previously detected in H. armigera midgut proteome. Consistently, their corresponding transcripts are among the most abundant in our cDNA library. We did find redundancy of sequence identification of saliva-secreted proteins suggesting multiple isoforms. As expected, we found several enzymes responsible for digestion and plant offense. In addition, we identified non-digestive proteins such as an arginine kinase and abundant proteins of unknown function. This identification of secreted salivary gland proteins allows a more comprehensive understanding of insect feeding and poses new challenges for the elucidation of protein function. PMID:22046331
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1987-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1990-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1988-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330

A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1989-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
The Impact of Reading Expressiveness on the Listening Comprehension of Storybooks by Prekindergarten Children

ERIC Educational Resources Information Center

Mira, William A.; Schwanenflugel, Paula J.

2013-01-01

Purpose: The purpose of this study was to determine the effect of oral reading expressiveness on the comprehension of storybooks by 4- and 5-year-old prekindergarten children. The possible impact of prosody on listening comprehension was explored. Method: Ninety-two prekindergarten children (M age = 57.26 months, SD = 3.89 months) listened to an…
Leveraging long read sequencing from a single individual to provide a comprehensive resource for benchmarking variant calling methods

PubMed Central

Mu, John C.; Tootoonchi Afshar, Pegah; Mohiyuddin, Marghoob; Chen, Xi; Li, Jian; Bani Asadi, Narges; Gerstein, Mark B.; Wong, Wing H.; Lam, Hugo Y. K.

2015-01-01

A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus, we leveraged the massive high-quality Sanger sequences from the HuRef genome to construct by far the most comprehensive gold set of a single individual, which was cross validated with deep Illumina sequencing, population datasets, and well-established algorithms. It was a necessary effort to completely reanalyze the HuRef genome as its previously published variants were mostly reported five years ago, suffering from compatibility, organization, and accuracy issues that prevent their direct use in benchmarking. Our extensive analysis and validation resulted in a gold set with high specificity and sensitivity. In contrast to the current gold sets of the NA12878 or HS1011 genomes, our gold set is the first that includes small variants, deletion SVs and insertion SVs up to a hundred thousand base-pairs. We demonstrate the utility of our HuRef gold set to benchmark several published SV detection tools. PMID:26412485
MetaMetaDB: a database and analytic system for investigating microbial habitability.

PubMed

Yang, Ching-chia; Iwasaki, Wataru

2014-01-01

MetaMetaDB (http://mmdb.aori.u-tokyo.ac.jp/) is a database and analytic system for investigating microbial habitability, i.e., how a prokaryotic group can inhabit different environments. The interaction between prokaryotes and the environment is a key issue in microbiology because distinct prokaryotic communities maintain distinct ecosystems. Because 16S ribosomal RNA (rRNA) sequences play pivotal roles in identifying prokaryotic species, a system that comprehensively links diverse environments to 16S rRNA sequences of the inhabitant prokaryotes is necessary for the systematic understanding of the microbial habitability. However, existing databases are biased to culturable prokaryotes and exhibit limitations in the comprehensiveness of the data because most prokaryotes are unculturable. Recently, metagenomic and 16S rRNA amplicon sequencing approaches have generated abundant 16S rRNA sequence data that encompass unculturable prokaryotes across diverse environments; however, these data are usually buried in large databases and are difficult to access. In this study, we developed MetaMetaDB (Meta-Metagenomic DataBase), which comprehensively and compactly covers 16S rRNA sequences retrieved from public datasets. Using MetaMetaDB, users can quickly generate hypotheses regarding the types of environments a prokaryotic group may be adapted to. We anticipate that MetaMetaDB will improve our understanding of the diversity and evolution of prokaryotes.
Polyphenism in social insects: insights from a transcriptome-wide analysis of gene expression in the life stages of the key pollinator, Bombus terrestris

PubMed Central

2011-01-01

Background: Understanding polyphenism, the ability of a single genome to express multiple morphologically and behaviourally distinct phenotypes, is an important goal for evolutionary and developmental biology. Polyphenism has been key to the evolution of the Hymenoptera, and particularly the social Hymenoptera where the genome of a single species regulates distinct larval stages, sexual dimorphism and physical castes within the female sex. Transcriptomic analyses of social Hymenoptera will therefore provide unique insights into how changes in gene expression underlie such complexity. Here we describe gene expression in individual specimens of the pre-adult stages, sexes and castes of the key pollinator, the buff-tailed bumblebee Bombus terrestris. Results: cDNA was prepared from mRNA from five life cycle stages (one larva, one pupa, one male, one gyne and two workers) and a total of 1,610,742 expressed sequence tags (ESTs) were generated using Roche 454 technology, substantially increasing the sequence data available for this important species. Overlapping ESTs were assembled into 36,354 B. terrestris putative transcripts, and functionally annotated. A preliminary assessment of differences in gene expression across non-replicated specimens from the pre-adult stages, castes and sexes was performed using R-STAT analysis. Individual samples from the life cycle stages of the bumblebee differed in the expression of a wide array of genes, including genes involved in amino acid storage, metabolism, immunity and olfaction. Conclusions: Detailed analyses of immune and olfaction gene expression across phenotypes demonstrated how transcriptomic analyses can inform our understanding of processes central to the biology of B. terrestris and the social Hymenoptera in general. For example, examination of immunity-related genes identified high conservation of important immunity pathway components across individual specimens from the life cycle stages while olfactory-related genes exhibited differential expression with a wider repertoire of gene expression within adults, especially sexuals, in comparison to immature stages. As there is an absence of replication across the samples, the results of this study are preliminary but provide a number of candidate genes which may be related to distinct phenotypic stage expression. This comprehensive transcriptome catalogue will provide an important gene discovery resource for directed programmes in ecology, evolution and conservation of a key pollinator. PMID:22185240
Listening Comprehension Strategies: A Review of the Literature

ERIC Educational Resources Information Center

Berne, Jane E.

2004-01-01

Numerous studies related to listening comprehension strategies have been published in the past two decades. The present study seeks to build upon two previous reviews of listening comprehension strategies research. Of particular interest in this review are studies dealing with the types of cues used by listeners, the sequence of listening,…
Coordinate cytokine regulatory sequences

DOEpatents

Frazer, Kelly A.; Rubin, Edward M.; Loots, Gabriela G.

2005-05-10

The present invention provides CNS sequences that regulate the cytokine gene expression, expression cassettes and vectors comprising or lacking the CNS sequences, host cells and non-human transgenic animals comprising the CNS sequences or lacking the CNS sequences. The present invention also provides methods for identifying compounds that modulate the functions of CNS sequences as well as methods for diagnosing defects in the CNS sequences of patients.
Exploring the Genomic Roadmap and Molecular Phylogenetics Associated with MODY Cascades Using Computational Biology.

PubMed

Chakraborty, Chiranjib; Bandyopadhyay, Sanghamitra; Doss, C George Priya; Agoramoorthy, Govindasamy

2015-04-01

Maturity onset diabetes of the young (MODY) is a metabolic and genetic disorder. It is different from type 1 and type 2 diabetes with low occurrence level (1-2%) among all diabetes. This disorder is a consequence of β-cell dysfunction. To date, 11 subtypes of MODY have been identified, and all of them can cause gene mutations. However, very little is known about the gene mapping, molecular phylogenetics, and co-expression among MODY genes and networking between cascades. This study used the latest servers and software such as VarioWatch, ClustalW, MUSCLE, Gblocks, Phylogeny.fr, iTOL, WebLogo, STRING, and KEGG PATHWAY to perform comprehensive analyses of gene mapping, multiple sequence alignment, molecular phylogenetics, protein-protein network design, co-expression analysis of MODY genes, and pathway development. The MODY genes are located on chromosomes 2, 7, 8, 9, 11, 12, 13, 17, and 20. Highly aligned block shows Pro, Gly, Leu, Arg, and Pro residues are highly aligned in the positions of 296, 386, 437, 455, 456 and 598, respectively. Alignment scores show that the HNF1A and HNF1B proteins have high sequence similarity among MODY proteins. Protein-protein network design shows that HNF1A, HNF1B, HNF4A, NEUROD1, PDX1, PAX4, INS, and GCK are strongly connected, and the co-expression analyses between MODY genes also show a distinct association between the HNF1A and HNF4A genes. This study used the latest bioinformatics tools to develop a rapid method to assess the evolutionary relationship, the network development, and the associations among eleven MODY genes and cascades. The prediction of sequence conservation, molecular phylogenetics, protein-protein network and the association between the MODY cascades enhances opportunities to gain more insight into the less-known MODY disease.
An Evolutionary Landscape of A-to-I RNA Editome across Metazoan Species

PubMed Central

Hung, Li-Yuan; Chen, Yen-Ju; Mai, Te-Lun; Chen, Chia-Ying; Yang, Min-Yu; Chiang, Tai-Wei; Wang, Yi-Da

2018-01-01

Adenosine-to-inosine (A-to-I) editing is widespread across the kingdom Metazoa. However, owing to the lack of comprehensive analysis in nonmodel animals, the evolutionary history of A-to-I editing remains largely unexplored. Here, we detect high-confidence editing sites using clustering and conservation strategies based on RNA sequencing data alone, without using single-nucleotide polymorphism information or genome sequencing data from the same sample. We thereby unveil the first evolutionary landscape of A-to-I editing maps across 20 metazoan species (from worm to human), providing unprecedented evidence on how the editing mechanism gradually expands its territory and increases its influence along the history of evolution. Our results revealed that highly clustered and conserved editing sites tended to have a higher editing level and a higher magnitude of the ADAR motif. The ratio of the frequency of nonsynonymous editing to that of synonymous editing increased remarkably with the conservation level of A-to-I editing. These results thus suggest a potential functional benefit of highly clustered and conserved editing sites. In addition, spatiotemporal dynamics analyses reveal a conserved enrichment of editing and ADAR expression in the central nervous system throughout more than 300 Myr of divergent evolution in complex animals and the comparability of editing patterns between invertebrates and between vertebrates during development. This study provides evolutionary and dynamic aspects of A-to-I editome across metazoan species, expanding this important but understudied class of nongenomically encoded events for comprehensive characterization. PMID:29294013
Molecular cloning, sequence characterization and recombinant expression of Nanog gene in goat fibroblast cells using lentiviral based expression system.

PubMed

Singhal, Dinesh K; Singhal, Raxita; Malik, Hruda N; Kumar, Surender; Kumar, Sudarshan; Mohanty, Ashok K; Kaushik, Jai K; Malakar, Dhruba

2014-01-01

Nanog is a homeodomain-containing protein which plays important roles in regulation of signaling pathways for maintenance and induction of pluripotency in stem cells. Because of its unique expression in stem cells it is also regarded as a pluripotency marker. In this study the goat Nanog (gNanog) gene has been amplified, cloned and characterized at sequence level with successful over-expression in the CHO-K1 cell line using a lentiviral based system. The gNanog ORF is 903 bp long and codes for a Nanog protein of 300 amino acids (aas). The complete nucleotide sequence shows some evolutionary mutation in goat in comparison to other species. The protein sequence of goat is highly similar to other species. Overall, the gNanog nucleotide sequence and predicted protein sequence showed high similarity and minimum divergence with cattle (96% identity/4% divergence) and buffalo (94/5%) while low similarity and high divergence with pig (84/15%), human (81/23%) and mouse (69/40%), indicating evolutionary closeness of gNanog to cattle and buffalo. A gNanog lentiviral expression construct was prepared for over-expression of the Nanog gene in adult goat fibroblast cells. The lentiviral expression construct of Nanog enabled continuous protein expression for induction and maintenance of pluripotency. Western blotting revealed the expression of the Nanog gene at protein level, which supported that the lentiviral expression system is highly promising for Nanog protein expression in differentiated goat cells.
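The ORF and protein lengths quoted in the abstract above are consistent with each other if the 903 bp figure includes the stop codon. That is an assumption made here for illustration, not something the record states. A minimal sketch of the arithmetic, with hypothetical variable names:

```python
# Length of the goat Nanog open reading frame, as quoted in the abstract.
orf_length_bp = 903

# Each codon is three nucleotides; a complete ORF divides evenly by three.
codons, remainder = divmod(orf_length_bp, 3)
assert remainder == 0, "ORF length should be a multiple of 3"

# Assumption: the quoted ORF length includes the terminal stop codon,
# which does not encode an amino acid.
protein_length_aa = codons - 1

print(codons, protein_length_aa)
```

Under that assumption, 301 codons give the 300-amino-acid protein reported in the record.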
A dominant variant in the PDE1C gene is associated with nonsyndromic hearing loss.

PubMed

Wang, Li; Feng, Yong; Yan, Denise; Qin, Litao; Grati, M'hamed; Mittal, Rahul; Li, Tao; Sundhari, Abhiraami Kannan; Liu, Yalan; Chapagain, Prem; Blanton, Susan H; Liao, Shixiu; Liu, Xuezhong

2018-06-02

Identification of genes with variants causing non-syndromic hearing loss (NSHL) is challenging due to genetic heterogeneity. The difficulty is compounded by technical limitations that in the past prevented comprehensive gene identification. Recent advances in technology, using targeted capture and next-generation sequencing (NGS), are changing the face of gene identification and making it possible to rapidly and cost-effectively sequence the whole human exome. Here, we characterize a five-generation Chinese family with progressive, postlingual autosomal dominant nonsyndromic hearing loss (ADNSHL). By combining population-specific mutation arrays, a targeted deafness gene panel, and whole exome sequencing (WES), we identified PDE1C (Phosphodiesterase 1C) c.958G>T (p.A320S) as the disease-associated variant. Structural modeling of p.A320S strongly suggests that the sequence alteration will likely affect the substrate-binding pocket of PDE1C. By whole-mount immunofluorescence on postnatal day 3 mouse cochlea, we show its expression in the cytosol of outer (OHC) and inner (IHC) hair cells, co-localizing with Lamp-1 in lysosomes. Furthermore, we provide evidence that the variant alters the PDE1C hydrolytic activity for both cyclic adenosine monophosphate (cAMP) and cyclic guanosine monophosphate (cGMP). Collectively, our findings indicate that the c.958G>T variant in PDE1C may disrupt the cross talk between cGMP-signaling and cAMP pathways in Ca2+ homeostasis.
Large-Scale Collection and Analysis of Full-Length cDNAs from Brachypodium distachyon and Integration with Pooideae Sequence Resources

PubMed Central

Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Takahashi, Fuminori; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

2013-01-01

A comprehensive collection of full-length cDNAs is essential for correct structural gene annotation and functional analyses of genes. We constructed a mixed full-length cDNA library from 21 different tissues of Brachypodium distachyon Bd21, and obtained 78,163 high quality expressed sequence tags (ESTs) from both ends of ca. 40,000 clones (including 16,079 contigs). We updated gene structure annotations of Brachypodium genes based on full-length cDNA sequences in comparison with the latest publicly available annotations. About 10,000 non-redundant gene models were supported by full-length cDNAs; ca. 6,000 showed some transcription unit modifications. We also found ca. 580 novel gene models, including 362 newly identified in Bd21. Using the updated transcription start sites, we searched a total of 580 plant cis-motifs in the −3 kb promoter regions and determined a genome-wide Brachypodium promoter architecture. Furthermore, we integrated the Brachypodium full-length cDNAs and updated gene structures with available sequence resources in wheat and barley in a web-accessible database, the RIKEN Brachypodium FL cDNA database. The database represents a “one-stop” information resource for all genomic information in the Pooideae, facilitating functional analysis of genes in this model grass plant and seamless knowledge transfer to the Triticeae crops. PMID:24130698
RNA-seq analysis of Rubus idaeus cv. Nova: transcriptome sequencing and de novo assembly for subsequent functional genomics approaches.

PubMed

Hyun, Tae Kyung; Lee, Sarah; Kumar, Dhinesh; Rim, Yeonggil; Kumar, Ritesh; Lee, Sang Yeol; Lee, Choong Hwan; Kim, Jae-Yean

2014-10-01

Using Illumina sequencing technology, we have generated large-scale transcriptome sequencing data containing abundant information on genes involved in the metabolic pathways in R. idaeus cv. Nova fruits. Rubus idaeus (Red raspberry) is one of the important economic crops that possess numerous nutrients, micronutrients and phytochemicals with essential health benefits to humans. The molecular mechanism underlying the ripening process and phytochemical biosynthesis in red raspberry is attributed to the changes in gene expression, but very limited transcriptomic and genomic information in public databases is available. To address this issue, we generated more than 51 million sequencing reads from R. idaeus cv. Nova fruit using Illumina RNA-Seq technology. After de novo assembly, we obtained 42,604 unigenes with an average length of 812 bp. At the protein level, Nova fruit transcriptome showed 77 and 68 % sequence similarities with Rubus coreanus and Fragaria vesca, respectively, indicating the evolutionary relationship between them. In addition, 69 % of assembled unigenes were annotated using public databases including NCBI non-redundant, Cluster of Orthologous Groups and Gene ontology database, suggesting that our transcriptome dataset provides a valuable resource for investigating metabolic processes in red raspberry. To analyze the relationship between several novel transcripts and the amounts of metabolites such as γ-aminobutyric acid and anthocyanins, real-time PCR and target metabolite analysis were performed on two different ripening stages of Nova. This is the first attempt using Illumina sequencing platform for RNA sequencing and de novo assembly of Nova fruit without reference genome. Our data provide the most comprehensive transcriptome resource available for Rubus fruits, and will be useful for understanding the ripening process and for breeding R. idaeus cultivars with improved fruit quality.
History and current status of wheat miRNAs using next-generation sequencing and their roles in development and stress.

PubMed

Budak, Hikmet; Khan, Zaeema; Kantar, Melda

2015-05-01

As small molecules that aid in posttranscriptional silencing, microRNA (miRNA) discovery and characterization have vastly benefited from the recent development and widespread application of next-generation sequencing (NGS) technologies. Several miRNAs were identified through sequencing of constructed small RNA libraries, whereas others were predicted by in silico methods using the recently accumulating sequence data. NGS was a major breakthrough in efforts to sequence and dissect the genomes of plants, including bread wheat and its progenitors, which have large, repetitive and complex genomes. Availability of survey sequences of wheat whole genome and its individual chromosomes enabled researchers to predict and assess wheat miRNAs both in the subgenomic and whole genome levels. Moreover, small RNA construction and sequencing-based studies identified several putative development- and stress-related wheat miRNAs, revealing their differential expression patterns in specific developmental stages and/or in response to stress conditions. With the vast amount of wheat miRNAs identified in recent years, we are approaching an overall knowledge of the wheat miRNA repertoire. In the following years, more comprehensive research in relation to miRNA conservation or divergence across wheat and its close relatives or progenitors should be performed. Results may prove valuable in understanding the significant roles of species-specific miRNAs and may also provide information on the dynamics between miRNAs and evolution in wheat. Furthermore, putative development- or stress-related miRNAs identified should be subjected to further functional analysis, which may be valuable in efforts to develop wheat with better resistance and/or yield. © The Author 2014. Published by Oxford University Press. All rights reserved.
Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation.

PubMed

Hara, Yuichiro; Tatsumi, Kaori; Yoshida, Michio; Kajikawa, Eriko; Kiyonari, Hiroshi; Kuraku, Shigehiro

2015-11-18

RNA-seq enables gene expression profiling in selected spatiotemporal windows and yields massive sequence information with relatively low cost and time investment, even for non-model species. However, there remains a large room for optimizing its workflow, in order to take full advantage of continuously developing sequencing capacity. Transcriptome sequencing for three embryonic stages of Madagascar ground gecko (Paroedura picta) was performed with the Illumina platform. The output reads were assembled de novo for reconstructing transcript sequences. In order to evaluate the completeness of transcriptome assemblies, we prepared a reference gene set consisting of vertebrate one-to-one orthologs. To take advantage of increased read length of >150 nt, we demonstrated shortened RNA fragmentation time, which resulted in a dramatic shift of insert size distribution. To evaluate products of multiple de novo assembly runs incorporating reads with different RNA sources, read lengths, and insert sizes, we introduce a new reference gene set, core vertebrate genes (CVG), consisting of 233 genes that are shared as one-to-one orthologs by all vertebrate genomes examined (29 species). The completeness assessment performed by the computational pipelines CEGMA and BUSCO referring to CVG, demonstrated higher accuracy and resolution than with the gene set previously established for this purpose. As a result of the assessment with CVG, we have derived the most comprehensive transcript sequence set of the Madagascar ground gecko by means of assembling individual libraries followed by clustering the assembled sequences based on their overall similarities. Our results provide several insights into optimizing de novo RNA-seq workflow, including the coordination between library insert size and read length, which manifested in improved connectivity of assemblies. The approach and assembly assessment with CVG demonstrated here would be applicable to transcriptome analysis of other species as well as whole genome analyses.
Expressed sequence tags from heat-shocked seagrass Zostera noltii (Hornemann) from its southern distribution range.

PubMed

Massa, Sónia I; Pearson, Gareth A; Aires, Tânia; Kube, Michael; Olsen, Jeanine L; Reinhardt, Richard; Serrão, Ester A; Arnaud-Haond, Sophie

2011-09-01

Predicted global climate change threatens the distributional ranges of species worldwide. We identified genes expressed in the intertidal seagrass Zostera noltii during recovery from a simulated low tide heat-shock exposure. Five Expressed Sequence Tag (EST) libraries were compared, corresponding to four recovery times following sub-lethal temperature stress, and a non-stressed control. We sequenced and analyzed 7009 sequence reads from 30min, 2h, 4h and 24h after the beginning of the heat-shock (AHS), and 1585 from the control library, for a total of 8594 sequence reads. Among 51 Tentative UniGenes (TUGs) exhibiting significantly different expression between libraries, 19 (37.3%) were identified as 'molecular chaperones' and were over-expressed following heat-shock, while 12 (23.5%) were 'photosynthesis TUGs' generally under-expressed in heat-shocked plants. A time course analysis of expression showed a rapid increase in expression of the molecular chaperone class, most of which were heat-shock proteins; which increased from 2 sequence reads in the control library to almost 230 in the 30min AHS library, followed by a slow decrease during further recovery. In contrast, 'photosynthesis TUGs' were under-expressed 30min AHS compared with the control library, and declined progressively with recovery time in the stress libraries, with a total of 29 sequence reads 24h AHS, compared with 125 in the control. A total of 4734 TUGs were screened for EST-Single Sequence Repeats (EST-SSRs) and 86 microsatellites were identified. Copyright © 2011 Elsevier B.V. All rights reserved.
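The read totals and percentages reported in the record above can be reproduced with simple arithmetic. The sketch below only re-derives figures quoted in the abstract; the variable and function names are illustrative and not part of the original study.

```python
# Figures quoted in the abstract (Zostera noltii heat-shock EST libraries).
stress_reads = 7009      # reads from the four heat-shock recovery libraries
control_reads = 1585     # reads from the non-stressed control library
differential_tugs = 51   # TUGs with significantly different expression
chaperone_tugs = 19      # 'molecular chaperones'
photosynthesis_tugs = 12 # 'photosynthesis TUGs'


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_reads = stress_reads + control_reads
chaperone_share = percent(chaperone_tugs, differential_tugs)
photosynthesis_share = percent(photosynthesis_tugs, differential_tugs)

if __name__ == "__main__":
    print(total_reads, chaperone_share, photosynthesis_share)
```

The computed values match the abstract: 8594 reads in total, with 37.3% and 23.5% of the differentially expressed TUGs in the two classes.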
Contributions of Phonological Memory, Language Comprehension and Hearing to the Expressive Language of Adolescents and Young Adults with Down Syndrome

ERIC Educational Resources Information Center

Laws, Glynis

2004-01-01

Background: Expressive language constitutes a major challenge to the development of individuals with Down syndrome. This paper investigates the relationships between expressive language abilities, language comprehension and the deficits in verbal short-term memory and hearing which are also associated with the syndrome. Methods: Tests of nonverbal…
Methods and compositions for regulating gene expression in plant cells

NASA Technical Reports Server (NTRS)

Dai, Shunhong (Inventor); Beachy, Roger N. (Inventor); Luis, Maria Isabel Ordiz (Inventor)

2010-01-01

Novel chimeric plant promoter sequences are provided, together with plant gene expression cassettes comprising such sequences. In certain preferred embodiments, the chimeric plant promoters comprise the BoxII cis element and/or derivatives thereof. In addition, novel transcription factors are provided, together with nucleic acid sequences encoding such transcription factors and plant gene expression cassettes comprising such nucleic acid sequences. In certain preferred embodiments, the novel transcription factors comprise the acidic domain, or fragments thereof, of the RF2a transcription factor. Methods for using the chimeric plant promoter sequences and novel transcription factors in regulating the expression of at least one gene of interest are provided, together with transgenic plants comprising such chimeric plant promoter sequences and novel transcription factors.
Genome-wide analysis of TCP family in tobacco.

PubMed

Chen, L; Chen, Y Q; Ding, A M; Chen, H; Xia, F; Wang, W F; Sun, Y H

2016-05-23

The TCP family is a transcription factor family, members of which are extensively involved in plant growth and development as well as in signal transduction in the response against many physiological and biochemical stimuli. In the present study, 61 TCP genes were identified in tobacco (Nicotiana tabacum) genome. Bioinformatic methods were employed for predicting and analyzing the gene structure, gene expression, phylogenetic analysis, and conserved domains of TCP proteins in tobacco. The 61 NtTCP genes were divided into three diverse groups, based on the division of TCP genes in tomato and Arabidopsis, and the results of the conserved domain and sequence analyses further confirmed the classification of the NtTCP genes. The expression pattern of NtTCP also demonstrated that majority of these genes play important roles in all the tissues, while some special genes exercise their functions only in specific tissues. In brief, the comprehensive and thorough study of the TCP family in other plants provides sufficient resources for studying the structure and functions of TCPs in tobacco.
Ab initio reconstruction of transcriptomes of pluripotent and lineage committed cells reveals gene structures of thousands of lincRNAs

PubMed Central

Guttman, Mitchell; Garber, Manuel; Levin, Joshua Z.; Donaghey, Julie; Robinson, James; Adiconis, Xian; Fan, Lin; Koziol, Magdalena J.; Gnirke, Andreas; Nusbaum, Chad; Rinn, John L.; Lander, Eric S.; Regev, Aviv

2010-01-01

RNA-Seq provides an unbiased way to study a transcriptome, including both coding and non-coding genes. To date, most RNA-Seq studies have critically depended on existing annotations, and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We apply it to mouse embryonic stem cells, neuronal precursor cells, and lung fibroblasts to accurately reconstruct the full-length gene structures for the vast majority of known expressed genes. We identify substantial variation in protein-coding genes, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons. We then determine the gene structures of over a thousand lincRNA and antisense loci. Our results open the way to direct experimental manipulation of thousands of non-coding RNAs, and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes. PMID:20436462

Strategies to identify natural antisense transcripts.

PubMed

Sun, Yulong; Li, Dijie; Zhang, Ru; Peng, Shang; Zhang, Ge; Yang, Tuanmin; Qian, Airong

2017-01-01

Natural antisense transcripts (NATs), originally considered transcriptional noise arising from so-called "junk DNA", have recently been recognized as important modulators of gene regulation. They are prevalent in nearly all realms of life and have been found to modulate gene expression positively or negatively. By affecting almost all stages of gene expression, ranging from pre-transcriptional, transcriptional and post-transcriptional to translation, NATs are fundamentally involved in various biological processes. However, compared with the rapidly growing volume of data from transcriptional analyses, especially high-throughput sequencing technologies (such as RNA-seq), only a limited number of functional NATs (around 70) have so far been reported, which hinders a comprehensive understanding of this field. Hence, efficient strategies for identifying NATs are urgently desired. In this review, we discuss the current strategies for identifying NATs, with a focus on the advantages, disadvantages, and applications of methods for isolating functional NATs. Moreover, publicly available databases for NATs are also discussed. Copyright © 2016 Elsevier B.V. and Société Française de Biochimie et Biologie Moléculaire (SFBBM). All rights reserved.
Experimental annotation of the human genome using microarray technology.

PubMed

Shoemaker, D D; Schadt, E E; Armour, C D; He, Y D; Garrett-Engele, P; McDonagh, P D; Loerch, P M; Leonardson, A; Lum, P Y; Cavet, G; Wu, L F; Altschuler, S J; Edwards, S; King, J; Tsang, J S; Schimmack, G; Schelter, J M; Koch, J; Ziman, M; Marton, M J; Li, B; Cundiff, P; Ward, T; Castle, J; Krolewski, M; Meyer, M R; Mao, M; Burchard, J; Kidd, M J; Dai, H; Phillips, J W; Linsley, P S; Stoughton, R; Scherer, S; Boguski, M S

2001-02-15

The most important product of the sequencing of a genome is a complete, accurate catalogue of genes and their products, primarily messenger RNA transcripts and their cognate proteins. Such a catalogue cannot be constructed by computational annotation alone; it requires experimental validation on a genome scale. Using 'exon' and 'tiling' arrays fabricated by ink-jet oligonucleotide synthesis, we devised an experimental approach to validate and refine computational gene predictions and define full-length transcripts on the basis of co-regulated expression of their exons. These methods can provide more accurate gene numbers and allow the detection of mRNA splice variants and identification of the tissue- and disease-specific conditions under which genes are expressed. We apply our technique to chromosome 22q under 69 experimental condition pairs, and to the entire human genome under two experimental conditions. We discuss implications for more comprehensive, consistent and reliable genome annotation, more efficient, full-length complementary DNA cloning strategies and application to complex diseases.
Circular RNA biogenesis can proceed through an exon-containing lariat precursor

PubMed Central

Barrett, Steven P; Wang, Peter L; Salzman, Julia

2015-01-01

Pervasive expression of circular RNA is a recently discovered feature of eukaryotic gene expression programs, yet its function remains largely unknown. The presumed biogenesis of these RNAs involves a non-canonical ‘backsplicing’ event. Recent studies in mammalian cell culture posit that backsplicing is facilitated by inverted repeats flanking the circularized exon(s). Although such sequence elements are common in mammals, they are rare in lower eukaryotes, making current models insufficient to describe circularization. Through systematic splice site mutagenesis and the identification of splicing intermediates, we show that circular RNA in Schizosaccharomyces pombe is generated through an exon-containing lariat precursor. Furthermore, we have performed high-throughput and comprehensive mutagenesis of a circle-forming exon, which enabled us to discover a systematic effect of exon length on RNA circularization. Our results uncover a mechanism for circular RNA biogenesis that may account for circularization in genes that lack noticeable flanking intronic secondary structure. DOI: http://dx.doi.org/10.7554/eLife.07540.001 PMID:26057830
Initiation and termination of DNA replication during S phase in relation to cyclins D1, E and A, p21WAF1, Cdt1 and the p12 subunit of DNA polymerase δ revealed in individual cells by cytometry

PubMed Central

Darzynkiewicz, Zbigniew; Zhao, Hong; Zhang, Sufang; Marietta, Y.W.T. Lee; Ernest, Y.C. Lee; Zhang, Zhongtao

2015-01-01

During our recent studies on mechanism of the regulation of human DNA polymerase δ in preparation for DNA replication or repair, multiparameter imaging cytometry as exemplified by laser scanning cytometry (LSC) has been used to assess changes in expression of the following nuclear proteins associated with initiation of DNA replication: cyclin A, PCNA, Ki-67, p21WAF1, DNA replication factor Cdt1 and the smallest subunit of DNA polymerase δ, p12. In the present review, rather than focusing on Pol δ, we emphasize the application of LSC in these studies and outline possibilities offered by the concurrent differential analysis of DNA replication in conjunction with expression of the nuclear proteins. A more extensive analysis of the data on a correlation between rates of EdU incorporation, likely reporting DNA replication, and expression of these proteins, is presently provided. New data, specifically on the expression of cyclin D1 and cyclin E with respect to EdU incorporation as well as on a relationship between expression of cyclin A vs. p21WAF1 and Ki-67 vs. Cdt1, are also reported. Of particular interest is the observation that this approach makes it possible to assess the temporal sequence of degradation of cyclin D1, p21WAF1, Cdt1 and p12, each with respect to initiation of DNA replication and with respect to each other. Also the sequence or reappearance of these proteins in G2 after termination of DNA replication is assessed. The reviewed data provide a more comprehensive presentation of potential markers, whose presence or absence marks the DNA replicating cells. Discussed is also usefulness of these markers as indicators of proliferative activity in cancer tissues that may bear information on tumor progression and have a prognostic value. PMID:26059433
Gene Expression Differences in Infected and Noninfected Middle Ear Complementary DNA Libraries

PubMed Central

Kerschner, Joseph E.; Horsey, Edward; Ahmed, Azad; Erbe, Christy; Khampang, Pawjai; Cioffi, Joseph; Hu, Fen Ze; Post, James Christopher; Ehrlich, Garth D.

2010-01-01

Objectives: To investigate genetic differences in middle ear mucosa (MEM) with nontypeable Haemophilus influenzae (NTHi) infection. Genetic upregulation and downregulation occurs in MEM during otitis media (OM) pathogenesis. A comprehensive assessment of these genetic differences using the techniques of complementary DNA (cDNA) library creation has not been performed. Design: The cDNA libraries were constructed from NTHi-infected and noninfected chinchilla MEM. Random clones were picked, sequenced bidirectionally, and submitted to the National Center for Biotechnology Information (NCBI) Expressed Sequence Tags database, where they were assigned accession numbers. These numbers were used with the basic local alignment search tool (BLAST) to align clones against the nonredundant nucleotide database at NCBI. Results: Analysis with the Web-based statistical program FatiGO identified several biological processes with significant differences in numbers of represented genes. Processes involved in immune, stress, and wound responses were more prevalent in the NTHi-infected library. S100 calcium-binding protein A9 (S100A9); secretory leukoprotease inhibitor (SLPI); β2-microglobulin (B2M); ferritin, heavy-chain polypeptide 1 (FTH1); and S100 calcium-binding protein A8 (S100A8) were expressed at significantly higher levels in the NTHi-infected library. Calcium-binding proteins S100A9 and S100A8 serve as markers for inflammation and have antibacterial effects. Secretory leukoprotease inhibitor is an antibacterial protein that inhibits stimuli-induced MUC1, MUC2, and MUC5AC production. Conclusions: A number of genes demonstrate changes during the pathogenesis of OM, including SLPI, which has an impact on mucin gene expression; this expression is known to be an important regulator in OM. The techniques described herein provide a framework for future investigations to more thoroughly understand molecular changes in the middle ear, which will likely be important in developing new therapeutic and intervention strategies. PMID:19153305
Genome-Wide Analysis of Differentially Expressed Genes Relevant to Rhizome Formation in Lotus Root (Nelumbo nucifera Gaertn)

PubMed Central

Yin, Jingjing; Li, Liangjun; Chen, Xuehao

2013-01-01

Lotus root is a popular wetland vegetable which produces an edible rhizome. At the molecular level, the regulation of rhizome formation is very complex and has not been sufficiently addressed in research. In this study, to identify differentially expressed genes (DEGs) in lotus root, four libraries (L1 library: stolon stage, L2 library: initial swelling stage, L3 library: middle swelling stage, L4 library: later swelling stage) were constructed from the rhizome development stages. A high-throughput tag-sequencing technique based on the Solexa Genome Analyzer Platform was used. Approximately 5.0 million tags were sequenced, and 4542104, 4474755, 4777919, and 4750348 clean tags including 151282, 137476, 215872, and 166005 distinct tags were obtained after removal of low quality tags from each library, respectively. More than 43% of distinct tags were unambiguous tags mapping to the reference genes, and 40% were unambiguous tag-mapped genes. From L1, L2, L3, and L4, a total of 20471, 18785, 23448, and 21778 genes were annotated, after mapping their functions in existing databases. Gene expression profiles in the L1/L2, L2/L3, and L3/L4 libraries differed for most of the 20 selected DEGs. Most of the DEGs in the L1/L2 libraries were relevant to fiber development and stress response, while in the L2/L3 and L3/L4 libraries, the majority of the DEGs were involved in metabolism of energy and storage. All up-regulated transcription factors in the four libraries and 14 important rhizome formation-related genes in the four libraries were also identified. In addition, the expression of 9 of the identified DEGs was assessed by qRT-PCR. In summary, this study provides a comprehensive understanding of gene expression during rhizome formation in lotus root. PMID:23840598
Initiation and termination of DNA replication during S phase in relation to cyclins D1, E and A, p21WAF1, Cdt1 and the p12 subunit of DNA polymerase δ revealed in individual cells by cytometry.

PubMed

Darzynkiewicz, Zbigniew; Zhao, Hong; Zhang, Sufang; Lee, Marietta Y W T; Lee, Ernest Y C; Zhang, Zhongtao

2015-05-20

During our recent studies on mechanism of the regulation of human DNA polymerase δ in preparation for DNA replication or repair, multiparameter imaging cytometry as exemplified by laser scanning cytometry (LSC) has been used to assess changes in expression of the following nuclear proteins associated with initiation of DNA replication: cyclin A, PCNA, Ki-67, p21(WAF1), DNA replication factor Cdt1 and the smallest subunit of DNA polymerase δ, p12. In the present review, rather than focusing on Pol δ, we emphasize the application of LSC in these studies and outline possibilities offered by the concurrent differential analysis of DNA replication in conjunction with expression of the nuclear proteins. A more extensive analysis of the data on a correlation between rates of EdU incorporation, likely reporting DNA replication, and expression of these proteins, is presently provided. New data, specifically on the expression of cyclin D1 and cyclin E with respect to EdU incorporation as well as on a relationship between expression of cyclin A vs. p21(WAF1) and Ki-67 vs. Cdt1, are also reported. Of particular interest is the observation that this approach makes it possible to assess the temporal sequence of degradation of cyclin D1, p21(WAF1), Cdt1 and p12, each with respect to initiation of DNA replication and with respect to each other. Also the sequence or reappearance of these proteins in G2 after termination of DNA replication is assessed. The reviewed data provide a more comprehensive presentation of potential markers, whose presence or absence marks the DNA replicating cells. Discussed is also usefulness of these markers as indicators of proliferative activity in cancer tissues that may bear information on tumor progression and have a prognostic value.
Loss of Chromosome 18 in Neuroendocrine Tumors of the Small Intestine: The Enigma Remains.

PubMed

Nieser, Maike; Henopp, Tobias; Brix, Joachim; Stoß, Laura; Sitek, Barbara; Naboulsi, Wael; Anlauf, Martin; Schlitter, Anna M; Klöppel, Günter; Gress, Thomas; Moll, Roland; Bartsch, Detlef K; Heverhagen, Anna E; Knoefel, Wolfram T; Kaemmerer, Daniel; Haybaeck, Johannes; Fend, Falko; Sperveslage, Jan; Sipos, Bence

2017-01-01

Neuroendocrine tumors of the small intestine (SI-NETs) exhibit an increasing incidence and high mortality rate. Until now, no fundamental molecular event has been linked to the tumorigenesis and progression of these tumors. Only the loss of chromosome 18 (Chr18) has been shown in up to two thirds of SI-NETs, whereby the significance of this alteration is still not understood. We therefore performed the first comprehensive study to identify Chr18-related events at the genetic, epigenetic and gene/protein expression levels. We did expression analysis of all seven putative Chr18-related tumor suppressors by quantitative real-time PCR (qRT-PCR), Western blot and immunohistochemistry. Next-generation exome sequencing and SNP array analysis were performed with five SI-NETs with (partial) loss of Chr18. Finally, we analyzed all microRNAs (miRNAs) located on Chr18 by qRT-PCR, comparing Chr18+/- and Chr18+/+ SI-NETs. Only DCC (deleted in colorectal cancer) revealed loss of/greatly reduced expression in 6/21 cases (29%). No relevant loss of SMAD2, SMAD4, elongin A3 and CABLES was detected. PMAIP1 and maspin were absent at the protein level. Next-generation sequencing did not reveal relevant recurrent somatic mutations on Chr18 either in an exploratory cohort of five SI-NETs, or in a validation cohort (n = 30). SNP array analysis showed no additional losses. The quantitative analysis of all 27 Chr18-related miRNAs revealed no difference in expression between Chr18+/- and Chr18+/+ SI-NETs. DCC seems to be the only Chr18-related tumor suppressor affected by the monoallelic loss of Chr18 resulting in a loss of DCC protein expression in one third of SI-NETs. No additional genetic or epigenetic alterations were present on Chr18. © 2016 S. Karger AG, Basel.
Transcriptome Analysis of Gene Expression during Chinese Water Chestnut Storage Organ Formation

PubMed Central

Chen, Sainan; Wang, Yan; Yu, Meizhen; Chen, Xuehao; Li, Liangjun; Yin, Jingjing

2016-01-01

The product organ (storage organ; corm) of the Chinese water chestnut has become a very popular food in Asian countries because of its unique nutritional value. Corm formation is a complex biological process, and extensive whole genome analysis of transcripts during corm development has not been carried out. In this study, four corm libraries at different developmental stages were constructed, and gene expression was identified using a high-throughput tag sequencing technique. Approximately 4.9 million tags were sequenced, and 4,371,386, 4,372,602, 4,782,494, and 5,276,540 clean tags, including 119,676, 110,701, 100,089, and 101,239 distinct tags, respectively, were obtained after removal of low-quality tags from each library. More than 39% of the distinct tags were unambiguous and could be mapped to reference genes, while 40% were unambiguous tag-mapped genes. After mapping their functions in existing databases, a total of 11,592, 10,949, 10,585, and 7,111 genes were annotated from the B1, B2, B3, and B4 libraries, respectively. Analysis of the differentially expressed genes (DEGs) in B1/B2, B2/B3, and B3/B4 libraries showed that most of the DEGs at the B1/B2 stages were involved in carbohydrate and hormone metabolism, while the majority of DEGs were involved in energy metabolism and carbohydrate metabolism at the B2/B3 and B3/B4 stages. All of the upregulated transcription factors and 9 important genes related to product organ formation in the above four stages were also identified. The expression changes of nine of the identified DEGs were validated using a quantitative PCR approach. This study provides a comprehensive understanding of gene expression during corm formation in the Chinese water chestnut. PMID:27716802
Sequences 5' to translation start regulate expression of petunia rbcS genes.

PubMed Central

Dean, C; Favreau, M; Bedbrook, J; Dunsmuir, P

1989-01-01

The promoter sequences that contribute to quantitative differences in expression of the petunia genes (rbcS) encoding the small subunit of ribulose bisphosphate carboxylase have been characterized. The promoter regions of the two most abundantly expressed petunia rbcS genes, SSU301 and SSU611, show sequence similarity not present in other rbcS genes. We investigated the significance of these and other sequences by adding specific regions from the SSU301 promoter (the most strongly expressed gene) to equivalent regions in the SSU911 promoter (the least strongly expressed gene) and assaying the expression of the fusions in transgenic tobacco plants. In this way, we characterized an SSU301 promoter region (either from -285 to -178 or -291 to -204) which, when added to SSU911, in either orientation, increased SSU911 expression 25-fold. This increase was equivalent to that caused by addition of the entire SSU301 5'-flanking region. Replacement of SSU911 promoter sequences between -198 and the start codon with sequences from the equivalent region of SSU301 did not increase SSU911 expression significantly. The -291 to -204 SSU301 promoter fragment contributes significantly to quantitative differences in expression between the petunia rbcS genes. PMID:2535543
The FaceBase Consortium: A comprehensive program to facilitate craniofacial research

PubMed Central

Hochheiser, Harry; Aronow, Bruce J.; Artinger, Kristin; Beaty, Terri H.; Brinkley, James F.; Chai, Yang; Clouthier, David; Cunningham, Michael L.; Dixon, Michael; Donahue, Leah Rae; Fraser, Scott E.; Hallgrimsson, Benedikt; Iwata, Junichi; Klein, Ophir; Marazita, Mary L.; Murray, Jeffrey C.; Murray, Stephen; de Villena, Fernando Pardo-Manuel; Postlethwait, John; Potter, Steven; Shapiro, Linda; Spritz, Richard; Visel, Axel; Weinberg, Seth M.; Trainor, Paul A.

2012-01-01

The FaceBase Consortium consists of ten interlinked research and technology projects whose goal is to generate craniofacial research data and technology for use by the research community through a central data management and integrated bioinformatics hub. Funded by the National Institute of Dental and Craniofacial Research (NIDCR) and currently focused on studying the development of the middle region of the face, the Consortium will produce comprehensive datasets of global gene expression patterns, regulatory elements and sequencing; will generate anatomical and molecular atlases; will provide human normative facial data and other phenotypes; will conduct follow-up studies of a completed genome-wide association study; will generate independent data on the genetics of craniofacial development; will build repositories of animal models and of human samples and data for community access and analysis; and will develop software tools and animal models for analyzing, functionally testing, and integrating these data. The FaceBase website (http://www.facebase.org) will serve as a web home for these efforts, providing interactive tools for exploring these datasets, together with discussion forums and other services to support and foster collaboration within the craniofacial research community. PMID:21458441
Lessons from single-cell transcriptome analysis of oxygen-sensing cells.

PubMed

Zhou, Ting; Matsunami, Hiroaki

2018-05-01

The advent of single-cell RNA-sequencing (RNA-Seq) technology has enabled transcriptome profiling of individual cells. Comprehensive gene expression analysis at the single-cell level has proven to be effective in characterizing the most fundamental aspects of cellular function and identity. This unbiased approach is revolutionary for identifying key molecules in small and/or heterogeneous tissues, such as those containing oxygen-sensing cells. Here, we review the major methods of current single-cell RNA-Seq technology. We discuss how this technology has advanced the understanding of oxygen-sensing glomus cells in the carotid body and helped uncover novel oxygen-sensing cells and mechanisms in the mouse olfactory system. We conclude by providing our perspective on future single-cell RNA-Seq research directed at oxygen-sensing cells.
Integrated analysis of miRNA and mRNA expression profiles in tilapia gonads at an early stage of sex differentiation.

PubMed

Tao, Wenjing; Sun, Lina; Shi, Hongjuan; Cheng, Yunying; Jiang, Dongneng; Fu, Beide; Conte, Matthew A; Gammerdinger, William J; Kocher, Thomas D; Wang, Deshou

2016-05-04

MicroRNAs (miRNAs) represent a second regulatory network that has important effects on gene expression and protein translation during biological processes. However, the possible role of miRNAs in the early stages of fish sex differentiation is not well understood. In this study, we carried out an integrated analysis of miRNA and mRNA expression profiles to explore their possible regulatory patterns at the critical stage of sex differentiation in tilapia. We identified 279 pre-miRNA genes in the tilapia genome, which were highly conserved in other fish species. Based on small RNA library sequencing, we identified 635 mature miRNAs in tilapia gonads, of which 62 and 49 miRNAs showed higher expression in XX and XY gonads, respectively. The predicted targets of these sex-biased miRNAs (e.g., miR-9, miR-21, miR-30a, miR-96, miR-200b, miR-212 and miR-7977) included genes encoding key enzymes in steroidogenic pathways (Cyp11a1, Hsd3b, Cyp19a1a, Hsd11b) and key molecules involved in vertebrate sex differentiation (Foxl2, Amh, Star1, Sf1, Dmrt1, and Gsdf). These genes also showed sex-biased expression in tilapia gonads at 5 dah. Some miRNAs (e.g., miR-96 and miR-737) targeted multiple genes involved in steroid synthesis, suggesting a complex miRNA regulatory network during early sex differentiation in this fish. The sequence and expression patterns of most miRNAs in tilapia are conserved in fishes, indicating that the basic functions of vertebrate miRNAs might share a common evolutionary origin. This comprehensive analysis of miRNA and mRNA at the early stage of molecular sex differentiation in tilapia XX and XY gonads led to the discovery of differentially expressed miRNAs and their putative targets, which will facilitate studies of the regulatory network of molecular sex determination and differentiation in fishes.
A comprehensive data mining study shows that most nuclear receptors act as newly proposed homeostasis-associated molecular pattern receptors.

PubMed

Wang, Luqiao; Nanayakkara, Gayani; Yang, Qian; Tan, Hongmei; Drummer, Charles; Sun, Yu; Shao, Ying; Fu, Hangfei; Cueto, Ramon; Shan, Huimin; Bottiglieri, Teodoro; Li, Ya-Feng; Johnson, Candice; Yang, William Y; Yang, Fan; Xu, Yanjie; Xi, Hang; Liu, Weiqing; Yu, Jun; Choi, Eric T; Cheng, Xiaoshu; Wang, Hong; Yang, Xiaofeng

2017-10-24

Nuclear receptors (NRs) can regulate gene expression; therefore, they are classified as transcription factors. Despite the extensive research carried out on NRs, several issues are still not fully understood, including (1) the expression profile of NRs in human tissues, (2) how NR expression is modulated during atherosclerosis and metabolic diseases, and (3) the overall role of NRs in inflammatory conditions. To determine whether and how the expression of NRs is regulated in physiological/pathological conditions, we used an experimental database analysis to determine the expression of all 48 known NRs in 21 human and 17 murine tissues as well as in pathological conditions. We made the following significant findings: (1) NRs are differentially expressed in tissues, which may be under regulation by oxygen sensors, angiogenesis pathway, stem cell master regulators, inflammasomes, and tissue hypo-/hypermethylation indexes; (2) NR sequence mutations are associated with increased risks for development of cancers and metabolic, cardiovascular, and autoimmune diseases; (3) NRs have less tendency to be upregulated than downregulated in cancers, and autoimmune and metabolic diseases, which may be regulated by inflammation pathways and mitochondrial energy enzymes; and (4) the innate immune sensor inflammasome/caspase-1 pathway regulates the expression of most NRs. Based on our findings, we propose a new paradigm that most nuclear receptors are anti-inflammatory homeostasis-associated molecular pattern receptors (HAMPRs). Our results provide novel insight into NRs as therapeutic targets in metabolic diseases, inflammation, and malignancies.
Microarray analysis of differentially expressed genes engaged in fruit development between Prunus mume and Prunus armeniaca.

PubMed

Li, Xiaoying; Korir, Nicholas Kibet; Liu, Lili; Shangguan, Lingfei; Wang, Yuzhu; Han, Jian; Chen, Ming; Fang, Jinggui

2012-11-15

Microarray analysis is a technique that can be employed to provide expression profiles of single genes and new insights to elucidate the biological mechanisms responsible for fruit development. To compare the expression of genes mainly engaged in fruit development in Prunus mume and Prunus armeniaca, we first identified differentially expressed transcripts along the entire fruit life cycle by using microarrays spotted with 10,641 ESTs collected from P. mume and other Prunus EST sequences. A total of 1418 ESTs were selected after quality control of microarray spots and analysis for differential gene expression patterns during fruit development of P. mume and P. armeniaca. From these, 707 up-regulated and 711 down-regulated genes showing more than two-fold differences in expression level were annotated by GO based on biological processes, molecular functions and cellular components. KEGG analysis showed that these differentially expressed genes were involved in several important pathways of carbohydrate, galactose, and starch and sucrose metabolism, as well as in the biosynthesis of other secondary metabolites. This could provide detailed information on the fruit quality differences during development and ripening of these two species. With the results obtained, we provide a practical database for comprehensive understanding of molecular events during fruit development and also lay a theoretical foundation for the cloning of genes regulating a series of important rate-limiting enzymes involved in vital metabolic pathways during fruit development. Copyright © 2012 Elsevier GmbH. All rights reserved.
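The "more than two-fold" criterion in the abstract above can be illustrated with a minimal Python sketch. The function name, the use of a simple expression ratio, and the example values are assumptions made for illustration; the abstract does not describe the authors' actual normalization or statistical testing.

```python
import math


def classify_fold_change(expr_a, expr_b, threshold=2.0):
    """Label a transcript by the ratio expr_a / expr_b.

    Hypothetical helper: returns "up" when expr_a exceeds expr_b by more
    than `threshold`-fold, "down" for the reverse, otherwise "unchanged".
    """
    if expr_a <= 0 or expr_b <= 0:
        raise ValueError("expression values must be positive")
    log2_fc = math.log2(expr_a / expr_b)
    cutoff = math.log2(threshold)
    if log2_fc > cutoff:
        return "up"
    if log2_fc < -cutoff:
        return "down"
    return "unchanged"
```

Working on the log2 scale makes up- and down-regulation symmetric around zero, so a single cutoff serves both directions.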
RNA-seq Transcriptional Profiling of an Arbuscular Mycorrhiza Provides Insights into Regulated and Coordinated Gene Expression in Lotus japonicus and Rhizophagus irregularis.

PubMed

Handa, Yoshihiro; Nishide, Hiroyo; Takeda, Naoya; Suzuki, Yutaka; Kawaguchi, Masayoshi; Saito, Katsuharu

2015-08-01

Gene expression during arbuscular mycorrhizal development is highly orchestrated in both plants and arbuscular mycorrhizal fungi. To elucidate the gene expression profiles of the symbiotic association, we performed a digital gene expression analysis of Lotus japonicus and Rhizophagus irregularis using a HiSeq 2000 next-generation sequencer with a Cufflinks assembly and de novo transcriptome assembly. There were 3,641 genes differentially expressed during arbuscular mycorrhizal development in L. japonicus, approximately 80% of which were up-regulated. The up-regulated genes included secreted proteins, transporters, proteins involved in lipid and amino acid metabolism, ribosomes and histones. We also detected many genes that were differentially expressed in small-secreted peptides and transcription factors, which may be involved in signal transduction or transcription regulation during symbiosis. Co-regulated genes between arbuscular mycorrhizal and root nodule symbiosis were not particularly abundant, but transcripts encoding membrane traffic-related proteins, transporters and iron transport-related proteins were found to be highly co-up-regulated. In transcripts of arbuscular mycorrhizal fungi, expansion of cytochrome P450 was observed, which may contribute to various metabolic pathways required to accommodate roots and soil. The comprehensive gene expression data of both plants and arbuscular mycorrhizal fungi provide a powerful platform for investigating the functional and molecular mechanisms underlying arbuscular mycorrhizal symbiosis. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved.
Correlation analysis of the mRNA and miRNA expression profiles in the nascent synthetic allotetraploid Raphanobrassica

PubMed Central

Ye, Bingyuan; Wang, Ruihua; Wang, Jianbo

2016-01-01

Raphanobrassica is an allopolyploid species derived from inter-generic hybridization that combines the R genome from R. sativus and the C genome from B. oleracea var. alboglabra. In the present study, we used a high-throughput sequencing method to identify the mRNA and miRNA profiles in Raphanobrassica and its parents. A total of 33,561 mRNAs and 283 miRNAs were detected; 9,209 mRNAs and 134 miRNAs were differentially expressed, 7,633 mRNAs and 39 miRNAs showed ELD expression, and 5,219 mRNAs and 57 miRNAs were non-additively expressed in Raphanobrassica. Remarkably, differentially expressed genes (DEGs) were up-regulated and maternal bias was detected in Raphanobrassica. In addition, a miRNA-mRNA interaction network was constructed based on reverse regulated miRNA-mRNAs, which included 75 miRNAs and 178 mRNAs; 31 miRNAs were non-additively expressed targets of 13 miRNAs. The related target genes were significantly enriched in the GO term ‘metabolic processes’. Regulation of non-additively expressed target genes is involved in a range of biological pathways and may provide a driving force for variation and adaptation in this allopolyploid. The integrative analysis of mRNA and miRNA profiling provides more information to elucidate gene expression mechanisms and may supply a comprehensive and corresponding method to study genetic and transcriptional variation in allopolyploids. PMID:27874043
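The idea of a network built from "reverse regulated" miRNA-mRNA pairs in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the dictionary inputs, and the rule "opposite signs of log2 fold change" are hypothetical, and the abstract does not specify how the authors predicted targets or defined reverse regulation.

```python
def reverse_regulated_pairs(mirna_log2fc, mrna_log2fc, predicted_targets):
    """Return (miRNA, mRNA) pairs whose expression changes in opposite directions.

    mirna_log2fc / mrna_log2fc: dicts mapping names to log2 fold changes.
    predicted_targets: dict mapping each miRNA to a list of predicted target
    mRNAs (the prediction step itself is assumed to be done elsewhere).
    """
    pairs = []
    for mirna, targets in predicted_targets.items():
        m_fc = mirna_log2fc.get(mirna)
        if m_fc is None:
            continue  # miRNA was not quantified
        for gene in targets:
            g_fc = mrna_log2fc.get(gene)
            if g_fc is None:
                continue  # target was not quantified
            # Opposite signs: miRNA up and target down, or the reverse.
            if m_fc * g_fc < 0:
                pairs.append((mirna, gene))
    return pairs
```

The resulting pair list can be read as the edge list of an interaction network, with miRNAs and mRNAs as the two node types.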
Transcriptomic analysis of flower development in tea (Camellia sinensis (L.)).

PubMed

Liu, Feng; Wang, Yu; Ding, Zhaotang; Zhao, Lei; Xiao, Jun; Wang, Linjun; Ding, Shibo

2017-10-05

Flowering is a critical and complicated process in plant development, involving interactions of numerous endogenous and environmental factors, but little is known about the complex network regulating flower development in tea plants. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptomic analysis assembles gene-related information involved in reproductive growth of C. sinensis. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicated that metabolic pathways, biosynthesis of secondary metabolites, and plant hormone signal transduction were enriched among the DEGs. Furthermore, 207 flowering-associated unigenes were identified from our database. Some transcription factors, such as WRKY, ERF, bHLH, MYB and MADS-box, were shown to be up-regulated in floral transition and might play a role in the progression of flowering. In addition, 14 genes were selected for confirmation of expression levels using quantitative real-time PCR (qRT-PCR). The comprehensive transcriptomic analysis presents fundamental information on the genes and pathways that are involved in flower development in C. sinensis. Our data also provide a useful database for further research on tea and other plant species. Copyright © 2017 Elsevier B.V. All rights reserved.
Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses.

PubMed

Liu, Ruijie; Holik, Aliaksei Z; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E; Asselin-Labat, Marie-Liesse; Smyth, Gordon K; Ritchie, Matthew E

2015-09-03

Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean-variance relationship of the log-counts-per-million using 'voom'. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source 'limma' package. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
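The down-weighting idea in the abstract above can be illustrated with a deliberately simplified Python sketch. It is not the estimation procedure used in limma: it assumes a single experimental group, estimates each sample's variance factor from mean squared residuals across genes, and centres the log factors so they sum to zero. All function names and the moment-based estimator are assumptions made for illustration.

```python
import math


def sample_variance_factors(log_expr):
    """Estimate one relative variance factor per sample.

    log_expr: list of genes, each a list of per-sample log-expression
    values (single group assumed). Factors are scaled so that their
    logs sum to zero, i.e. their product is 1.
    """
    n_genes = len(log_expr)
    n_samples = len(log_expr[0])
    sq_resid = [0.0] * n_samples
    for row in log_expr:
        gene_mean = sum(row) / n_samples
        for j, value in enumerate(row):
            sq_resid[j] += (value - gene_mean) ** 2
    mean_sq = [s / n_genes for s in sq_resid]
    if any(m <= 0 for m in mean_sq):
        raise ValueError("a sample has zero residual variance")
    logs = [math.log(m) for m in mean_sq]
    centre = sum(logs) / n_samples
    return [math.exp(value - centre) for value in logs]


def combined_weights(obs_weights, factors):
    """Combine observation-level weights with sample weights (1 / factor)."""
    return [[w / f for w, f in zip(row, factors)] for row in obs_weights]
```

A sample with a larger variance factor receives a smaller weight in every gene, so noisy samples still contribute to the analysis but with less influence than well-behaved ones.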
Identification of the group IIa WRKY subfamily and the functional analysis of GhWRKY17 in upland cotton (Gossypium hirsutum L.)

PubMed Central

Gu, Lijiao; Li, Libei; Wei, Hengling; Wang, Hantao; Su, Junji; Guo, Yaning

2018-01-01

WRKY transcription factors play important roles in plant defense, stress response, leaf senescence, and plant growth and development. Previous studies have revealed the important roles of the group IIa GhWRKY genes in cotton. To comprehensively analyze the group IIa GhWRKY genes in upland cotton, we identified 15 candidate group IIa GhWRKY genes in the Gossypium hirsutum genome. The phylogenetic tree, intron-exon structure, motif prediction and Ka/Ks analyses indicated that most group IIa GhWRKY genes shared high similarity and conservation and underwent purifying selection during evolution. In addition, we detected the expression patterns of several group IIa GhWRKY genes in individual tissues as well as during leaf senescence using public RNA sequencing data and real-time quantitative PCR. To better understand the functions of group IIa GhWRKYs in cotton, GhWRKY17 (KF669857) was isolated from upland cotton, and its sequence alignment, promoter cis-acting elements and subcellular localization were characterized. Moreover, the over-expression of GhWRKY17 in Arabidopsis up-regulated the senescence-associated genes AtWRKY53, AtSAG12 and AtSAG13, enhancing the plant’s susceptibility to leaf senescence. These findings lay the foundation for further analysis and study of the functions of WRKY genes in cotton. PMID:29370286
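The inference of purifying selection from Ka/Ks analysis in the abstract above rests on the standard interpretation of the ratio: values below 1 indicate purifying selection, values near 1 neutral evolution, and values above 1 positive selection. A minimal Python sketch of that interpretation follows; the function name, tolerance, and example values are hypothetical, and the abstract does not report how Ka and Ks were estimated.

```python
def selection_mode(ka, ks, tol=1e-9):
    """Interpret a Ka/Ks ratio for one gene pair.

    ka: nonsynonymous substitution rate; ks: synonymous substitution rate.
    Returns "purifying" (ratio < 1), "neutral" (ratio == 1 within tol),
    or "positive" (ratio > 1).
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"
```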
